Promise and perils: Machine learning applied to human rights practice


Below is a list of questions to serve as a starting framework for the discussion in this thread:

    • Sharing examples and stories of using ML in the field of human rights
    • When does it make sense to use ML in the field of human rights, and when does it not?
    • What are the limitations and risks?
    • Which tools are available for human rights practitioners?


    Which tools are available for human rights practitioners?

    I'd be very interested in hearing what others have seen of Machine Learning applied to Human Rights. The three examples that come to mind in my case are:

    • The work of the Center for Human Rights Science (CHRS) at Carnegie Mellon University (where I am Program Manager). E-LAMP is a machine learning and computer vision–based video analysis system that is able to detect objects, sounds, speech, text, and event types (say, a news broadcast or a protest) in a video collection. In practice, this allows users to run semantic queries within video collections (see the sketch after this list). See more at https://aladdin1.inf.cs.cmu.edu/human-rights/E-LAMP
    • The research at the University of Sheffield and the University of Pennsylvania where AI has been used to develop a method for accurately predicting the results of judicial decisions of the European Court of Human Rights. The research team identified 584 cases relating to three articles of the European Convention on Human Rights: Article 3, concerning torture and inhuman and degrading treatment; Article 6, which protects the right to a fair trial; and Article 8, on the right to respect for a private and family life. See more at http://www.ucl.ac.uk/news/news-articles/1016/241016-AI-predicts-outcomes...
    • The recently launched VFRAME (Visual Forensics and Metadata Extraction), a collection of open source computer vision tools designed specifically for human rights investigations that rely on large video datasets. https://vframe.io/
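
    To make the idea of a semantic query concrete, here is a minimal sketch of filtering a collection by detector-assigned concepts. This is not E-LAMP's actual interface; the file names, field names, and concept labels are invented for illustration:

    # Hypothetical index: each video carries the concepts a detector found in it.
    videos = [
        {"file": "vid_001.mp4", "concepts": {"protest", "crowd", "chanting"}},
        {"file": "vid_002.mp4", "concepts": {"news_broadcast", "anchor_desk"}},
        {"file": "vid_003.mp4", "concepts": {"protest", "gunshot", "smoke"}},
    ]

    def query(collection, required):
        """Return videos whose detected concepts include all required terms."""
        return [v["file"] for v in collection if required <= v["concepts"]]

    print(query(videos, {"protest", "gunshot"}))  # -> ['vid_003.mp4']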

    What else is out there?

    How about the plans to integrate ML into UWAZI?

    Hey HURIDOCS colleagues, how about ML and UWAZI? We read that it is going in the direction of "automated classification of documents using machine learning" (https://www.huridocs.org/2018/05/starting-at-the-source-introducing-uwaz...)

    How about the plans to integrate ML into UWAZI?

    Working with huge collections of documents requires tedious manual work to extract metadata and to categorize them. At HURIDOCS we are therefore working on a sentence classifier that adapts to the specific interests or research questions of human rights defenders. By highlighting sentences in a document, a user teaches the algorithm which information is relevant. The algorithm then suggests other phrases with similar content, which the user can accept or reject to further improve it. The goal is to combine human expert knowledge with machine intelligence to efficiently support human rights defenders working with document collections.
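
    A minimal sketch of this highlight-suggest-confirm loop, using scikit-learn. To be clear, this is not the actual Uwazi implementation; the function names and the model choice (TF-IDF features plus logistic regression) are assumptions for illustration:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def suggest_sentences(labeled, labels, unlabeled, top_n=5):
        """Train on user-highlighted sentences (labels: 1 = relevant, 0 = not),
        then rank the remaining sentences by predicted relevance."""
        vec = TfidfVectorizer()
        clf = LogisticRegression().fit(vec.fit_transform(labeled), labels)
        scores = clf.predict_proba(vec.transform(unlabeled))[:, 1]
        return sorted(zip(scores, unlabeled), reverse=True)[:top_n]

    # Suggestions the user accepts or rejects are appended to `labeled` and
    # `labels`, and the model is retrained: the human-in-the-loop cycle.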

    VFRAME

    VFRAME is such an exciting tool! Adam, could you tell us more about how it works and how it supports the Syrian Archive in documenting war crimes?

    VFRAME

    Hi everyone. Thanks for inviting me to join these discussions.

    I'm building VFRAME according to the specific needs of researchers at Syrian Archive. Their goal is to provide verified videos that can potentially be used as evidence of war crimes. Through our nearly daily discussions I've learned how computer vision can be applied to improve their workflow.

    They already have an impressively organized workflow for reviewing videos across multiple groups. What I found was needed most is a way to prioritize videos in the dataset that might be useful for a current or future investigation. Currently there are over 1,500,000 videos. Manual review does not scale. And watching too many videos can cause vicarious trauma.

    The first goal with VFRAME was to develop object detection profiles that can be used to find munitions in the entire dataset. In order to do this I needed an annotation platform. And in order to create images for annotation I needed another algorithm to summarize videos into representative keyframes. So currently we have three main parts: the actual object detection, a platform for annotating images, and a pipeline for converting videos into keyframes for the annotation platform. Attached is a preview of the annotation web app.
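
    For context, here is a rough sketch of one common approach to the keyframe step: sample frames and keep one whenever it differs strongly from the last kept frame. This is not necessarily VFRAME's actual algorithm, and the sampling step and threshold are illustrative:

    import cv2

    def extract_keyframes(path, diff_threshold=30.0, step=15):
        """Keep a frame when it differs enough from the last kept frame."""
        cap = cv2.VideoCapture(path)
        keyframes, last_kept, idx = [], None, 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % step == 0:  # only inspect every `step`-th frame
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                if last_kept is None or cv2.absdiff(gray, last_kept).mean() > diff_threshold:
                    keyframes.append(frame)
                    last_kept = gray
            idx += 1
        cap.release()
        return keyframes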

    Getting back to the goal. The keyframes can then be fairly rapidly processed by the object detection algorithm to quantify the total number of cluster munitions in the entire dataset. This is what I'm working on now. Because 1.5M videos is such a massive collection, it takes two months alone to process them into keyframes. But searching all videos (about 45 million keyframes) should only take a few days.

    Then this information is turned into a small web server. Ping the filename and get back all the info about the video as JSON. That's pretty much how VFRAME is being designed.

    Video --> VFRAME -->  vframe.io/project/filename --> JSON summarization
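
    As a sketch of that lookup service (Flask, the route, and the field names are my assumptions, not necessarily how vframe.io is built):

    from flask import Flask, jsonify, abort

    app = Flask(__name__)

    # In practice this would be a database of detection results per video.
    METADATA = {
        "vid_003.mp4": {"keyframes": 31, "detections": {"cluster_munition": 2}},
    }

    @app.route("/project/<filename>")
    def video_info(filename):
        """Ping a filename, get back the video's metadata as JSON."""
        info = METADATA.get(filename)
        if info is None:
            abort(404)
        return jsonify(info)

    if __name__ == "__main__":
        app.run()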

    (oh, and VCAT is just the name of the Visual Classification and Tagging system)

    This is very interesting Adam

    Comment originally posted by Enrique Piracés

    This is very interesting, Adam! Have you seen what my colleagues at CMU's Center for Human Rights Science have done? I can imagine synergies or at least some valuable information exchange. See https://aladdin1.inf.cs.cmu.edu/human-rights/E-LAMP


    Annotating Strategy

    Comment originally posted by Adam Harvey

    Hi Enrique, I'm very interested in the Aladdin project and saw this graphic on the project page. The type of video in this example looks very similar to the ones I'm analyzing. How accessible is the code for the Aladdin projects?

    I noticed in the PDFs you posted that one of the tasks was splitting the video into keyframes so they could be annotated. Same challenge here, which all revolves around data reduction. Of course watching all videos manually is not practical, but even processing videos can take a really long time, at least with a single workstation.

    One of the challenges within this task that I don't think my scene summarization solves yet is how to choose the best frame within each scene or sub-scene. There was a new post on http://learnopencv.com/ today about using BRISQUE (https://github.com/spmallick/learnopencv/tree/master/ImageMetrics) for image quality metrics. It appears to work well for isolating graphic/title sequences from the actual video and for finding the highest quality (sharpest, best histogram) image. Possibly this can be used to at least improve the keyframes selected for the dense/manual reviewer (max of 9-12 frames).
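
    As an aside, a cheap stand-in for a full no-reference metric like BRISQUE is to rank frames by sharpness, e.g. the variance of the Laplacian. A rough sketch with stock OpenCV (BRISQUE itself needs extra model files or packages):

    import cv2

    def sharpness(frame):
        """Variance of the Laplacian: higher means sharper, less blurry."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()

    def best_frame(scene_frames):
        """Pick the sharpest frame from a list of frames in one scene."""
        return max(scene_frames, key=sharpness)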

    For creating the annotation frames I was originally trying to use https://github.com/antingshen/BeaverDam, a video annotation Flask app from Berkeley, but found that annotating videos was too much cognitive overhead compared to annotating still frames.

    Curious what works and what doesn't for E-LAMP annotation?


    UPR-SDG explorer

    This project by the Danish Institute for Human Rights incorporates a 'semi-supervised machine-learning setup':

    http://upr.humanrights.dk/methodology

    Interested to learn more about the promises and pitfalls of these projects!

    Thanks, this is quite

    Thanks, this is quite interesting. Do you know if this has been used in practice? It says it could be used for national implementation, follow-up or review, which makes sense, but has it ever been used?

    The Danish Institute’s

    The Danish Institute’s database is to a large extent based on UPR Info’s (https://www.upr-info.org/database/), and the ML is used to predict the matching SDGs, which is interesting and useful but, I would argue, not the most relevant for follow-up or review.

    The easy availability of the individual recommendations, by state under review, recommending state, type of response and issues covered, is very useful in practice. Civil society and NHRIs use it to prepare for upcoming cycles, to identify potential allies among recommending states, and also for follow-up.

    That this is available is not thanks to machine learning, however, but to a rather convoluted process: extracting the individual recommendations (ca. 3,000 per session) from long Word files, turning them into a spreadsheet for data cleaning and manual tagging of issues, and then uploading them to the website.
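
    For what it's worth, the extraction step described above can at least be scripted. A hypothetical sketch with python-docx; the numbering pattern is an assumption, since real UPR reports vary:

    import csv
    import re
    from docx import Document  # pip install python-docx

    # Assumed pattern for numbered recommendations, e.g. "121.5 Ratify ..."
    REC = re.compile(r"^\d{1,3}\.\d{1,3}\.?\s+(.*)")

    def extract_recommendations(docx_path, csv_path):
        """Pull numbered recommendations from a Word file into a CSV."""
        doc = Document(docx_path)
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["recommendation"])
            for para in doc.paragraphs:
                m = REC.match(para.text.strip())
                if m:
                    writer.writerow([m.group(1)])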

    Could machine learning help to simplify this process, and potentially even add additional information, including on SDGs? Absolutely. But the current usefulness of the data is due to the hard, manual work of UPR Info, not to an arguably interesting application of ML.

    Other examples of using ML in the field of HR

    Comment originally posted by Natalie Widmann

    Here are some more HR projects which use machine learning:
    - the Programa de Derechos Humanos at the Ibero-American University in Mexico City, Data Cívica and the Human Rights Data Analysis Group (HRDAG) uncovered mass graves of missing people in Mexico: https://qz.com/958375/machine-learning-is-being-used-to-uncover-the-mass...
    - Amnesty International uses sentiment analysis to track Twitter abuse against Women MPs (https://medium.com/@AmnestyInsights/unsocial-media-tracking-twitter-abus..., the method can be found here: https://drive.google.com/file/d/0B3bg_SJKE9GOenpaekZ4eXRBWk0/view)
    - In another project, Amnesty International combines machine learning with crowd-coding to find evidence of destroyed homes and schools in Darfur's villages: https://decoders.amnesty.org/projects/decode-the-difference
    - The United Nations Global Pulse projects are also very interesting: https://www.unglobalpulse.org/projects

    Machine learning in detecting fake videos (including deepfakes)

    Comment originally posted by Sam Gregory

    WITNESS just led a convening in Silicon Valley of technologists, HRDs, journalists and researchers focused on proactive responses to the potential of 'synthetic media' and deepfakes to increase the risks of falsified human rights evidence, 'digital wildfire' on human rights issues, and micro-targeted hate speech. Machine learning and AI, particularly in the form of generative adversarial networks (GANs), clearly play a role in the creation of these deepfakes and synthetic media, but they can also play a role in detecting them. One recent example is the FaceForensics database (https://arxiv.org/abs/1803.09179), which uses a GAN to analyze and identify images created with deepfake and similar software; other work in this area tries to use machine learning and computer vision for image phylogeny, i.e. to determine whether an image is based on existing imagery, the background of another image, etc. Moving forward, one topic of discussion in the meeting was how to sync up these advances in ML-based automatic forensics with existing verification practices in human rights and journalism.


    Adversarial attacks to protect identity from facial detection

    Comment originally posted by Sam Gregory

    Another area to think about in terms of ML and human rights practice is how we mitigate anonymity/visibility risks for HRDs and other vulnerable practitioners. One key area here that preoccupies me (and I know Adam has worked on extensively) is the use of computer vision and ML in platforms and facial detection/recognition systems. An interesting experiment in using adversarial attacks to avoid scaled facial detection systems in a way that could be relevant for human rights practitioners sharing face images online is the work of EqualAIs, a project that came out of the Assembly Program at the Berkman-Klein Center at Harvard Law: http://equalais.media.mit.edu/

    They add an adversarial perturbation (not visible to the naked eye, but sufficient to confuse ML-based classification) to the image in order to defeat facial detection. Here's their description of an adversarial attack: "An adversarial attack is a perturbation deliberately applied to inputs in order to change classifier predictions while leaving the content easily recognizable to humans. Deep learning classifiers are known to be susceptible to adversarial attacks, with examples in previous work including fooling models into saying a stop sign isn’t there, a cat is guacamole or a toy turtle is a rifle. In our case, the classifiers are neural network models that have been trained to find faces in images. The adversarial attack is a perturbation of the images such that the faces in the perturbed images are still easily recognizable to humans but the model can no longer see them and reports with high confidence that the perturbed images contain no faces."
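
    For readers who want to see what such a perturbation looks like in code, here is the textbook fast gradient sign method (FGSM) in PyTorch. This is not EqualAIs' actual method, just the standard attack this family of techniques builds on; `model` stands in for any differentiable classifier:

    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, image, true_label, eps=0.01):
        """image: (1, C, H, W) tensor in [0, 1]; true_label: shape (1,) tensor.
        Returns an image nudged to *increase* the classifier's loss, so the
        model misclassifies while the change stays nearly invisible."""
        image = image.clone().requires_grad_(True)
        loss = F.cross_entropy(model(image), true_label)
        loss.backward()
        adversarial = image + eps * image.grad.sign()
        return adversarial.clamp(0, 1).detach()  # keep pixel values valid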


    One more example

    Comment originally posted by Vivian Ng

    Great to see all these examples of how machine learning has been applied to human rights work. Another example is UNHCR and the work of the Innovations team.

    (http://www.unhcr.org/innovation/experiments/) Some of my colleagues in the Human Rights, Big Data and Technology Project (https://www.hrbdt.ac.uk/) are working also on developing new approaches to human rights work using computational techniques, and machine learning has been a part of that work.


    CV for Visual Recognition of Human Rights Violations

    Hi everyone.

    Great to see all these examples of how ML has been applied to human rights advocacy.

    Another example is the UK Economic and Social Research Council funded research project ‘Human Rights, Big Data and Technology’, as mentioned by vivian.ng.

    In this project our team explores automated image-based technologies capable of identifying potential human rights abuses, and considers the ways in which they can support human rights investigators. Our focus is on computer vision-based techniques and image data for identifying potential human rights violations from a single image.
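
    To make that concrete, here is a hedged sketch of single-image inference with a fine-tuned CNN in PyTorch/torchvision. The label set, checkpoint name, and architecture are placeholders, not the project's actual models:

    import torch
    from torchvision import models, transforms
    from PIL import Image

    LABELS = ["no_violation", "destroyed_building", "armed_presence"]  # placeholders

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    model = models.resnet50(weights=None)  # torchvision >= 0.13
    model.fc = torch.nn.Linear(model.fc.in_features, len(LABELS))
    # model.load_state_dict(torch.load("violations.pt"))  # a trained checkpoint
    model.eval()

    def classify(path):
        """Return per-label probabilities for a single image."""
        x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            return dict(zip(LABELS, torch.softmax(model(x), dim=1)[0].tolist()))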

    You can check out some of our work at the following links:

    We are really interested in collaborating, given the overlap of interests in this thread.