
Cross-modal Transfer Between Vision and Language for Protest Detection

Abstract

Multimodal data (data with two or more modalities, such as text, images, or audio) has attracted growing attention over the past year. One example is the image-generating model DALL·E 2, released by OpenAI, which combines the image and text modalities. Even though multimodal models have shown great potential, most of today's systems for socio-political event detection are text-based.

In this presentation, we discuss an approach that uses the increasing amount of multimodal data to reduce the need for annotation, as presented in our paper "Cross-modal Transfer Between Vision and Language for Protest Detection." We propose a method that uses existing annotated unimodal data to perform zero-shot event detection in another data modality. Specifically, we focus on protest detection in text and images and show that a pretrained vision-and-language alignment model (CLIP) can be leveraged to this end. In particular, our results suggest that annotated protest text data can supplement protest detection in images, and that significant transfer also occurs in the opposite direction.
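To make the idea of cross-modal transfer concrete, here is a minimal sketch, not the method from the paper. It assumes that text and images are already embedded in one shared vector space (the role CLIP plays in the talk) and that a simple linear classifier is trained on the text side only. The embeddings are synthetic stand-ins generated with NumPy, and every name in the code (`fake_embeddings`, `protest_direction`, the noise scales) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 32  # size of the (synthetic) shared embedding space


def l2_normalize(x):
    """Scale each row to unit length, as is common for aligned embeddings."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def fake_embeddings(labels, protest_direction, noise_scale):
    """Hypothetical stand-in for an encoder that maps inputs to the shared space.

    Protest examples (label 1) lie along +protest_direction, others along
    the negative direction, with Gaussian noise added.
    """
    signs = np.where(labels == 1, 1.0, -1.0)[:, None]
    noise = rng.normal(scale=noise_scale, size=(len(labels), DIM))
    return l2_normalize(signs * protest_direction + noise)


def train_logistic_regression(x, y, lr=0.5, steps=500):
    """Plain gradient descent on the logistic loss."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        grad = p - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict(x, w, b):
    """Label an embedding as protest (1) when the logit is positive."""
    return (x @ w + b > 0).astype(int)


# Assumption: both modalities share the same "protest" direction.
protest_direction = l2_normalize(rng.normal(size=(1, DIM)))[0]

# Annotated data exists only for text; image labels are used for evaluation.
text_labels = rng.integers(0, 2, size=400)
image_labels = rng.integers(0, 2, size=200)
text_emb = fake_embeddings(text_labels, protest_direction, noise_scale=0.15)
image_emb = fake_embeddings(image_labels, protest_direction, noise_scale=0.25)

# Train on text embeddings, then apply the classifier unchanged to images.
w, b = train_logistic_regression(text_emb, text_labels)
image_pred = predict(image_emb, w, b)
accuracy = (image_pred == image_labels).mean()
print(f"zero-shot image accuracy on synthetic data: {accuracy:.2f}")
```

With a real alignment model, the two `fake_embeddings` calls would be replaced by the model's text and image encoders. The transfer works in this toy setting only because both modalities were constructed to share one direction, which is the property a pretrained alignment model is meant to approximate.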

Ria Raj

Software Engineer @ Recorded Future

Ria has a background in automation, mechatronics, and machine learning. She works as a software engineer in the Threat Intelligence team at Recorded Future. Her interest in AI grew out of learning about natural language processing, and today she calls herself an enthusiast, both professionally and personally. She hopes to help use ML and AI to make our world a safer place!

Kajsa Andreasson

Software Engineer @ Recorded Future

Kajsa is a software engineer in the Text Analytics team at Recorded Future with a great passion for natural language processing, artificial intelligence techniques, and how they can be used to make our world a better and safer place. She recently graduated from the master's programme Complex Adaptive Systems at Chalmers, and on a day-to-day basis, she loves implementing state-of-the-art methods as efficiently as possible.