[visionlist] 1st Call-for-Participation & data release: MediaEval 2021 Predicting Video Memorability Task

Sat Sep 18 17:01:13 -04 2021

[Apologies for cross-postings]

*******************************************************
1st CALL FOR PARTICIPATION & DEVELOPMENT DATA RELEASE
Predicting Video Memorability Task
2021 MediaEval Benchmarking Initiative for Multimedia Evaluation
https://multimediaeval.github.io/editions/2021/tasks/memorability/
*******************************************************
Register to participate by filling in the MediaEval 2021 Registration
form: https://docs.google.com/forms/d/e/1FAIpQLSchIcIaSlM1fNeWGCSoSBMR6HS48HKMhWEY151vvCmb5KhO-w/viewform
*******************************************************
Annotations: https://annotator.uk/mediaeval/index.php
*******************************************************

The Predicting Video Memorability Task focuses on the problem of
predicting how memorable a video will be. It requires participants to
automatically predict memorability scores for videos, which reflect
the probability of a video being remembered.

Participants will be provided with an extensive dataset of videos with
memorability annotations, and pre-extracted state-of-the-art visual
features. The ground truth has been collected through recognition
tests, and, for this reason, reflects objective measures of memory
performance. In contrast to previous work on image memorability
prediction, where memorability was measured a few minutes after
memorisation, the dataset comes with short-term and long-term
memorability annotations. Because memories continue to evolve in
long-term memory, in particular during the first day following
memorisation, we expect long-term memorability annotations to be more
representative of long-term memory performance, which is the measure
preferred in numerous applications.

*******************************************************
Video-based prediction task
*******************************************************
Participants will be required to train computational models capable of
inferring video memorability from visual content. Optionally,
descriptive titles attached to the videos may be used. Models will be
evaluated through standard evaluation metrics used in ranking tasks.
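
As an illustration of a ranking metric, the sketch below computes
Spearman's rank correlation between predicted and ground-truth scores.
This is an assumption made for illustration: the call does not name the
official metrics, and the scores and function names below are
hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, ground_truth):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(ground_truth))


# Hypothetical memorability scores for five videos (illustrative only).
ground_truth = [0.95, 0.80, 0.72, 0.60, 0.91]
predicted = [0.90, 0.85, 0.70, 0.65, 0.80]
rho = spearman(predicted, ground_truth)
print(f"Spearman's rho: {rho:.3f}")
```

A rank-based metric of this kind rewards a system for ordering the
videos correctly, even when its absolute scores are offset from the
ground truth.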

*******************************************************
Generalization task (optional)
*******************************************************
The aim of the Generalization subtask is to check system performance
on other types of video data. Participants will use their systems,
trained on one of the two sources of data we propose, to predict the
memorability of videos from the testing set of the other source of
data. We believe this will provide interesting insights into the
performance of the developed systems, given that, while the two
sources of data measure memorability in a similar way, the videos may
differ somewhat in content, general subject matter or length.
Participation in this subtask is optional.

*******************************************************
Memorability-EEG demonstration task (pilot)
*******************************************************
The aim of the Memorability-EEG pilot task is to promote interest in
the use of neural signals, either alone or in combination with other
data sources, for predicting video memorability by demonstrating what
EEG data can provide. The dataset will be a set of features
pre-extracted from the EEG for a subset of videos from task 1. This
demonstration pilot will let interested researchers see how they could
use neural signals in a future Memorability task without needing the
domain knowledge that EEG processing normally requires, potentially
increasing interdisciplinary interest in the subject of memorability
and opening the door to novel approaches that combine EEG and computer
vision to predict video memorability.

Pre-selected participants in this pilot demonstration will use the
dataset to explore all manner of machine learning and processing
strategies to predict video memorability. They will then present their
findings, which will ultimately contribute towards the collaborative
definition of a fully-fledged task at MediaEval 2022, where
participating teams will submit runs and be benchmarked.

***********************
Target communities
***********************
Researchers will find this task interesting if they work in areas
related to human perception and scene understanding, including (but
not limited to) image and video interestingness, memorability,
attractiveness, aesthetics prediction, event detection, multimedia
affect and perceptual analysis, multimedia content analysis, and
machine learning.

***********************
Data
***********************
The first dataset is composed of a subset of 6,000 short videos
retrieved from the TRECVid 2019 Video to Text dataset [1]. Each video
consists of a coherent unit in terms of meaning and is associated with
two memorability scores that refer to its probability of being
remembered after two different durations of memory retention. As in
previous editions of the task [2], memorability has been measured
using recognition tests, i.e., through an objective measure, a few
minutes after the memorisation of the videos (short term), and then 24
to 72 hours later (long term). The videos are shared under Creative
Commons licenses that allow their redistribution. They come with a set
of pre-extracted features, such as histograms in the HSV and RGB
spaces, HOG, LBP, and deep features extracted from AlexNet, VGG and
C3D. In comparison to the videos used for this task in 2018 and 2019,
the TRECVid videos have much more action happening in them and thus
are more interesting for subjects to view.
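
To make the idea of a score as a "probability of being remembered"
concrete, here is a simplified sketch that assumes a score is just the
fraction of annotators who correctly recognised a video in the
recognition test. This is an assumption for illustration only: the
actual annotation protocol may apply corrections that are not described
here, and all responses and names below are hypothetical.

```python
def hit_rate(responses):
    """Fraction of annotators who recognised the video (True = recognised)."""
    if not responses:
        raise ValueError("no responses for this video")
    return sum(responses) / len(responses)


# Hypothetical recognition responses for one video (illustrative only).
# Short term: tested a few minutes after memorisation.
short_term_responses = [True, True, False, True, True, True, False, True]
# Long term: tested 24 to 72 hours later.
long_term_responses = [True, False, False, True, True, False, False, True]

short_term_score = hit_rate(short_term_responses)
long_term_score = hit_rate(long_term_responses)
print(f"short-term: {short_term_score:.2f}, long-term: {long_term_score:.2f}")
```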

Additionally, we will open the Memento10k dataset to participants.
This dataset contains 10,000 three-second videos depicting in-the-wild
scenes, with their associated short-term memorability scores,
memorability decay values, action labels, and 5 accompanying captions.
7,000 videos will be released as a training set, and 1,500 will be given
for validation. The last 1,500 videos will be used as the test set for
scoring submissions. The scores are computed with 90 annotations per
video on average, and the videos were muted before being shown to
annotators. We will also distribute a set of features for each video
analogous to the TRECVid set.

***********************
Annotations
***********************
We need more annotations for the dataset and kindly ask for your
help. Please visit https://annotator.uk/mediaeval/index.php and play
the fun annotation game to contribute to the dataset and get familiar
with the data. Thanks in advance for your contribution.

******************************
Workshop
******************************
Participants in the task are invited to present their results during
the annual MediaEval Workshop, which will be held in Bergen, Norway,
with the opportunity for online participation, on 6-8 December 2021.
Working notes proceedings will appear in the CEUR Workshop Proceedings
(ceur-ws.org).

******************************
Important dates (tentative)
******************************
(open) Participant registration: July
Data release: 15 September
Runs due: 11 November
Working notes papers due: 22 November
MediaEval Workshop: 6-8 December, in Bergen, Norway with opportunity
for online participation

***********************
Task coordination
***********************
Alba García Seco de Herrera, <alba.garcia(at)essex.ac.uk>, University
of Essex, UK
Rukiye Savran Kiziltepe, <rs16419(at)essex.ac.uk>, University of Essex, UK
Mihai Gabriel Constantin, <cmihaigabriel(at)gmail.com>, University
Politehnica of Bucharest, Romania
Bogdan Ionescu, University Politehnica of Bucharest, Romania
Alan Smeaton, Dublin City University, Ireland
Claire-Hélène Demarty, InterDigital, R&I, France
Sebastian Halder, University of Essex, UK
Ana Matrán-Fernández, University of Essex, UK
Camilo Fosco, Massachusetts Institute of Technology, Cambridge,
Massachusetts, USA
Lorin Sweeney, Dublin City University, Ireland
Graham Healy, Dublin City University, Ireland

On behalf of the organizers,
Bogdan Ionescu
https://www.AIMultimediaLab.ro/