[visionlist] Call for papers: Second Large Scale Holistic Video Understanding Workshop @ CVPR’21

Vivek Sharma vvsharma at mit.edu
Wed Mar 10 16:45:39 -04 2021

[Apologies for multiple copies due to cross-posting. Please forward to colleagues who might be interested]

Second Large Scale Holistic Video Understanding Workshop @CVPR’21

CVPR Dates: June 19-25, 2021 / Workshop Date: TBD




CAMERA READY: April 18, 2021

Please submit papers via CMT: https://cmt3.research.microsoft.com/HVU2021

WORKSHOP REGISTRATION: In conjunction with CVPR’21


In recent years, we have seen tremendous progress in the ability of computer systems to classify video clips taken from the Internet and to analyze human actions in videos. Much of the work in video recognition focuses on specific video understanding tasks, such as action recognition and scene understanding. These tasks have seen great achievements, but holistic video understanding has not received enough attention as a problem in its own right. Current systems are experts in specific subfields of the general video understanding problem. Real-world applications, however, such as analyzing multiple concepts of a video for video search engines and media monitoring systems, or providing an appropriate description of the surrounding environment of a humanoid robot, require a combination of current state-of-the-art methods.

Therefore, in this workshop, we intend to introduce holistic video understanding as a new challenge for video understanding efforts. The challenge focuses on the recognition of scenes, objects, actions, attributes, and events in real-world user-generated videos. To address such tasks, we also introduce our new dataset, Holistic Video Understanding (HVU), which is organized hierarchically in a semantic taxonomy of holistic video understanding. Almost all real-world video datasets target human action or sports recognition, so our new dataset can help the vision community and draw more attention to interesting solutions for holistic video understanding.

The workshop is tailored to bringing together ideas around multi-label and multi-task recognition of different semantic concepts in real-world videos. Research efforts can be tested on our new dataset.

HVU Dataset: https://github.com/holistic-video-understanding


  *   Large scale video understanding

  *   Multi-Modal learning from videos

  *   Multi-concept recognition from videos

  *   Multi-task deep neural networks for videos

  *   Learning holistic representation from videos

  *   Weakly supervised learning from web videos

  *   Object, scene and event recognition from videos

  *   Unsupervised video visual representation learning

  *   Unsupervised and self-supervised learning with videos


Cordelia Schmid, Google AI
Joao Carreira, Google DeepMind
Carl Vondrick, Columbia University
Dima Damen, University of Bristol
Sanja Fidler, University of Toronto
Kristen Grauman, University of Texas at Austin

For questions about the HVU workshop, please contact fayyaz at iai.uni-bonn.de. For the latest news, follow HVU on Twitter (https://twitter.com/LSHVU) or visit https://holistic-video-understanding.github.io/


Mohsen Fayyaz, University of Bonn

Ali Diba, KU Leuven

Vivek Sharma, Harvard, MIT

Juergen Gall, University of Bonn

Ehsan Adeli, Stanford University

Rainer Stiefelhagen, KIT

Luc Van Gool, ETH Zurich & KU Leuven

David Ross, Google AI

Manohar Paluri, Facebook AI

best, Vivek

Vivek Sharma,
Massachusetts Institute of Technology (MIT), USA
Harvard Medical School, Harvard University, USA
Web: http://media.mit.edu/~vvsharma
