[visionlist] CFP: CVPR Workshop on Language and Vision

Siddharth N siddharth at iffsid.com
Mon Mar 4 11:34:06 -04 2019

Fourth Workshop on Language and Vision @CVPR19


 June 16, 2019 @ Long Beach, CA
 in conjunction with CVPR 2019


The interaction between language and vision, despite gaining traction
of late, is still largely unexplored. This is a particularly
relevant topic to the vision community because humans routinely
perform tasks which involve both modalities. We do so largely without
even noticing. Every time you ask for an object, ask someone to
imagine a scene, or describe what you're seeing, you're performing a
task which bridges a linguistic and a visual representation. The
importance of vision-language interaction can also be seen in the
numerous approaches that cross domains, such as image grammars. More
concretely, we've recently seen a renewed
interest in one-shot learning for object and event models. Humans go
further than this using our linguistic abilities; we perform zero-shot
learning without seeing a single example. You can recognize a picture
of a zebra after hearing the description "horse-like animal with black
and white stripes" without ever having seen one.

Furthermore, integrating language with vision brings with it the
possibility of expanding the horizons and tasks of the vision
community. We have seen significant growth in image-to-text and
video-to-text tasks, but many other potential applications of such
integration – question answering, dialog systems, and grounded
language acquisition – remain largely unexplored. Going beyond such
novel tasks, language
can make a deeper contribution to vision: it provides a prism through
which to understand the world. A major difference between human and
machine vision is that humans form a coherent and global understanding
of a scene. This process is facilitated by our ability to inform our
perception with high-level knowledge, which provides resilience in the
face of errors from low-level perception. It also provides a framework
through which one can learn about the world: language can be used to
describe many phenomena succinctly, thereby helping filter out
irrelevant details.

Topics covered (non-exhaustive):

- language as a mechanism to structure and reason about visual perception,
- language as a learning bias to aid vision in both machines and humans,
- novel tasks which combine language and vision,
- dialogue as a means of sharing knowledge about visual perception,
- stories as a means of abstraction,
- transfer learning across language and vision,
- understanding the relationship between language and vision in humans,
- how computer vision can explain the neural underpinnings of language and vision,
- reasoning visually about language problems, and
- joint video and language alignment and parsing.

Call for papers

We are calling for submissions to be showcased at a poster session,
with selected papers given spotlights. In the spirit of fostering a
freer exchange of ideas, we welcome both novel and previously
published work. Submissions can hence be:
 a. 2-4 page extended abstracts (excluding references), or
 b. relevant full submissions (from current or recent conferences) for
    which the authors will have the option of providing an arXiv link.

Submissions are *not* archival, and will not be included in the
Proceedings of CVPR 2019, although links to papers can be provided for
display on the workshop website.

SUBMISSION DEADLINE: May 31st, 2019, Anywhere on Earth (AoE).

Organized by:

N. Siddharth, University of Oxford
Andrei Barbu, MIT
Dan Gutfreund, IBM
Philip Torr, University of Oxford
