[visionlist] Fwd: [Job Ad] Postdoc Position: Gesture Generation in Face-to-Face Dialogue
Esam Ghaleb
esamalabasi at gmail.com
Thu Sep 4 09:37:54 -05 2025
Postdoc Position: Gesture Generation in Face-to-Face Dialogue
We are looking for a postdoctoral researcher with experience in generative
AI, multimodal representation learning, and modelling face-to-face
dialogues for our NWO-funded project “Grounded Gesture Generation in
Context: Object- and Interaction-Aware Generative AI Models of Language
Use”. The preferred start date is 1 February 2026, but this is negotiable.
*Job description*
Face-to-face conversation is the primary setting for human language, where
meaning and coordination arise from interplay between speech, prosody,
gesture, facial expression, and gaze. Virtual humans are now prevalent in
social media, education, and healthcare. However, their non-verbal
behaviour, especially gestures, still lags behind the state of the art. This
project tackles that gap by building generative AI models that produce
context-aware, grounded gestures: responsive to objects and to
interlocutors, aligning speech and visual signals for richer, more natural
interaction. In this project, you will carry out research as described in
the project proposal, which includes:
- Developing generative models that produce gestures grounded in
context, e.g., reacting to a partner’s gesture or resembling an object they
refer to in the environment.
- Exploring how these models can enhance human-computer interaction and
improve our understanding of multimodal communication.
To do so, you will have full access to motion-capture and virtual-reality
labs, 3D animation tools, and GPU-based high-performance computing at MPI.
You will also be embedded in a rich theoretical and computational
environment supported by the Multimodal Language Department.
*Requirements*
*Essentials*
- PhD (completed or near completion) in Computer Science, Computer
Vision, NLP, Machine Learning, Computer Graphics/Animation, HCI, or a
related field.
- Strong background in deep generative modelling (diffusion
models/transformers) and multimodal representation learning, and experience
in computer graphics/animation.
- Strong programming skills, especially in Python. Ideally, you have
experience with PyTorch.
- Track record of publishing in top Vision/Graphics/Animation and
Human-Computer Interaction venues such as CVPR, ICCV, ICMI, or SIGGRAPH.
- Strong communication skills in English, both written and spoken.
*Desirable*
- Comfortable with prototyping experiments involving human participants,
using motion-capture and virtual-reality devices, 3D animation software,
and relevant tools (e.g., Unreal Engine, Maya, MotionBuilder).
- A strong interest in conducting research in the area of multimodal
language production and visual communicative behaviours of 3D virtual
humans (i.e., gesture, eye gaze, face).
- Good critical thinking and collaboration skills.
*What we offer you*
A challenging position in a scientifically engaged organisation. At the
MPI, you contribute to fundamental research. In return for your efforts, we
offer you:
- A position for 0.8 FTE. Your employment contract will initially last one
year; extension may be possible depending on new funding or performance. A
higher FTE is possible, but this shortens the duration of employment.
- Access to cutting-edge computing and animation facilities and a
supportive, interdisciplinary team.
- Structured support for your next career steps, whether in academia or
industry. For example, the successful candidate will receive help with
securing follow-up funding (e.g., Veni or NWO XS grants) or with moving to
the next career stage.
- Flexible working conditions: hybrid work, collaborative culture.
- Salary is in accordance with the German collective labour agreement
TVöD (Tarifvertrag für den öffentlichen Dienst) and is classified in salary
group E13.
- Pension scheme.
- Thirty days of leave per year (full-time employment).
*Application Procedure*
Do you recognise yourself in the job profile? Then we look forward to
receiving your application. The deadline to submit your application is
Friday, 31 October 2025, 23:59 (Amsterdam time). You can apply online via this link:
<https://recruitingapp-5569.de.umantis.com/Vacancies/463/Application/CheckLogin/2?lang=eng>.
Applications should include the following information:
- A one-page motivation letter, explaining why you are a good fit for
this position and including a link to a publication that is relevant to the
current project.
- A CV, including a list of publications, and a link to your PhD thesis.
- Contact details of two referees who could be asked to provide
recommendation letters.
*About the project*
This project will be advised by Dr. Esam Ghaleb
<https://esamghaleb.github.io/>, who is a research scientist in the
Multimodal Language Department working on computer vision, machine
learning, and behaviour & dialogue modelling. If you have questions about
the position that you wish to discuss before applying, please contact
Esam Ghaleb: esam.ghaleb at mpi.nl.
*The Employer*
*About our institute*
The Max Planck Institute (MPI) for Psycholinguistics <https://www.mpi.nl/> is
a world-leading research institute devoted to interdisciplinary studies of
the science of language and communication, with departments covering the
genetics, psychology, development, neurobiology, and multimodality of these
fundamental human abilities.
We investigate how children and adults acquire their language(s), how
speaking and listening happen in real time, how the brain processes
language, how the human genome contributes to building a language-ready
brain, how multiple modalities (as in speech, gesture, and sign) shape
language and its use in diverse languages, and how language is related to
cognition and culture and shaped by evolution.
We are part of the Max Planck Society <https://www.mpg.de/en>, an
independent non-governmental association of German-funded research
institutes dedicated to fundamental research in the natural sciences, life
sciences, social sciences, and the humanities.
The Max Planck Society is an equal opportunities employer
<https://www.mpi.nl/page/equal-opportunities>. We recognize the positive
value of diversity and inclusion, promote equity, and challenge
discrimination. We aim to provide a working environment with room for
differences, where everyone feels a sense of belonging. Therefore, we
welcome applications from all suitably qualified candidates.
Our institute is situated on the campus of the Radboud University and has
close collaborative links with the Donders Institute for Brain, Cognition
and Behavior <https://www.ru.nl/donders/> and the Centre for Language
Studies <https://www.ru.nl/cls/> at the Radboud University. We also work
closely with other child development researchers as part of the
Baby & Child Research Center <https://www.babyandchild.nl/en>.
Staff and students at the MPI have access to state-of-the-art research and
training facilities <https://www.mpi.nl/research-facilities>.
*About the Multimodal Language Department*
The Multimodal Language Department
<https://www.mpi.nl/department/multimodal-language-department/23> aims to
understand the cognitive and social foundations of the human ability for
language and its evolution, focusing on its multimodal aspects and
crosslinguistic diversity. Research at the department combines multiple
methods, including corpus and computational linguistics, psycho- and
neurolinguistics, machine learning, AI, and virtual reality, and is
concerned with various populations, ranging from speakers of signed and
spoken languages to young and older participants from typical and atypical
populations. The department provides opportunities for training in
state-of-the-art methods for multimodal language analysis (such as motion
capture and automatic speech recognition), as well as in neuropsychological
and psychological methods related to multimodal language, frequent research
and public engagement meetings, and support from an excellent team of
researchers in linguistics and psycholinguistics.
Kind regards,
*Esam Ghaleb, PhD*
Research Staff at the Multimodal Language Department
Max Planck Institute for Psycholinguistics
https://esamghaleb.github.io/
https://www.mpi.nl/people/ghaleb-esam
Room 309 | Wundtlaan 1 | 6525 XD Nijmegen | NL
P.O. Box 302 | 6500 AH Nijmegen | NL
I: www.mpi.nl | Twitter: @MPI_NL <https://twitter.com/MPI_NL>