[visionlist] Engineer position at Inria: 3D Human Pose Estimation from a Single Image with Deep Learning

Adnane Boukhayma adnane.boukhayma at gmail.com
Thu Apr 2 04:47:55 -04 2020

3D Human Pose Estimation from a Single Image with Deep Learning:


The engineer will work closely with Dr. Adnane Boukhayma and Prof. Franck
Multon. The work will be conducted at Inria Rennes in the MimeTIC research
team. This position is part of the KIMEA Cloud project, a collaboration
between Inria Rennes and the start-ups Moovency and Quortex. The goal of this
project is to assess the risk of musculoskeletal disorders using a
smartphone. The manufacturing industry is the sector most affected by
musculoskeletal disorders, in particular due to repetitive gestures and
frequent load carrying. Companies in this sector do not necessarily have
in-house ergonomics expertise and cannot always invest in technological
tools. Given only a video of a worker at their workstation, a Deep
Learning-based algorithm will estimate the 3D positions of the person’s
joints. The
musculoskeletal risks will be subsequently analyzed automatically from
these 3D postures. The role of Inria in this project is to research and
develop a robust solution for 3D human pose estimation from color images in
the wild, particularly in the industrial context.


3D human pose estimation is one of the fundamental problems and most active
research areas in computer vision, with applications in many fields
such as action recognition, human-machine interfaces, special effects and
telepresence. Despite recent advances in the scientific community,
monocular 3D human pose estimation in natural images remains far from being
solved.

The recent surge of Deep Learning has substantially improved the
performance of state-of-the-art methods for 2D and 3D human pose
estimation. In particular, a family of 3D pose estimators casts the problem
as lifting 2D predictions to 3D (e.g. [1,2,3,4]). They generally
outperform their end-to-end counterparts because they benefit from the
strong performance of current 2D pose estimators, and in part because
large-scale training image data with ground-truth 3D pose annotations is
lacking. We propose to follow this direction first: reproduce
state-of-the-art results, then explore further improvements and new
approaches that generalize better to natural images and challenging
capture conditions, reduce the dependence on 2D predictions, and use
incremental learning to update the learned models with new training
examples on the fly.

Within this role, the engineer will lead the development of a
deep-learning-based method for 3D human pose estimation from a single color
image. He/she may also take part in the research side of the project. The
results of this work are expected to be published in top-tier computer
vision conferences such as CVPR, ICCV and ECCV.

We propose the following course of action:
- 2D to 3D pose estimation lifting:
Developing a Deep Learning method that obtains 3D poses from 2D
poses. This task notably involves generating a simulated 2D/3D training set
from 3D motion capture data. The challenges are handling erroneous
2D skeletons under large occlusions and covering the multitude of
possible 3D viewpoints.
- Combining end-to-end 3D pose estimation and 2D-3D lifting:
Developing a Deep Learning method for 3D human pose estimation that can
learn simultaneously from image/3D, image/2D and 2D/3D annotation pairs.
Test cases include industrial postures and environments, as well as severe
capture conditions.
- Incremental learning:
Developing a method that allows the learned models to adapt
incrementally to new training data without forgetting their existing
knowledge. The objective is to avoid retraining the Deep Learning network
from scratch each time a new example is added.
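
As an illustration of the first item, the sketch below shows one possible
way to build simulated 2D/3D pairs from a motion-capture skeleton: rotate
the 3D joints to a random viewpoint, project them with a pinhole camera,
and optionally add noise to mimic an imperfect 2D detector. This is not the
project's actual pipeline. The function names, the camera parameters
(focal length, image center, distance) and the Gaussian noise model are all
assumptions made for the example, and the skeleton is assumed to be
centered near the origin, in meters.

```python
import math
import random


def rotation_y(angle):
    """Rotation matrix about the vertical (y) axis; angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def project_pose(joints_3d, azimuth, distance=4.0, focal=1000.0,
                 center=(500.0, 500.0)):
    """Rotate a skeleton, place it in front of the camera, and project it.

    All default values are hypothetical. The skeleton must stay closer to
    the origin than `distance` so that every depth remains positive.
    Returns (2D joints in pixels, 3D joints in camera coordinates).
    """
    R = rotation_y(azimuth)
    joints_2d, joints_cam = [], []
    for x, y, z in joints_3d:
        xc = R[0][0] * x + R[0][1] * y + R[0][2] * z
        yc = R[1][0] * x + R[1][1] * y + R[1][2] * z
        zc = R[2][0] * x + R[2][1] * y + R[2][2] * z + distance
        joints_cam.append((xc, yc, zc))
        # Pinhole projection: divide by depth, scale by the focal length.
        joints_2d.append((focal * xc / zc + center[0],
                          focal * yc / zc + center[1]))
    return joints_2d, joints_cam


def make_pairs(joints_3d, n_views, noise_std=0.0, seed=0):
    """Generate (2D, 3D) training pairs from random viewpoints.

    noise_std (in pixels) perturbs the 2D joints with Gaussian noise,
    a crude stand-in for 2D detector errors.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_views):
        azimuth = rng.uniform(0.0, 2.0 * math.pi)
        pts_2d, pts_3d = project_pose(joints_3d, azimuth)
        pts_2d = [(u + rng.gauss(0.0, noise_std),
                   v + rng.gauss(0.0, noise_std)) for u, v in pts_2d]
        pairs.append((pts_2d, pts_3d))
    return pairs
```

A lifting network would then be trained to map each 2D skeleton back to its
3D counterpart. Occlusions could be simulated in the same spirit, for
example by dropping or heavily perturbing some 2D joints.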

[1] Multi-person 2d and 3d pose detection in natural images. TPAMI, 2019.
[2] 3d human pose estimation = 2d pose estimation + matching. CVPR, 2017.
[3] A simple yet effective baseline for 3d human pose estimation. ICCV,
[4] 3d human pose estimation in the wild by adversarial learning. CVPR,


The engineer will be tasked with:
- Developing software for 3D human pose estimation from single color
images in the wild. The solution will be tested on industrial use cases
with possible occlusions and extreme capture situations.
- Depending on the progress of the project, developing an incremental
learning solution allowing the aforementioned 3D pose estimation model to
learn from new example cases without the need to retrain on all data, and
without any loss in the model's performance.
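
One common family of approaches to this kind of incremental learning is
rehearsal: keep a small memory of past examples and mix some of them into
each update on new data. The posting does not say which technique will be
used, so the sketch below is only an illustration. The class and function
names are hypothetical, the memory is filled by reservoir sampling, and
rehearsal reduces forgetting without guaranteeing zero loss in performance.

```python
import random


class ReplayBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one example; each example seen so far is kept with
        equal probability capacity / seen."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


def mixed_batch(new_examples, buffer, n_replay):
    """Build an update batch from new examples plus replayed old ones,
    then store the new examples in the buffer for future updates."""
    batch = list(new_examples) + buffer.sample(n_replay)
    for example in new_examples:
        buffer.add(example)
    return batch
```

Each batch returned by mixed_batch would be used for a few gradient steps
on the existing model, so that new cases are learned without retraining on
the full dataset.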

In practice, these tasks imply:
- Participating in research discussions and algorithm design.
- Reading and implementing research papers.
- Reproducing state-of-the-art results.
- Implementing the ideas proposed by the research collaborators.
- Creating training and testing datasets.
- Participating in the publication of the research results.


- Candidates should preferably have an MSc or PhD in computer science,
applied mathematics, computer vision, computer graphics or machine
learning.
- The ability to read, understand and implement research papers and
reproduce scientific results.
- Good coding skills (Python, C, C++).
- Proficiency in deep learning frameworks such as PyTorch is a plus.

*The keys to success:*

We are looking for excellent candidates, preferably with a solid background
in mathematics or computer science and good coding skills, who can work
independently and who are also keen to collaborate with other researchers.