[visionlist] Call for Papers: CVPR'19 Workshop "Analysis and Modeling of Faces and Gestures (AMFG2019)"

Allan Ding allanzmding at gmail.com
Tue Mar 5 09:55:02 -04 2019


We have experienced rapid advances in face, gesture, and cross-modality
(e.g., voice and face) technologies. This is thanks in large part to deep
learning (dating back to AlexNet in 2012) and to large-scale, labeled
image collections. The progress made in deep learning continues to push
renowned public databases to near saturation, which in turn calls for ever
more challenging image collections to be compiled as databases. In practice, and
even widely in applied research, using off-the-shelf deep learning models
has become the norm, as numerous pre-trained networks are available for
download and are readily deployed to new, unseen data (e.g., VGG-Face,
ResNet, among others). We have almost grown “spoiled” by such
luxury, which, in actuality, has let us stay hidden from many
truths. Theoretically, what makes neural networks more discriminative
than ever before is still unclear; rather, they act as a sort of black
box to most practitioners and researchers alike. More troublesome is the
absence of tools to quantitatively and qualitatively characterize existing
deep models, which could yield greater insight into these all-too-familiar
black boxes. With the
frontier moving forward at rates incomparable to any spurt of the past,
challenges such as high variation in illumination, pose, and age now
confront us. State-of-the-art deep learning models often fail when
faced with such challenges, owing to the difficulty of modeling structured
data and visual dynamics.

Alongside the effort spent on conventional face recognition is the research
done on cross-modality learning, such as face and voice, gestures in imagery
and motion in videos, along with several other tasks. This line of work has
attracted attention from industry and academic researchers from all sorts
of domains. Additionally, and in some cases alongside this, there has been
a push to advance these technologies for social-media-based applications.
Regardless of the exact domain and purpose, the following capabilities must be
satisfied: face and body tracking (e.g., facial expression analysis, face
detection, gesture recognition), lip reading and voice understanding, face
and body characterization (e.g., behavioral understanding, emotion
recognition), face, body and gesture characteristic analysis (e.g., gait,
age, gender, ethnicity recognition), group understanding via social cues
(e.g., kinship, non-blood relationships, personality), and visual sentiment
analysis (e.g., temperament, arrangement). Thus, the ability to
create effective models for visual certainty has significant value in both
the scientific communities and the commercial market, with applications
that span topics of human-computer interaction, social media analytics,
video indexing, visual surveillance, and internet vision. Currently,
researchers have made significant progress in addressing many of these
problems, especially considering the off-the-shelf and cost-efficient
vision HW products available these days, e.g., Intel RealSense, Magic Leap,
SHORE, and Affdex. Nonetheless, serious challenges remain, and they are only
amplified under the unconstrained imaging conditions captured by
different sources focused on non-cooperative subjects. It is these latter
challenges that especially grab our interest, as we seek to bring
together the cutting-edge techniques and recent advances of deep learning
to solve the challenges in the wild.

This one-day serial workshop (i.e., AMFG2019) provides a forum for
researchers to review recent progress in the recognition, analysis, and
modeling of face, body, and gesture, while embracing the most advanced deep
learning systems available for face and gesture analysis, particularly
in unconstrained environments such as social media and across modalities
such as face to voice. The workshop includes up to 3 keynotes and
peer-reviewed papers (oral and poster). Original high-quality contributions
are solicited on the following topics:

- Deep learning methodology and theory, as applied to social media
  analytics;
- Data-driven or physics-based generative models for faces, poses, and
  gestures;
- Deep learning for internet-scale soft biometrics and profiling: age,
  gender, ethnicity, personality, kinship, occupation, beauty ranking,
  and fashion classification by facial or body descriptor;
- Novel deep models, deep learning surveys, or comparative studies for
  face/gesture recognition;
- Deep learning for detection and recognition of faces and bodies with
  large 3D rotation, illumination change, partial occlusion,
  unknown/changing background, and aging (i.e., in the wild), especially
  face and gesture recognition robust to large 3D rotation;
- Motion analysis, tracking, and extraction of face and body models
  captured from several non-overlapping views;
- Face, gait, and action recognition in low-quality (e.g., blurred) or
  low-resolution video from fixed or mobile device cameras;
- AutoML for face and gesture analysis;
- Mathematical models and algorithms, sensors and modalities for face &
  body gesture and action representation, analysis, and recognition for
  cross-domain social media;
- Social and psychological studies that aid in understanding
  computational modeling and in building better automated face and
  gesture systems with interactive features;
- Multimedia learning models involving faces and gestures (e.g., voice,
  wearable IMUs, and face);
- Social applications involving detection, tracking & recognition of
  face, body, and action;
- Face and gesture analysis for sentiment analysis in social context;
- Other applications involving face and gesture analysis in social media.