Research group on the applications of machine learning to compression

We have started a new study group on the applications of machine learning to compression. Check it out here:

Machine learning compression | Research group on the applications of machine learning to compression

Simulation of conversational groups of virtual agents expressing interpersonal attitudes

The system presented in the following video simulates conversational groups of virtual agents. Each agent is endowed with a computational model that allows it to modulate its nonverbal behaviors (body orientation, distance from other agents, gaze, facial expression and gestures) and to adapt its turn-taking strategies in real time, depending on the attitudes it expresses towards the others. This allows for the generation of various conversational styles supporting the various roles the agents may have in the group (e.g. leader or friend).


A new acted emotional body behavior database

We collected a new corpus for the analysis and synthesis of emotional body movements. The movements of 11 actors (6 female and 5 male) were captured while they expressed 8 emotional states (Joy, Anger, Panic Fear, Anxiety, Sadness, Shame, Pride and Neutral) described with 3 scenarios and performed through 7 different movement tasks. Each movement task was performed several times to capture a wide range of data.


Fourati, N., Pelachaud, C., A new acted emotional body behavior database. IVA 2013 Workshop on Multimodal Corpora (MMC2013): Beyond Audio and Video, 2013

Expressive rendering and animation

Facial expression is a rather complex system. It results from muscle contraction. Wrinkles often appear on the face as a result of some facial expressions and are an integral part of the expression. The Facial Action Coding System (FACS) describes, for each action unit (AU), not only the changes due to muscle contraction but also the appearance of wrinkles. As facial actions can be extremely subtle, in this work we aim to measure how wrinkles can enhance the recognition of facial actions at a perceptual level.


Radoslaw Niewiadomski, Jing Huang, and Catherine Pelachaud. Effect of facial cues on identification. In The 25th Annual Conference on Computer Animation and Social Agents (CASA 2012), page 4, Singapore, 2012


We detail how to animate a human character using its skeleton by controlling a very small number of articulations, while using a novel Inverse Kinematics method to optimize the global structure of the skeleton. We also show how to combine these standard deformation pipelines with our novel Expressive animation solution, which improves on state-of-the-art methods. We show the benefits of this technique in the context of human-machine interaction systems, where realism is key to the comfort of the interaction between a human and an embodied conversational agent.


Jing Huang and Catherine Pelachaud. Expressive body animation pipeline for virtual agent. In Yukiko Nakano, Michael Neff, Ana Paiva, and Marilyn Walker, editors, Intelligent Virtual Agents, 12th International Conference on Intelligent Virtual Agents, volume 7502 of Lecture Notes in Computer Science, pages 355–362. Springer Berlin Heidelberg, 2012

Jing Huang and Catherine Pelachaud. An efficient energy transfer inverse kinematics solution. In Proceedings of Motion In Game 2012, volume 7660, pages 278–289, Berlin, Heidelberg, 2012. LNCS


Etoile Plugin System: a dynamically linked plug-in system. It uses a graph-node flow to control the different modules in the system. Rendering, physics and certain multimedia components have been integrated.


Modeling Multimodal Behaviors From Speech Prosody

To synthesize the head and eyebrow movements of virtual characters, we have developed a fully parameterized Hidden Markov Model (FPHMM), an extension of the Contextual HMM (CHMM). We first learn an FPHMM that takes speech features as contextual variables and produces motion feature observations. During the training phase, both the motion and speech streams are used to learn the FPHMM. During the motion synthesis phase (i.e. the animation generation phase), only the speech stream is known.



Y. Ding, C. Pelachaud and T. Artières, Modeling Multimodal Behaviors from Speech Prosody, Intelligent Virtual Agents, August 2013, vol. 8108, pp. 217-228.

Y. Ding, M. Radenen, T. Artières and C. Pelachaud, Speech-driven eyebrow motion synthesis with contextual markovian models, ICASSP, Canada, May 2013, pp. 3756-3760.

Demo of Nao with Expressive Gesture

Within the French project ANR GV-LeX, we have developed an expressive communicative gesture model for the humanoid robot Nao. The project aims to equip a humanoid robot with the capacity to produce gestures while telling a story to children. To reach this objective, we have extended our existing virtual agent platform GRETA and adapted it to the robot. Gestural prototypes are described symbolically and stored in a gestural database called a lexicon. Given a set of intentions and emotional states to communicate, the system selects the corresponding gestures from the robot lexicon. The selected gestures are then planned so that they are synchronized with speech, and finally instantiated as robot joint values while taking into account the gestural expressivity parameters of temporal extension, spatial extension and repetitivity.
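The last step of this pipeline, shaping a planned gesture with expressivity parameters, can be illustrated with a minimal sketch. It is not the GRETA or Nao implementation: the `Keyframe` class, the `apply_expressivity` function and the way each parameter is applied (scaling joint offsets from a rest pose, stretching keyframe times, appending copies of the gesture) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    time: float   # seconds from the start of the gesture
    joints: dict  # joint name -> angle in radians (hypothetical joint names)


def apply_expressivity(keyframes, rest_pose, spatial=1.0, temporal=1.0, repetitions=0):
    """Return a new keyframe list shaped by three illustrative parameters.

    spatial     -- scales each joint's offset from the rest pose (amplitude)
    temporal    -- scales keyframe times (values above 1 slow the gesture down)
    repetitions -- number of extra copies of the gesture appended at the end
    """
    shaped = [
        Keyframe(
            time=kf.time * temporal,
            joints={
                name: rest_pose[name] + spatial * (angle - rest_pose[name])
                for name, angle in kf.joints.items()
            },
        )
        for kf in keyframes
    ]
    duration = shaped[-1].time if shaped else 0.0
    result = list(shaped)
    for r in range(1, repetitions + 1):
        # Skip the first keyframe of each copy: it coincides in time with
        # the last keyframe of the previous copy.
        result.extend(
            Keyframe(kf.time + r * duration, dict(kf.joints)) for kf in shaped[1:]
        )
    return result


if __name__ == "__main__":
    rest = {"shoulder": 0.0}
    beat = [
        Keyframe(0.0, {"shoulder": 0.0}),
        Keyframe(0.5, {"shoulder": 1.0}),
        Keyframe(1.0, {"shoulder": 0.0}),
    ]
    for kf in apply_expressivity(beat, rest, spatial=0.5, temporal=2.0, repetitions=1):
        print(kf.time, kf.joints)
```

In this sketch a smaller `spatial` value gives a more restrained gesture and a larger `temporal` value a slower one; a real robot controller would additionally have to respect joint limits and maximum joint velocities.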


Q. A. Le, C. Pelachaud, Evaluating an Expressive Gesture Model for a Humanoid Robot: Experimental Results, submitted to the 8th ACM/IEEE International Conference on Human Robot Interaction (HRI 2013), 3-6 March, 2013, Tokyo, Japan

Q. A. Le, J.-F. Huang, C. Pelachaud, A Common Gesture and Speech Production Framework for Virtual and Physical Agents, accepted to the 14th ACM International Conference on Multimodal Interaction, Workshop on Speech and Gesture Production in Virtually and Physically Embodied Conversational Agents, 2012, Santa Monica, CA, USA.

Le, Q.A., Hanoune, S. and Pelachaud, C. Design and implementation of an expressive gesture model for a humanoid robot. 11th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2011), Bled, Slovenia on October 26th to 28th, 2011.

Le, Q.A. and Pelachaud, C. Expressive Gesture Model for Humanoid Robot. International Conference of the HUMAINE Association on Affective Computing and Intelligent Interaction (ACII2011), Memphis, Tennessee, USA on October 9th to 12th, 2011.

Le Quoc Anh, Catherine Pelachaud. Generating co-speech gestures for the humanoid robot NAO through BML. Lecture Notes in Computer Science (LNCS) 2011: The 9th International Gesture Workshop. Gesture in Embodied Communication and Human-Computer Interaction, May 25-27, 2011, Athens, Greece.

Demo E-SmilesCreator and GenAttitude

One of the key challenges in the development of social virtual actors is to give them the capability to display socio-emotional states through their non-verbal behavior. Based on studies in human and social sciences or on annotated corpora of human expressions, different models to synthesize virtual agent’s non-verbal behavior have been developed. One of the major issues in the synthesis of behavior using a corpus-based approach is the datasets themselves, which can be difficult, time consuming and expensive to collect and annotate. A growing interest in using crowdsourcing to collect and annotate datasets has been observed in recent years. In this paper, we have implemented a toolbox to easily develop online crowdsourcing tools to build a corpus of virtual agent’s non-verbal behaviors directly rated by users. We present two online crowdsourcing tools that have been used to construct a repertoire of virtual smiles and to define virtual agents’ non-verbal behaviors associated with social attitudes.





A crowdsourcing method for a user-perception based design of social virtual actors
Magalie Ochs, Brian Ravenet, and Catherine Pelachaud
International Workshop “Computers are Social Actors” (CASA), Intelligent Virtual Agent Conference (IVA),  Edinburgh, Scotland, 2013

From a User-Created Corpus of Virtual Agent’s Non-Verbal Behavior to a Computational Model of Interpersonal Attitudes
Brian Ravenet, Magalie Ochs and Catherine Pelachaud
Intelligent Virtual Agent Conference (IVA), Edinburgh, Scotland, 2013

Socially Aware Virtual Characters: The Social Signal of Smiles
Magalie Ochs and Catherine Pelachaud
IEEE Signal Processing Magazine, Vol 30 (2), p. 128-132, March 2013

Smiling Virtual Agent in Social Context
Magalie Ochs, Radoslaw Niewiadomski, Paul Brunet, and Catherine Pelachaud
Cognitive Processing, Special Issue on “Social Agents. From Theory to Applications” (impact factor: 1.754), vol. 13 (22), pages 519-532, 2012.



Seminar by Edoardo Provenzi

Edoardo Provenzi (currently a postdoc at TSI) will present his work on:

“Perceptually-inspired enhancement of color LDR and HDR images: a variational perspective”
On Tuesday, October 15th at 10:30 in room DB312 (Dareau site).
The seminar will discuss a recently proposed variational framework, in both the spatial and the wavelet domain, that can embed several existing perceptually-inspired color enhancement algorithms. It can be proven that the properties of the human visual system are satisfied only by a class of energy functionals given by the balance between a local, illumination-invariant contrast enhancement and an entropy-like adjustment to the average radiance. Within this framework, new measures of perceived contrast are proposed. However, while their mathematical definition is firm, their psychophysical validation is still lacking. Rigorous experiments performed with high dynamic range screens may provide a solution to this problem.
Short bio:
Edoardo Provenzi received his Master's degree in Physics from the University of Milano, Italy, in 2000 and his PhD in Mathematics from the University of Genova, Italy, in 2004. His work in computer vision spans different disciplines: mathematical foundations of perceptually-inspired color correction algorithms, variational and wavelet analysis of perceived contrast, high dynamic range imaging, motion segmentation and optimal histogram transportation. He is currently a postdoctoral researcher at Telecom ParisTech.

Snowball effect

Agents start the interaction desynchronised and, after a while, stabilise in synchrony

Demonstrations of SSPNet project

SSPNet (Social Signal Processing Network) is a European Network of Excellence (NoE) which addresses Social Signal Processing.

SSPNet activities revolve around two research foci selected for their primacy in our everyday life:

* Social Signal Processing in Human-Human Interaction
* Social Signal Processing in Human-Computer Interaction

Hence, the main focus of the SSPNet is on developing and validating the scientific foundations and engineering principles (including resources for experimentation) required to address the problems of social behaviour analysis, interpretation, and synthesis. The project focuses on multimodal approaches aimed at: (i) interpreting information for better understanding of human social signals and implicit intentions, and (ii) generating socially adept behaviour of embodied conversational agents. It will consider how we can model, represent, and employ human social signals and behaviours to design autonomous systems able to know, either through their design or via a process of learning, how to understand and respond to human communicative signals and behaviour.

Different types of virtual character smiles

In this context, our research focuses on how communication between humans and virtual characters can be facilitated by appropriate non-verbal behaviours of the virtual characters. As a first step, we have studied different types of virtual character smiles (amused smile, polite smile, etc.). In the following videos, you can see different types of virtual character smiles, as well as examples of virtual characters that express different smiles when they tell a funny story.


Ochs, M., Niewiadomski, R., Pelachaud, C., How a Virtual Agent Should Smile? – Morphological and Dynamic Characteristics of Virtual Agent’s Smiles, in Proceedings of the 10th International Conference on Intelligent Virtual Agents, Philadelphia, USA, pp. 427-440, 2010.

Synchrony emergence between dialog partners

Another way of making communication between humans and conversational agents easier is to enable dynamics linked to the quality of the interaction to emerge within the dyad (or within the group of interactants). Among others, during dialog, synchrony between the non-verbal behaviours of agents is characteristic of the quality of their communication, i.e. it depends on their mutual understanding and on the amount of information they exchange.

In the following videos, you can see synchrony appear between agents simply through the establishment of a coupling between them when they both understand and perceive each other.

Agents start the interaction desynchronised and, after a while, stabilise in synchrony

The upper-right agent does not understand what is said (impossible coupling). The lower-right agent understands what is said and sees the speaker but is not seen back (impossible coupling). The upper-left agent understands what is said, perceives the speaker and is perceived by the speaker (coupling and synchrony constitute the stable state of the dyad they form).
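The general idea that synchrony can emerge from mutual coupling alone can be illustrated with a textbook model of two coupled phase oscillators (Kuramoto-style). This is a generic sketch, not the model used in the videos or in the paper below: the function name `phase_gap_drift`, the natural frequencies, the coupling strength and the initial phases are all assumptions chosen for illustration, and the sketch only contrasts mutual coupling with no coupling at all.

```python
import math


def phase_gap_drift(coupling, steps=5000, dt=0.01, freq_a=1.0, freq_b=1.3):
    """Simulate two phase oscillators with symmetric coupling (Euler steps).

    Returns how much the phase gap between the two oscillators still varies
    over the last 1000 steps: close to zero means they have locked into
    synchrony, a large value means they keep drifting apart.
    """
    theta_a, theta_b = 0.0, 2.0  # arbitrary, desynchronised initial phases
    tail = []
    for step in range(steps):
        gap = theta_b - theta_a
        if step >= steps - 1000:
            tail.append(gap)
        # Each oscillator is pulled towards the phase of the other one.
        theta_a += (freq_a + coupling * math.sin(gap)) * dt
        theta_b += (freq_b - coupling * math.sin(gap)) * dt
    return max(tail) - min(tail)


if __name__ == "__main__":
    print("mutual coupling:", phase_gap_drift(coupling=1.0))
    print("no coupling:    ", phase_gap_drift(coupling=0.0))
```

With mutual coupling the phase gap settles to a constant value even though the two oscillators have different natural frequencies; without coupling the gap keeps growing at the rate of the frequency difference.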


K. Prépin and C. Pelachaud, Shared Understanding and Synchrony Emergence: Synchrony as an Indice of the Exchange of Meaning between Dialog Partners, Third International Conference on Agents and Artificial Intelligence, ICAART2011, Rome, Italy, January 2011, pp. 1-10