Open position in the team

Source: Marco Cagnazzo’s web site

PhD Defense: Shuo Zheng

Shuo Zheng’s PhD defense will take place on 5th February at 10 am, in Amphi Opale at Télécom ParisTech (46 rue Barrault, 75013 Paris). The jury is composed of:


  • Mr François-Xavier Coudoux, Université Polytechnique Hauts-de-France, Referee
  • Mrs Aline Roumy, INRIA Rennes, Referee
  • Mr Jean-Marie Gorce, INSA Lyon, Examiner
  • Mr Marc Leny, Ektacom, Examiner
  • Mrs Michèle Wigger, Télécom ParisTech, Examiner, Jury’s Chair
  • Mr Marco Cagnazzo, Télécom ParisTech, Advisor
  • Mr Michel Kieffer, Université Paris-Sud, Advisor

Title: Accounting for Channel Constraints in Joint Source-Channel Video Coding Schemes

Abstract: SoftCast-based Linear Video Coding (LVC) schemes have emerged in the last decade as a quasi-analog joint source-channel alternative to classical video coding schemes. Theoretical analyses have shown that analog coding outperforms digital coding in a multicast scenario when the channel signal-to-noise ratios (C-SNR) differ among receivers. In such a context, LVC schemes provide a decoded video quality at different receivers proportional to their C-SNR. This thesis first considers the channel precoding and decoding matrix design problem for LVC schemes under a per-subchannel power constraint. Such a constraint is found, e.g., on Power Line Telecommunication (PLT) channels and is similar to per-antenna power constraints in multi-antenna transmission systems. An optimal design approach is proposed, involving a multi-level water-filling algorithm and the solution of a structured Hermitian inverse eigenvalue problem. Three lower-complexity suboptimal algorithms are also proposed. Extensive experiments show that the suboptimal algorithms perform close to the optimal one while significantly reducing complexity. The precoding matrix design in multicast situations has also been considered. A second main contribution is an impulse-noise mitigation approach for LVC schemes. Impulse noise identification and correction can be formulated as a sparse vector recovery problem. A Fast Bayesian Matching Pursuit (FBMP) algorithm is adapted to LVC schemes. Provisioning subchannels for impulse noise mitigation is necessary, leading to a nominal video quality decrease in the absence of impulse noise. A phenomenological model (PM) is proposed to describe the impulse noise correction residual. Using the PM, an algorithm to evaluate the optimal number of subchannels to provision is proposed. Simulation results show that the proposed algorithms significantly improve the video quality when transmitted over channels prone to impulse noise.
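The multi-level water-filling algorithm of the thesis extends the textbook single-constraint water filling for power allocation. As background, a minimal sketch of the classical version (a toy baseline, not the per-subchannel-constrained variant developed in the thesis) might look like:

```python
import numpy as np

def water_filling(noise, total_power, tol=1e-9):
    """Classic water filling: split total_power across subchannels with
    noise levels `noise` to maximize sum(log(1 + p_i / n_i)).
    The optimum is p_i = max(0, mu - n_i) for a water level mu,
    found here by bisection on the power budget."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + total_power
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - noise, 0.0).sum() > total_power:
            hi = mu  # water level too high: budget exceeded
        else:
            lo = mu
    return np.maximum(lo - noise, 0.0)

# Noisier subchannels receive less power; the noisiest may get none.
powers = water_filling([0.1, 0.5, 1.0], total_power=1.0)
print(powers)  # ≈ [0.7, 0.3, 0.0]
```

The per-subchannel power constraint studied in the thesis changes this picture: each p_i is individually capped, which is what motivates the multi-level variant and the Hermitian inverse eigenvalue formulation.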

HCERES activity report

Team activity has been presented in the context of the LTCI lab evaluation by HCERES.

LTCI research day

We contribute to the lab research day with the poster of our recent MMSP paper:

PDF file

Research group on the applications of machine learning to compression

We started a new study group on the applications of machine learning to compression. Check it out here:

Machine learning compression

Simulation of conversational groups of virtual agents expressing interpersonal attitudes

The system presented in the following video is aimed at simulating conversational groups of virtual agents. Each agent is endowed with a computational model that allows it to modulate its nonverbal behaviors (body orientation, distance to other agents, gaze, facial expression and gestures) and to adapt its turn-taking strategies in real time, depending on the attitudes it expresses towards the others. This allows for the generation of various conversational styles supporting the various roles the agents may have in the group (e.g., leader or friend).


A new acted emotional body behavior database

We collected a new corpus for the analysis and the synthesis of emotional body movements. The movements of 11 actors (6 female and 5 male) were captured while they expressed 8 emotional states (Joy, Anger, Panic Fear, Anxiety, Sadness, Shame, Pride and Neutral), described with 3 scenarios and performed through 7 different movement tasks. Each movement task was performed several times to capture a wide range of data.


Fourati, N., Pelachaud, C., A new acted emotional body behavior database. IVA 2013 Workshop on Multimodal Corpora (MMC2013): Beyond Audio and Video, 2013

Expressive rendering and animation

Facial expression is a rather complex system resulting from muscle contractions. Wrinkles often appear on the face as the result of certain facial expressions, and they are definitely part of the expression. FACS describes, for each AU, not only the changes due to muscle contraction but also wrinkle appearance. As facial actions can be extremely subtle, in this work we aim to measure, at a perceptual level, how wrinkles can enhance the recognition of facial actions.


Radoslaw Niewiadomski, Jing Huang, and Catherine Pelachaud. Effect of facial cues on identification. In The 25th Annual Conference on Computer Animation and Social Agents (CASA 2012), page 4, Singapore, 2012


We detail how to animate a human character through its skeleton by controlling a very small number of articulations, using a novel Inverse Kinematics method to optimize the global skeleton structure. We also show how to combine these standard deformation pipelines with our novel expressive animation solution, which improves on state-of-the-art methods. We demonstrate the benefits of this technique in the context of human-machine interaction systems, where realism is key to the comfort of the interaction between a human and an embodied conversational agent.
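For context, Inverse Kinematics solves for joint angles that bring an end effector to a target position. A minimal sketch of a standard baseline solver, Cyclic Coordinate Descent on a planar chain (a textbook method, not the energy-transfer solver of the paper below), could look like:

```python
import numpy as np

def forward(lengths, angles):
    """Forward kinematics: absolute positions of each joint of a 2D chain
    rooted at the origin, given bone lengths and relative joint angles."""
    pts = [np.zeros(2)]
    theta = 0.0
    for length, a in zip(lengths, angles):
        theta += a
        pts.append(pts[-1] + length * np.array([np.cos(theta), np.sin(theta)]))
    return np.array(pts)

def ccd_ik(lengths, target, iters=100):
    """Cyclic Coordinate Descent: sweep joints from tip to root, rotating
    each so the end effector swings toward the target."""
    angles = np.zeros(len(lengths))
    target = np.asarray(target, dtype=float)
    for _ in range(iters):
        for j in reversed(range(len(lengths))):
            pts = forward(lengths, angles)
            d_end = pts[-1] - pts[j]   # pivot -> current end effector
            d_tgt = target - pts[j]    # pivot -> target
            angles[j] += (np.arctan2(d_tgt[1], d_tgt[0])
                          - np.arctan2(d_end[1], d_end[0]))
    return angles

# Two unit-length bones reaching for a point inside the workspace.
angles = ccd_ik([1.0, 1.0], [1.0, 1.0])
print(forward([1.0, 1.0], angles)[-1])  # end effector ≈ (1.0, 1.0)
```

Controlling only a few articulations and letting the solver fill in the rest of the structure, as in the paper, builds on this kind of per-joint optimization.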


Jing Huang and Catherine Pelachaud. Expressive body animation pipeline for virtual agent. In Yukiko Nakano, Michael Neff, Ana Paiva, and Marilyn Walker, editors, Intelligent Virtual Agents, 12th International Conference on Intelligent Virtual Agents, volume 7502 of Lecture Notes in Computer Science, pages 355–362. Springer Berlin Heidelberg, 2012

Jing Huang and Catherine Pelachaud. An efficient energy transfer inverse kinematics solution. In Proceedings of Motion In Game 2012, volume 7660, pages 278–289, Berlin, Heidelberg, 2012. LNCS


Etoile Plugin System: a dynamically linked plug-in system. It uses a graph-node flow to control the different modules in the system. Rendering, physics and several multimedia components have been integrated.


Modeling Multimodal Behaviors From Speech Prosody

To synthesize the head and eyebrow movements of virtual characters, we have developed a Fully Parameterized Hidden Markov Model (FPHMM), an extension of the Contextual HMM (CHMM). The FPHMM takes speech features as contextual variables and produces motion feature observations. During the training phase, both the motion and speech streams are used to learn the FPHMM. During the motion synthesis phase (i.e., the animation generation phase), only the speech stream is known.
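The core idea, that emissions depend on a contextual (speech) variable, can be sketched with a toy chain whose per-state emission mean is an affine function of the speech feature. All parameters below are random placeholders for illustration, not the trained FPHMM:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy contextual-emission chain: state k emits a motion feature whose
# mean is W[k] @ speech + b[k] (illustrative sizes and random weights).
n_states, d_speech, d_motion = 3, 4, 2
W = rng.normal(size=(n_states, d_motion, d_speech))  # per-state mapping
b = rng.normal(size=(n_states, d_motion))            # per-state offset
A = np.full((n_states, n_states), 1.0 / n_states)    # transition matrix

def synthesize(speech_seq):
    """Synthesis phase: only the speech stream is known. Walk the chain
    and emit the context-dependent motion mean at each frame."""
    state = 0
    motion = []
    for s in speech_seq:
        motion.append(W[state] @ s + b[state])
        state = rng.choice(n_states, p=A[state])
    return np.array(motion)

speech = rng.normal(size=(10, d_speech))
print(synthesize(speech).shape)  # one motion vector per speech frame
```

In the actual FPHMM the state sequence and emission distributions are learned jointly from paired motion and speech streams; this sketch only illustrates how context steers the output.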



Y. Ding, C. Pelachaud and T. Artières, Modeling Multimodal Behaviors from Speech Prosody, Intelligent Virtual Agents, August 2013, vol. 8108, pp. 217-228.

Y. Ding, M. Radenen, T. Artières and C. Pelachaud, Speech-driven eyebrow motion synthesis with contextual markovian models, ICASSP, Canada, May 2013, pp. 3756-3760.

Demo of Nao with Expressive Gesture

Within the French ANR project GV-LeX, we have developed an expressive communicative gesture model for the humanoid robot Nao. The project aims at equipping a humanoid robot with the capacity to produce gestures while telling a story to children. To reach this objective, we have extended our existing virtual agent platform GRETA and adapted it to the robot. Gestural prototypes are described symbolically and stored in a gestural database called a lexicon. Given a set of intentions and emotional states to communicate, the system selects the corresponding gestures from the robot lexicon. The selected gestures are then planned to synchronize with speech and instantiated as robot joint values, taking into account gestural expressivity parameters such as temporal extension, spatial extension and repetition.


Q. A. Le, C. Pelachaud, Evaluating an Expressive Gesture Model for a Humanoid Robot: Experimental Results, submitted to the 8th ACM/IEEE International Conference on Human Robot Interaction (HRI 2013), 3-6 March, 2013, Tokyo, Japan

Q. A. Le, J.-F. Huang, C. Pelachaud, A Common Gesture and Speech Production Framework for Virtual and Physical Agents, accepted to the 14th ACM International Conference on Multimodal Interaction, Workshop on Speech and Gesture Production in Virtually and Physically Embodied Conversational Agents, 2012, Santa Monica, CA, USA.

Le, Q.A., Hanoune, S. and Pelachaud, C. Design and implementation of an expressive gesture model for a humanoid robot. 11th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2011), Bled, Slovenia on October 26th to 28th, 2011.

Le, Q.A. and Pelachaud, C. Expressive Gesture Model for Humanoid Robot. International Conference of the HUMAINE Association on Affective Computing and Intelligent Interaction (ACII2011), Memphis, Tennessee, USA on October 9th to 12th, 2011.

Le, Q.A. and Pelachaud, C. Generating co-speech gestures for the humanoid robot NAO through BML. Lecture Notes in Computer Science (LNCS) 2011: The 9th International Gesture Workshop. Gesture in Embodied Communication and Human-Computer Interaction, May 25-27, 2011, Athens, Greece.