Modeling Multimodal Behaviors From Speech Prosody

To synthesize the head and eyebrow movements of virtual characters, we have developed a Fully Parameterized Hidden Markov Model (FPHMM), an extension of the Contextual HMM (CHMM). The FPHMM takes speech features as contextual variables and produces motion features as observations. During the training phase, both the motion and speech streams are used to learn the model; during the motion synthesis phase (i.e. the animation generation phase), only the speech stream is available.
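
The full FPHMM described in the papers parameterizes the HMM by the contextual variables and is trained by EM on the joint speech and motion streams; as a rough illustration of the synthesis phase only, the sketch below assumes an already-trained, simplified contextual model in which each hidden state carries a Gaussian over speech features and a motion mean that depends affinely on the speech context. All names, dimensions, and parameter values here are hypothetical stand-ins, not the model reported in the papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy dimensions (hypothetical; the actual setup differs) ---
K = 3    # number of hidden states
Ds = 4   # speech (prosody) feature dimension, used as contextual variables
Dm = 2   # motion feature dimension (e.g. head pitch, eyebrow raise)

# --- Parameters that the training phase would normally estimate by EM ---
pi = np.full(K, 1.0 / K)                     # initial state distribution
A = np.full((K, K), 0.1) + 0.7 * np.eye(K)   # sticky transition matrix
A /= A.sum(axis=1, keepdims=True)
mu_s = rng.normal(size=(K, Ds))              # per-state speech emission means
var_s = np.ones((K, Ds))                     # diagonal speech variances
W = rng.normal(scale=0.3, size=(K, Dm, Ds))  # contextual weights: the motion
b = rng.normal(size=(K, Dm))                 # mean is affine in the context

def log_speech_lik(speech):
    """Per-state diagonal-Gaussian log-likelihood of each speech frame."""
    T = speech.shape[0]
    ll = np.empty((T, K))
    for k in range(K):
        d = speech - mu_s[k]
        ll[:, k] = -0.5 * np.sum(d * d / var_s[k]
                                 + np.log(2 * np.pi * var_s[k]), axis=1)
    return ll

def viterbi(log_lik):
    """Most likely state sequence given per-frame log-likelihoods."""
    T = log_lik.shape[0]
    delta = np.log(pi) + log_lik[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)   # (from-state, to-state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_lik[t]
    states = np.empty(T, dtype=int)
    states[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):
        states[t - 1] = back[t, states[t]]
    return states

def synthesize_motion(speech):
    """Synthesis phase: only the speech stream is known. Decode the state
    sequence from speech, then emit each state's context-dependent motion mean."""
    states = viterbi(log_speech_lik(speech))
    return np.stack([W[k] @ c + b[k] for k, c in zip(states, speech)])

speech = rng.normal(size=(50, Ds))   # stand-in prosody features (F0, energy, ...)
motion = synthesize_motion(speech)   # (50, Dm) head/eyebrow trajectory
print(motion.shape)
```

Decoding the state sequence from the speech stream alone mirrors the synthesis setting in which motion is unknown, while the affine maps `W[k]`, `b[k]` stand in for the context-dependent emission parameters of the CHMM/FPHMM.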



Y. Ding, C. Pelachaud and T. Artières, Modeling Multimodal Behaviors from Speech Prosody, in Proceedings of Intelligent Virtual Agents (IVA), LNCS vol. 8108, pp. 217-228, August 2013.

Y. Ding, M. Radenen, T. Artières and C. Pelachaud, Speech-Driven Eyebrow Motion Synthesis with Contextual Markovian Models, in Proceedings of ICASSP, Vancouver, Canada, pp. 3756-3760, May 2013.
