Sep 2: Daniel Povey: Applications of Weighted Finite State Transducers in a Speech Recognition Toolkit

CENTER FOR LANGUAGE AND SPEECH PROCESSING

Fall 2011 Seminar Series

Daniel Povey, “Applications of Weighted Finite State Transducers in a Speech Recognition Toolkit”

Microsoft

**FRIDAY**, September 2, 2011, 12:00 p.m.

Hackerman B17

Abstract:
The open-source speech recognition toolkit “Kaldi” uses weighted finite state transducers (WFSTs) for training and decoding, and uses the OpenFst toolkit as a C++ library. I will give an informal overview of WFSTs and of the standard AT&T recipe for WFST-based decoding, and will mention some problems (in my opinion) with the basic recipe and how we addressed them while developing Kaldi. I will also describe how to use WFSTs to achieve “exact” lattice generation, in a sense that will be explained. This is an interesting application of WFSTs because, unlike most WFST mechanisms, it does not have any obvious non-WFST analog.
Biography:
Daniel Povey received his Bachelor’s (Natural Sciences, 1997), Master’s (Computer Speech and Language Processing, 1998) and PhD (Engineering, 2003) from Cambridge University. He is currently a researcher at Microsoft Research, Redmond, Washington, USA. From 2003 to 2008 he worked as a researcher at IBM Research in Yorktown Heights, NY. He is best known for his work on discriminative training for HMM-GMM-based speech recognition (i.e. MMI, MPE, and their feature-space variants).
