Seminar on Current Automatic Speech Recognition Methodology from a Critical Viewpoint

  • Posted on: 16 May 2014
  • By: hadmin

Department of Systems Engineering and Engineering Management
CUHK MoE-Microsoft Key Laboratory of Human-Centric Computing and Interface Technologies


Title: Current Automatic Speech Recognition (ASR) Methodology from a Critical Viewpoint
Speaker: Professor Douglas O'Shaughnessy
                  Professor, Institut National de la Recherche Scientifique
                  Adjunct Professor, McGill University
Date: 12 April 2012
Time: 4:30 - 5:30 pm
Venue: ERB513

How is ASR done currently? What are its techniques and why were they chosen? What is good and not so good about the methodology choices that have been made over ASR's recent history? How do we handle variability in both timing and spectra, while exploiting aspects of language modeling? How might we do ASR better from a scientific viewpoint of its objectives? What are we trying to accomplish? How can we better exploit what human speakers do in natural speech production and what listeners do in audition, while remaining efficient in algorithmic cost? What changes in methods might we see in the future?

DOUGLAS O'SHAUGHNESSY (MIT, Ph.D., 1976) has been a professor at INRS and an adjunct professor at McGill University since 1977. He is a Fellow of the Acoustical Society of America (1992) and of IEEE (2006). He has served as Associate Editor for the IEEE Transactions on Speech and Audio Processing and for the Journal of the Acoustical Society of America (JASA). He is the founding Editor-in-Chief of the EURASIP Journal on Audio, Speech, and Music Processing. He was recently elected Vice-President of the International Speech Communication Association (ISCA). He is the Vice-Chair of the Speech and Language Technical Committee of the IEEE Signal Processing Society.
He has presented tutorials on speech recognition at ICASSP-96, ICASSP-2001, ICC-2003, and ICASSP-09. He is the author of the textbook Speech Communications: Human and Machine (1986, Addison-Wesley; revised 2000, IEEE Press). In 2003, with Li Deng, he coauthored the book Speech Processing: A Dynamic and Optimization-Oriented Approach (Marcel Dekker).

