Seminar on Advanced SMT: the Portage Arabic > English and Chinese > English systems at NIST 2012
Title: Advanced SMT: the Portage Arabic > English and Chinese > English systems at NIST 2012
Speaker: Dr. Roland Kuhn
Principal Research Officer
National Research Council Canada
Date: 10 December 2013 (Tuesday)
Time: 2:00pm – 3:00pm
Venue: Rm 513, William M. W. Mong Engineering Building, CUHK
Abstract:
This talk will describe the extremely successful participation of NRC’s Portage system in the NIST 2012 Arabic > English and Chinese > English MT evaluation. The talk will seek to convey the excitement of developing a system that will compete with other world-class systems in a major evaluation. It will focus on four techniques mainly responsible for Portage’s success:
- Batch lattice MIRA
- Discriminative hierarchical reordering
- Multiple phrase pair extraction
- Domain adaptation with linear mixtures.
Finally, the talk will briefly discuss post-2012 research in SMT by the NRC group.
About the speaker:
After studying mathematical biology at the University of Toronto and the University of Chicago (where he explored computer simulation as a tool for studying the evolution of DNA), Roland developed an interest in natural language. In 1993, he received his Ph.D. in Computer Science from McGill University, with a thesis on applying decision trees to the understanding of spoken phrases.
In the course of his research career, Roland has studied a diverse set of problems in natural language processing, including automatic speech recognition, machine dialogue, speaker verification/identification, speech understanding, letter-to-sound systems, phoneme-based topic spotting, and most recently, machine translation. He has contributed new ideas to several of these areas, including the cache language model for speech recognition and eigenvoices for speaker adaptation and speaker verification/identification.
After working at the Centre de recherche informatique de Montréal (CRIM) as both a researcher and a senior researcher between 1992 and 1996, Roland held research and development positions with the Panasonic Speech Technology Laboratory in Santa Barbara, California (October 1996 to June 2004). He joined the National Research Council of Canada (NRC) in 2004. A citizen of Canada and Germany, Roland holds 30 US patents. He was a member of the IEEE Speech Technical Committee from 2002 to 2004, and he is a frequent reviewer and occasional editor of journal and conference articles in the areas of machine translation and speech recognition.