Publications
Special Section on New Frontiers in Rich Transcription.
(Friedland, G., Fiscus, J., Hain, T., & Furui, S., Eds.). IEEE Transactions on Audio, Speech, and Language Processing. 20,
(2012). Speaking in Shorthand - A Syllable-Centric Perspective for Understanding Pronunciation Variation.
Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition. 47-56.
(1998). On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings.
Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006). 2014-2017.
(2006).
(2008).
(2005). Speaker Recognition with Session Variability Normalization Based on MLLR Adaptation Transforms.
IEEE Transactions on Audio, Speech, and Language Processing. 15(7), 1987-1998.
(2007). Speaker Recognition Via Nonlinear Discriminant Features.
Proceedings of the International Speech Communication Association Tutorial and Research Workshop on Non-Linear Speech Processing (NOLISP 2007). 27-30.
(2007).
(2009). Speaker Recognition Using Prosodic and Lexical Features.
Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2003). 19-24.
(2003). Speaker Recognition and Diarization.
115-130.
(2010). Speaker Overlaps and ASR Errors in Meetings: Effects Before, During, and After the Overlap.
Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006). 357-360.
(2006). Speaker Diarization for Multiple-Distant-Microphone Meetings Using Several Sources of Information.
IEEE Transactions on Computers. 56(9), 1212-1224.
(2007). Speaker Diarization for Multiple Distant Microphone Meetings: Mixing Acoustic Features And Inter-Channel Time Differences.
Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006). 2194-2197.
(2006). Speaker Diarization for Multi-Party Meetings Using Acoustic Fusion.
Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2005). 426-461.
(2005). Speaker Diarization for Multi-Microphone Meetings Using Only Between-Channel Differences.
Proceedings of the Third Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). 257-264.
(2006). Speaker Diarization: A Review of Recent Research.
IEEE Transactions on Audio, Speech, and Language Processing. 20(2), 356-370.
(2012).
(2011).
(2012). Speaker Detection Without Models.
Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005). 757-760.
(2005). Speaker Adaptation of Language Models for Automatic Dialog Act Segmentation of Meetings.
Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007). 1621-1624.
(2007). Speaker Adaptation of Language and Prosodic Models for Automatic Dialog Act Segmentation of Speech.
Speech Communication. 52(3), 236-245.
(2010).
(1999). Spatial Transformer for 3D Point Clouds.
IEEE Transactions on Pattern Analysis and Machine Intelligence.
(2021).
(2009). Spatial Semantic Regularisation for Large Scale Object Detection.
The IEEE International Conference on Computer Vision (ICCV).
(2015).