UCLA Speech Processing and Auditory Perception Laboratory

 

Publications & Dissertations

 

COPYRIGHT NOTICE
Copyright and all rights therein for the documents available in this webpage are maintained by the authors or by other copyright holders. The documents made available here are purely meant for ensuring timely dissemination of scholarly and technical work on a non-commercial basis. It is understood that all persons accessing, storing or copying the information in any of these documents will adhere to the terms and constraints invoked by each copyright holder. These works may not be reposted without the explicit permission of the copyright holder.

2024

Natarajan Balaji Shankar, Amber Afshan, Alexander Johnson, Aurosweta Mahapatra, Alejandra Martin, Haolun Ni, Hae Won Park, Marlen Quintero Perez, Gary Yeung, Alison Bailey, Cynthia Breazeal and Abeer Alwan. "The JIBO Kids Corpus: A speech dataset of child-robot interactions in a classroom environment", JASA Express Lett. 1 November 2024; 4 (11): 115201, https://doi.org/10.1121/10.0034195

Jinhan Wang, Vijay Ravi, Jonathan Flint and Abeer Alwan. "Speechformer-CTC: Sequential modeling of depression detection with speech temporal classification", Speech Communication (2024): 103106, https://doi.org/10.1016/j.specom.2024.103106

Ruchao Fan, Natarajan Balaji Shankar, and Abeer Alwan. "Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models", Proc. Interspeech 2024, 5173-5177, https://doi.org/10.21437/Interspeech.2024-1353

Hariram Veeramani, Surendrabikram Thapa, Natarajan Balaji Shankar, and Abeer Alwan. "Large Language Model-based Pipeline for Item Difficulty and Response Time Estimation for Educational Assessments". NAACL 2024 Workshops - Proceedings of the Nineteenth Workshop on Innovative Use of NLP for Building Educational Applications (BEA), 561–566.

Alexander Johnson, Natarajan Balaji Shankar, Mari Ostendorf, and Abeer Alwan, "An Exploratory Study on Dialect Density Estimation for Children and Adult's African American English", Journal of the Acoustical Society of America, 2024, 155 (4), pp. 2836-2848, https://doi.org/10.1121/10.0025771

Natarajan Balaji Shankar, Ruchao Fan and Abeer Alwan, "SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR" 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), Seoul, Korea, Republic of, 2024, pp. 560-564, doi: 10.1109/ICASSPW62465.2024.10625884.

Alexander Johnson, Christina Chance, Kaycee Stiemke, Hariram Veeramani, Natarajan Balaji Shankar, and Abeer Alwan, "An Analysis of Large Language Models for African American English Speaking Children's Oral Language Assessment," Journal of Black Excellence in Engineering, Science, and Technology, Vol 1, 2023

Ruchao Fan, Natarajan Balaji Shankar, and Abeer Alwan, "UniEnc-CASSNAT: An Encoder Only Non-autoregressive ASR with Self-supervised Pretrained Speech Models," IEEE Signal Processing Letters, vol. 31, pp. 711-715, 2024, doi: 10.1109/LSP.2024.3365036.

Vijay Ravi, Jinhan Wang, Jonathan Flint and Abeer Alwan, "Enhancing Accuracy and Privacy in Speech-Based Depression Detection through Speaker Disentanglement," Computer Speech & Language, Volume 86, 2024, 101605, ISSN 0885-2308, https://doi.org/10.1016/j.csl.2023.101605.

Vijay Ravi, Jinhan Wang, Jonathan Flint and Abeer Alwan, "A Privacy-Preserving Unsupervised Speaker Disentanglement Method for Depression Detection from Speech," Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, Vancouver, BC, Canada

Natarajan Balaji Shankar, Alexander Johnson, Christina Chance, Hariram Veeramani and Abeer Alwan, "CORAAL QA: A Dataset and Framework for Open Domain Spontaneous Speech Question Answering from Long Audio Files," ICASSP 2024, pp. 13371-13375, doi: 10.1109/ICASSP48485.2024.10447109.


2023

Hariram Veeramani, Natarajan Balaji Shankar, Alexander Johnson and Abeer Alwan, "Towards Automatically Assessing Children's Oral Picture Description Tasks," Proc. 9th Workshop on Speech and Language Technology in Education (SLaTE), 119–120.

Alexander Johnson, Hariram Veeramani, Natarajan Balaji Shankar, and Abeer Alwan, "An Equitable Framework for Automatically Assessing Children's Oral Narrative Language Abilities," Proc. INTERSPEECH 2023, 4608-4612, doi: 10.21437/Interspeech.2023-1257

Vishwas M. Shetty, Steven M. Lulich, Pertti Palo, Abeer Alwan, "Developmental Articulatory and Acoustic Features for Six to Ten Year Old Children," Proc. INTERSPEECH 2023, 4598-4602, doi: 10.21437/Interspeech.2023-2236

Eray Eren, Lee Ngee Tan and Abeer Alwan, "FusedF0: Improving DNN-based F0 Estimation by Fusion of Summary-Correlograms and Raw Waveform Representations of Speech Signals," Proc. INTERSPEECH 2023, 4523-4527, doi: 10.21437/Interspeech.2023-2229

Jinhan Wang, Vijay Ravi, and Abeer Alwan, "Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals," Proc. INTERSPEECH 2023, 2343-2347, doi: 10.21437/Interspeech.2023-2101

A. Johnson, V. Shetty, M. Ostendorf, and A. Alwan, "Leveraging Multiple Sources in Automatic African American English Dialect Detection for Adults and Children," in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10096614

A. Johnson, J. Washington, R. Morris, M. Ostendorf, A. Bailey, and A. Alwan, "Towards Effective Speech-based AI in the Classroom: The Case of AAE-Speaking Children," in Black in AI Workshop at NeurIPS 2023

Ruchao Fan, Wei Chu, Peng Chang, and Abeer Alwan, "A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1436-1448, 2023, doi: 10.1109/TASLP.2023.3263789.


2022

Vishwas Shetty, Steven M. Lulich, Pertti Palo, Abeer Alwan, "Development of vowel acoustics and subglottal resonances in American English-speaking children: A longitudinal Study," The Journal of the Acoustical Society of America 152, A286 (2022)

R. Fan, Y. Zhu, J. Wang, and A. Alwan, "Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR," in IEEE Journal of Selected Topics in Signal Processing, vol. 16, no.6, pp. 1242-1252, Oct. 2022, doi: 10.1109/JSTSP.2022.3200910

Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan, "A Step Towards Preserving Speakers' Identity While Detecting Depression Via Speaker Disentanglement," in Interspeech 2022, 3338-3342, doi: 10.21437/Interspeech.2022-10798

Jinhan Wang, Vijay Ravi, Jonathan Flint, Abeer Alwan, "Unsupervised Instance Discriminative Learning for Depression Detection from Speech Signals," in Interspeech 2022, 2018-2022, doi: 10.21437/Interspeech.2022-10814

A. Johnson, K. Everson, V. Ravi, A. Gladney, M. Ostendorf, and A. Alwan, "Automatic Dialect Density Estimation for African American English," in Interspeech 2022, 1283-1287, doi: 10.21437/Interspeech.2022-796

Amber Afshan, Abeer Alwan, "Learning from Human Perception to Improve Automatic Speaker Verification in Style-mismatched Conditions," in Interspeech 2022, 2338-2342, doi: 10.21437/Interspeech.2022-883

Amber Afshan, Abeer Alwan, "Attention-based Conditioning Methods using Variable Frame Rate for Style-robust Speaker Verification," in Interspeech 2022, 2333-2337, doi: 10.21437/Interspeech.2022-882

Ruchao Fan, Abeer Alwan, "DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR," in Interspeech 2022, 4900-4904, doi: 10.21437/Interspeech.2022-11128

A. Johnson, A. Martin, M. Quintero, A. Bailey, and A. Alwan, "Can Social Robots Effectively Elicit Curiosity in STEM Topics from K-1 Students During Oral Assessments?" 2022 IEEE Global Engineering Education Conference (EDUCON), 2022, pp. 1264-1268, doi: 10.1109/EDUCON52537.2022.9766662.

Matthew Marge, Carol Espy-Wilson, Nigel G. Ward, Abeer Alwan, Yoav Artzi, Mohit Bansal, Gil Blankenship, Joyce Chai, Hal Daumé, Debadeepta Dey, Mary Harper, Thomas Howard, Casey Kennington, Ivana Kruijff-Korbayová, Dinesh Manocha, Cynthia Matuszek, Ross Mead, Raymond Mooney, Roger K. Moore, Mari Ostendorf, Heather Pon-Barry, Alexander I. Rudnicky, Matthias Scheutz, Robert St. Amant, Tong Sun, Stefanie Tellex, David Traum, Zhou Yu, "Spoken language interaction with robots: Recommendations for future research," Computer Speech & Language (71), pp. 101255, 2022, doi: 10.1016/j.csl.2021.101255.

A. Johnson, R. Fan, R. Morris, and A. Alwan, "LPC AUGMENT: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children’s Dialects," in ICASSP 2022, pp. 8577-8581, doi: 10.1109/ICASSP43922.2022.9746281

Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan, "FrAUG: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals," in ICASSP 2022, pp. 6267-6271, doi: 10.1109/ICASSP43922.2022.9746307

Yunzheng Zhu, Ruchao Fan, and Abeer Alwan, "Towards Better Meta-Initialization with Task Augmentation for Kindergarten-aged Speech Recognition," in ICASSP 2022, pp. 8582-8586, doi: 10.1109/ICASSP43922.2022.9747599

Amber Afshan, Jody Kreiman, and Abeer Alwan, "Speaker discrimination performance for “easy” versus “hard” voices in style-matched and -mismatched speech," The Journal of the Acoustical Society of America, 151(2):1393–1403, 2022, doi: https://doi.org/10.1121/10.0009585

Steven M. Lulich, Abeer Alwan, Mitchell S. Sommers, Gary Yeung, "The Child Subglottal Resonances Database," LDC Catalog No.: LDC2022S02, ISBN: 1-58563-985-0, DOI: https://doi.org/10.35111/75r1-yj93.


2021

Gary Yeung, Ruchao Fan, and Abeer Alwan, "Fundamental frequency feature warping for frequency normalization and data augmentation in child automatic speech recognition," Speech Communication (2021), doi: https://doi.org/10.1016/j.specom.2021.08.002.

Jinhan Wang, Yunzheng Zhu, Ruchao Fan, Wei Chu, and Abeer Alwan, "Low Resource German ASR with Untranscribed Data Spoken by Non-native Children – INTERSPEECH 2021 Shared Task SPAPL System," Proc. of Interspeech 2021, pp. 1279-1283, doi: 10.21437/Interspeech.2021-1974.

Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, and Abeer Alwan, "An Improved Single Step Non-autoregressive Transformer for Automatic Speech Recognition," Proc. Interspeech 2021, pp. 3715-3719, doi: 10.21437/Interspeech.2021-1955.

Ruchao Fan, Amber Afshan, and Abeer Alwan, "BI-APC: Bidirectional autoregressive predictive coding for unsupervised pre-training and its application to children’s ASR," ICASSP, 2021, pp. 7023-7027, DOI: 10.1109/ICASSP39728.2021.9414970.

Gary Yeung, Ruchao Fan, and Abeer Alwan, "Fundamental frequency feature normalization and data augmentation for child speech recognition," ICASSP, 2021, pp. 6993-6997, DOI: 10.1109/ICASSP39728.2021.9413801.


2020

Trang Tran, Morgan Tinkler, Gary Yeung, Abeer Alwan, and Mari Ostendorf, "Analysis of Disfluency in Children's Speech", Proc. Interspeech 2020, pp. 4278-4282, DOI: 10.21437/Interspeech.2020-3037.

Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, and Abeer Alwan, "Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification", Proc. Interspeech 2020, pp. 766-770, DOI: 10.21437/Interspeech.2020-2957.

Amber Afshan, Jody Kreiman, and Abeer Alwan, "Speaker discrimination in humans and machines: Effects of speaking style variability", Proc. Interspeech 2020, pp. 3136-3140, DOI: 10.21437/Interspeech.2020-3004.

Amber Afshan, Jinxi Guo, Soo Jin Park, Vijay Ravi, Alan McCree, and Abeer Alwan, "Variable frame rate-based data augmentation to handle speaking-style variability for automatic speaker verification", Proc. Interspeech 2020, pp. 4318-4322, DOI: 10.21437/Interspeech.2020-3006.

Joanna J. Parga, Sharon Lewin, Juanita Lewis, Diana Montoya-Williams, Abeer Alwan, Brianna Shaul, Carol Han, Susan Y. Bookheimer, Sherry Eyer, Mirella Dapretto, Lonnie Zeltzer, Lauren Dunlap, Usha Nookala, Daniel Sun, Bianca H. Dang, Ariana E. Anderson, "Defining and Distinguishing Infant Behavioral States Using Acoustic Cry Analysis: Is Colic Painful?", Pediatric Research volume 87, pages 576–580, 2020


2019

Patricia Keating, Jody Kreiman, and Abeer Alwan, "A new speech database for within- and between-speaker variability", In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.) Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019 (pp. 736-739). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

Gary Yeung, Alison L. Bailey, Amber Afshan, Morgan Tinkler, Marlen Q. Pérez, Alejandra Martin, Anahit A. Pogossian, Samuel Spaulding, Hae Won Park, Manushaqe Muco, Abeer Alwan and Cynthia Breazeal, "A robotic interface for the administration of language, literacy, and speech pathology assessments for children", SLATE, 2019, pp. 41-42.

Gary Yeung, and Abeer Alwan "A Frequency Normalization Technique for Kindergarten Speech Recognition Inspired by the Role of F0 in Vowel Perception", Interspeech, 2019, pp. 6-10, DOI: 10.21437/Interspeech.2019-1847

Vijay Ravi, Soo Jin Park, Amber Afshan, and Abeer Alwan "Voice Quality and Between-Frame Entropy for Sleepiness Estimation", Interspeech, 2019, pp. 2408-2412, DOI: 10.21437/Interspeech.2019-2988

Gary Yeung, Alison L. Bailey, Amber Afshan, Marlen Q. Pérez, Alejandra Martin, Samuel Spaulding, Hae Won Park, Abeer Alwan and Cynthia Breazeal "Towards the Development of Personalized Learning Companion Robots for Early Speech and Language Assessment", AERA, 2019, DOI: 10.302/1431402

Soo Jin Park, Amber Afshan, Jody Kreiman, Gary Yeung and Abeer Alwan "Target and Non-target Speaker Discrimination by Humans and Machines", ICASSP, 2019, pp. 6326-6330, DOI: 10.1109/ICASSP.2019.8683362


2018

Jinxi Guo, Ning Xu, Kailun Qian, Yang Shi, Kaiyuan Xu, Yingnian Wu, and Abeer Alwan, "Deep neural network based i-vector mapping for speaker verification using short utterances", Speech Communication, vol. 105, 92-102, 2018, https://doi.org/10.1016/j.specom.2018.10.004

Gary Yeung, Steven M. Lulich, Jinxi Guo, Mitchell S. Sommers, and Abeer Alwan "Subglottal resonances of American English speaking children", The Journal of the Acoustical Society of America 144 (6), 3437-3449, 2018, https://doi.org/10.1121/1.5082289

Soo Jin Park, Gary Yeung, Neda Vesselinova, Jody Kreiman, Patricia Keating, and Abeer Alwan "Understanding Speaker Discrimination Abilities in Humans and Machines for Text-Independent Short Utterances of Different Speech Styles", The Journal of the Acoustical Society of America, 144(1), 375-386, 2018, https://doi.org/10.1121/1.5045323

Amber Afshan, Jinxi Guo, Soo Jin Park, Vijay Ravi, Jonathan Flint, and Abeer Alwan "Effectiveness of Voice Quality in Detecting Depression", in Proc. Interspeech 2018, pp. 1676-1680, DOI: 10.21437/Interspeech.2018-1399

Gary Yeung and Abeer Alwan, "On the Difficulties of Automatic Speech Recognition for Kindergarten-Aged Children", in Proc. Interspeech 2018, pp. 1661-1665, DOI: 10.21437/Interspeech.2018-2297

Jinxi Guo, Ning Xu, Xin Chen, Yang Shi, Kaiyuan Xu, and Abeer Alwan "Filter Sampling and Combination CNN (FSC-CNN): a Compact CNN Model for Small-footprint ASR Acoustic Modeling Using Raw Waveforms", in Proc. Interspeech 2018, pp. 3713-3717, DOI: 10.21437/Interspeech.2018-1370

Soo Jin Park, Amber Afshan, Zhi Ming Chua, and Abeer Alwan "Using Voice Quality Supervectors for Affect Identification", in Proc. Interspeech 2018, pp. 157-161, DOI: 10.21437/Interspeech.2018-1401



2017

V. Mitra, H. Franco, R. Stern, J. van Hout, L. Ferrer, M. Graciarena, W. Wang, D. Vergyri, A. Alwan, and J. Hansen, "Robust Features in Deep-Learning-Based Speech Recognition," a chapter in the book New Era for Robust Speech Recognition: Exploiting Deep Learning, edited by Watanabe, Delcroix, Metze, and Hershey. Springer, 2017

Gary Yeung, Amber Afshan, Kaan Ege Ozgun, Kantapon Kaewtip, Steven M. Lulich, and Abeer Alwan, "Predicting Clinical Evaluations of Children’s Speech with Limited Data Using Exemplar Word Template References" in Proc. SLATE 2017, pp. 161-166

Jinxi Guo, Ning Xu, Li-Jia Li and Abeer Alwan, "Attention based CLDNNs for short-duration acoustic scene classification" in Proc. Interspeech 2017, pp. 469-473

Jinxi Guo, Usha Nookala and Abeer Alwan, "CNN-based joint mapping of short and long utterance i-vectors for speaker verification using short utterances" Interspeech 2017, pp. 3712-3716

Soo Jin Park, Gary Yeung, Jody Kreiman, Patricia Keating, and Abeer Alwan, “Using Voice Quality Features to Improve Short-Utterance Text-Independent Speaker Verification,” in Proc. Interspeech 2017, pp. 1522-1526

Jinxi Guo, Ruochen Yang, Harish Arsikere, and Abeer Alwan, "Robust speaker identification via fusion of subglottal resonances and cepstral features", The Journal of the Acoustical Society of America, 141(4), EL420, April 2017



2016

Kantapon Kaewtip, Abeer Alwan, Colm O'Reilly, and Charles E. Taylor, "A robust automatic birdsong phrase classification: A template-based approach," The Journal of the Acoustical Society of America, 140(5), 3691-3701

Soo Jin Park, Caroline Sigouin, Jody Kreiman, Patricia Keating, Jinxi Guo, Gary Yeung, Fang-Yu Kuo, and Abeer Alwan, "Speaker Identity and Voice Quality: Modeling Human Responses and Automatic Speaker Recognition," Interspeech 2016, pp. 1044–1048

Mitra, V., VanHout, J., Wang, W., Bartels, C., Franco, H., Vergyri, D., ... & Sangwan, A. Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech. (Interspeech 2016), pp. 3683-3687.

Kaewtip, K., Taylor, C., & Alwan, A. (2016). Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification. (Interspeech 2016), pp 2587-2591.

Guo, J., Yeung, G., Muralidharan, D., Arsikere, H., Afshan, A., & Alwan, A. Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features (Interspeech 2016), pp 2219-2222


2015

Tan, L. N., Alwan, A., Kossan, G., Cody, M. L., & Taylor, C. E. (2015). Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data. The Journal of the Acoustical Society of America, 137(3), 1069-1080.

Jinxi Guo, Rohit Paturi, Gary Yeung, Steven Lulich, Harish Arsikere, and Abeer Alwan, "Age-dependent height estimation and speaker normalization for children's speech using the first three subglottal resonances," Interspeech 2015, pp. 1665-1669

J. Kreiman, M. Garellek, G. Chen, A. Alwan and B. R. Gerratt, "Perceptual evaluation of voice source models," Journal of the Acoustical Society of America, Vol. 138 (1), July 2015, pp. 1-10

Jody Kreiman, Soo Jin Park, Patricia Keating, and Abeer Alwan, "The Relationship Between Acoustic and Perceived Intraspeaker Variability in Voice Quality," Interspeech 2015, pp. 2357-2360

Kantapon Kaewtip, Lee Ngee Tan, Abeer Alwan, and Charles Taylor, "Bird-Phrase Segmentation and Verification: a Noise Robust Template-Based Approach," ICASSP 2015, pp. 758-762

Abeer Alwan, Steven M. Lulich, Mitchell S. Sommers, "The Subglottal Resonances Database" LDC Catalog No.: LDC2015S03, ISBN: 1-58563-711-4, DOI: https://doi.org/10.35111/5wf0-c349, April 20, 2015

2014

Jinxi Guo, Angli Liu, Harish Arsikere, Abeer Alwan and Steven M. Lulich, "The relationship between the second subglottal resonance and vowel class, standing height, trunk length, and F0 variation for Mandarin speakers", Interspeech 2014, pp. 930-934

Harish Arsikere, H.A. Gupta and Abeer Alwan, "Speaker recognition via fusion of subglottal features and MFCCs", Interspeech 2014, pp. 1106-1110

Gang Chen, Soo Jin Park, Jody Kreiman and Abeer Alwan, "Investigating the effect of F0 and vocal intensity on harmonic magnitudes: Data from high-speed laryngeal videoendoscopy", Interspeech 2014, pp. 1668-1672. [Best student paper award finalist]

Gang Chen, Jody Kreiman, Abeer Alwan, "The glottaltopogram: a method of analyzing high-speed images of the vocal folds", Computer Speech and Language 28 (2014), pp. 1156-1169. [link to the journal article]

Abeer Alwan, Stefanie Shattuck-Hufnagel, and Maria-Gabriella DiBenedetto, "Obituary for Ken Stevens", Physics Today, April 2014.

Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana, "Glottal Source Processing: from Analysis to Applications", Computer Speech and Language, Special Issue on Glottal Source Processing 28 (5), 1117-1138, 2014.

Harish Arsikere and Abeer Alwan, "Frequency warping using subglottal resonances: complementarity with VTLN and robustness to additive noise", ICASSP 2014, pp 6354-6358.

H.A. Gupta, A. Raju and A. Alwan, "Non-Linear Dimension Reduction of Gabor Features for Noise-Robust ASR", ICASSP 2014, pp 1715-1719.

L. N. Tan and A. Alwan, "Feature Enhancement using Sparse Reference and Estimated Soft-Mask Exemplar-Pairs for Noisy Speech Recognition", ICASSP 2014, pp. 1729-1733.

2013

Harish Arsikere, Steven M. Lulich and Abeer Alwan, "Estimating Speaker Height and Subglottal Resonances Using MFCCs and GMMs," IEEE Signal Processing Letters, Vol 21, Issue 2, 2013, pp. 159-162.

M. Graciarena, A. Alwan, D. Ellis, H. Franco, L. Ferrer, J. Hansen, A. Janin, B.-S. Lee, Y. Lei, V. Mitra, N. Morgan, S. O. Sadjadi, T.J. Tsai, N. Scheffer, L. N. Tan, B. Williams, "All for One: Feature Combination for Highly Channel-Degraded Speech Activity Detection", Interspeech, Lyon, 2013, pp. 709-713.

G. Chen, M. Garellek, J. Kreiman, B. R. Gerratt, A. Alwan, "A perceptually and physiologically motivated voice source model", Interspeech 2013, pp. 2001-2005. [Best student paper award finalist] [slides and audio samples]

G. Chen, R. A. Samlan, J. Kreiman, A. Alwan, "Investigating the relationship between glottal area waveform shape and harmonic magnitudes through computational modeling and laryngeal high-speed videoendoscopy", Interspeech 2013, pp. 3216-3220. [poster]

Kantapon Kaewtip, Lee Ngee Tan, Abeer Alwan, "A Pitch-Based Spectral Enhancement Technique for Robust Speech Processing", Interspeech 2013, pp. 3284-3288.

Kantapon Kaewtip, Lee Ngee Tan, Abeer Alwan, Charles E. Taylor, "A robust automatic bird phrase classifier using dynamic time-warping with prominent region identification", ICASSP 2013, pp. 768-772.

Harish Arsikere, Steven M. Lulich and Abeer Alwan, "Non-linear frequency warping for VTLN using subglottal resonances and the third formant frequency," ICASSP 2013, pp. 7922-7926.

L. N. Tan and Abeer Alwan, "Multi-Band Summary Correlogram-based Pitch Detection for Noisy Speech", Speech Communication, Volume 55, Issues 7–8, September 2013, pp. 841-856. [link to the journal article] [Matlab code of MBSC pitch detector]

L. N. Tan, G. Kossan, M. L. Cody, C. E. Taylor, A. Alwan, "A Sparse Representation-based Classifier for In-set Bird Phrase Verification and Classification with Limited Training Data," ICASSP 2013, pp. 763-767.

Gang Chen, Jody Kreiman, Bruce Gerratt, Juergen Neubauer, Yen-Liang Shue, and Abeer Alwan, "Development of a glottal area index that integrates glottal gap size and open quotient," Journal of the Acoustical Society of America, Vol. 133, Issue 3, March 2013, pp. 1656–1666. [link to the journal article]

Harish Arsikere, Gary K.F. Leung, Steven M. Lulich, and Abeer Alwan, "Automatic estimation of the first three subglottal resonances from adults’ speech signals with application to speaker height estimation," Speech Communication, Vol. 55, pp. 51-70, 2013. [link to the journal article]


2012

Steven M. Lulich, John R. Morton, Harish Arsikere, Mitchell Sommers, Gary K. F. Leung, and Abeer Alwan, "Subglottal resonances of adult male and female native speakers of American English," Journal of the Acoustical Society of America, Volume 132, Issue 4, pp. 2592-2602 (2012).

Jody Kreiman, Yen-Liang Shue, Gang Chen, Markus Iseli, Bruce R. Gerratt, Juergen Neubauer, and Abeer Alwan, "Relationships among voice quality, harmonic amplitudes, open quotient, and glottal area waveform shape in sustained phonation," Journal of the Acoustical Society of America, Volume 132, Issue 4, pp. 2625-2632 (2012). [link to the journal article]

Jody Kreiman, Marc Garellek, Gang Chen, Abeer Alwan, and Bruce R. Gerratt, "Perceptual evaluation of source models," International Conference on Voice Physiology and Biomechanics, 2012

Harish Arsikere, Gary K.F. Leung, Steven M. Lulich and Abeer Alwan, "Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditions," Interspeech 2012. pp 4616-4619

Gang Chen, Yen-Liang Shue, Jody Kreiman, and Abeer Alwan, "Estimating the voice source in noise", Interspeech 2012, pp. 1600-1603.

L. N. Tan, K. Kaewtip, M. L. Cody, C. E. Taylor, and A. Alwan, "Evaluation of a Sparse Representation-Based Classifier For Bird Phrase Classification Under Limited Data Conditions", Interspeech 2012. pp 2522-2525

Wei Chu and Abeer Alwan, "FBEM: A Filter Bank EM Algorithm for the Joint Optimization Of Features and Acoustic Model Parameters In Bird Call Classification", ICASSP 2012, pp. 1993-1996.

Gang Chen, Jody Kreiman, and Abeer Alwan, "The Glottaltopograph: A Method of Analyzing High-Speed Images of the Vocal Folds", ICASSP 2012, pp. 3985-3988.

Harish Arsikere, Gary K.F. Leung, Steven M. Lulich and Abeer Alwan, "Automatic height estimation using the second subglottal resonance", ICASSP 2012, pp. 3989-3992.

Julien van Hout and Abeer Alwan, "A Novel Approach to Soft-Mask Estimation and Log-Spectral Enhancement For Robust Speech Recognition", ICASSP 2012, pp. 4105-4108.

W. Chu and Abeer Alwan, "SAFE: A Statistical Approach to F0 Estimation under Clean and Noisy Conditions," IEEE Trans. on Audio, Speech, and Language Processing, Volume 20, No. 3, pp. 933 - 944, March 2012.


2011


S. Lulich, A. Alwan, H. Arsikere, J. Morton, and M. Sommers, "Resonances and wave propagation velocity in the subglottal airways", Journal of the Acoustical Society of America, Volume 130, Issue 4, pp. 2108-2115, 2011.

B. J. Borgstrom and A. Alwan, "A Unified Framework for Designing Optimal STSA Estimators Assuming Additive Superposition of Speech and Noise", IEEE Trans. on Audio, Speech, and Language Processing, Vol. 19, No. 8, pp. 2579-2590, Nov. 2011.

T. Drugman and A. Alwan, "Joint Robust Voicing Detection and Pitch Estimation Based on Residual Harmonics," Interspeech 2011, pp 1973-1976

G. Chen, J. Kreiman, Yen-Liang Shue, and A. Alwan, "Acoustic Correlates of Glottal Gaps," Interspeech 2011, pp 2673-2676

S. Lulich, H. Arsikere, J. Morton, G. Leung, A. Alwan, and M. Sommers, "Analysis and automatic estimation of children's subglottal resonances," Interspeech 2011, pp 2817-2820

Harish Arsikere, Steven Lulich, and Abeer Alwan, "Automatic Estimation of the First Subglottal Resonance," Journal of the Acoustical Society of America (Express Letters), Vol. 129, Issue 5, pp. 197-203, May 2011.

W. Chu, D.T. Blumstein, "Noise robust bird song detection using syllable pattern-based hidden Markov models," ICASSP 2011, pp. 345-348.

Lee Ngee Tan and Abeer Alwan, "Noise-Robust F0 Estimation Using SNR-Weighted Summary Correlograms From Multi-Band Comb Filters," ICASSP 2011, pp. 4464-4467.

Harish Arsikere, Steven Lulich, and Abeer Alwan, "Automatic Estimation of the Second Subglottal Resonance from Natural Speech," ICASSP 2011, 4616 - 4619.

Bengt Borgstrom and Abeer Alwan, "Log-Spectral Amplitude Estimation With Generalized Gamma Distributions For Speech Enhancement," ICASSP 2011, pp. 4756-4759.

A. Alwan, J. Jiang and W. Chen, "Perception of place of articulation for plosives and fricatives in noise," Speech Communication, Vol. 53, Issue 2, pp. 195-209, Feb. 2011.

S. Panchapagesan and A. Alwan, "A study of acoustic-to-articulatory inversion of speech by analysis-by-synthesis using chain matrices and the Maeda articulatory model," J. Acoust. Soc. Am. Volume 129, Issue 4, pp. 2144-2162, 2011.

Joseph Tepperman, Sungbok Lee, Shrikanth (Shri) Narayanan, and Abeer Alwan, "A Generative Student Model for Scoring Word Reading Skills," IEEE Transactions On Audio, Speech, And Language Processing, Vol. 19, No. 2, pp 348-360, February 2011.


2010

B. J. Borgstrom and A. Alwan, "A Statistical Approach to Mel-Domain Mask Estimation for Missing-Feature ASR", IEEE Signal Processing Letters, Vol. 17, No. 11, pp. 941-944, Nov. 2010.

Y.-L. Shue, G. Chen, and A. Alwan, "On the Interdependencies between Voice Quality, Glottal Gaps, and Voice-Source related Acoustic Measures," Interspeech 2010, pp. 34-37.

G. Chen, X. Feng, Y.-L. Shue, and A. Alwan, "On Using Voice Source Measures in Automatic Gender Classification of Children's Speech," Interspeech 2010, pp. 673-676.

B. J. Borgstrom, P. H. Borgstrom, and A. Alwan, "Efficient HMM-Based Estimation of Missing Features, with Applications to Packet Loss Concealment," Interspeech 2010, pp. 2394-2397.

W. Chu and A. Alwan, "SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech," Interspeech 2010, pp. 2590-2593. [slides]

Y.-L. Shue and A. Alwan, "A new voice source model based on high-speed imaging and its application to voice source estimation," ICASSP 2010, pp. 5134-5137.

L. N. Tan, B. J. Borgstrom and A. Alwan, "Voice Activity Detection using Harmonic Frequency Components in Likelihood Ratio Test," ICASSP 2010, pp. 4466-4469. [Matlab code] [Speech/Non-speech label files for Aurora 2]

B. J. Borgstrom and A. Alwan, "HMM-Based Reconstruction of Unreliable Spectrographic Data for Noise Robust Speech Recognition", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, No. 5, pp 1612-1623, July 2010.

B. J. Borgstrom and A. Alwan, "Improved Speech Presence Probabilities Using HMM-Based Inference, with Applications to Speech Enhancement and ASR," Journal of Selected Topics in Signal Processing, Vol. 4, No. 5, 2010, pp. 808-815.

Y. Shue, S. Shattuck-Hufnagel, M. Iseli, S. Jun, N. Veilleux, and A. Alwan, "On the acoustic correlates of high and low nuclear pitch accents in American English," Speech Communication, 2010, Vol 52, No. 2, pp. 106-122.


2009

S. Wang, S. Lulich, and A. Alwan, "Automatic detection of the second subglottal resonance and its application to speaker normalization," J. Acoust. Soc. Am, 2009. Volume 126, Issue 6, pp. 3268-3277.

R. Scarborough, P. Keating, S. Mattys, T. Cho, and A. Alwan, "Optical Phonetics and Visual Perception of Lexical and Phrasal Stress in English," Language and Speech, 2009, Vol. 52, No. 2-3, pp. 135-175.

P. Price, J. Tepperman, M. Iseli, T. Duong, M. Black, S. Wang, C. K. Boscardin, M. Heritage, P. D. Pearson, S. Narayanan, and A. Alwan, "Assessment of emerging reading skills in young native speakers and language learners," Speech Communication, Volume 51, Issue 10, October 2009, pp. 968-984.

S. Wang, P. Price, Y.-H. Lee and A. Alwan, "Measuring children's phonemic awareness through blending tasks," SLaTE workshop 2009. pp 101-104

H. You and A. Alwan, "Temporal Modulation Processing of Speech Signals for Noise Robust ASR," Interspeech 2009, pp. 36-39.

Y.-L. Shue, J. Kreiman, and A. Alwan, "A Novel Codebook Search Technique for Estimating the Open Quotient," Interspeech 2009, pp. 2895-2898.

S. Wang, Y.-H. Lee and A. Alwan, "Bark-shift based nonlinear speaker normalization using the second subglottal resonance," Interspeech 2009, pp. 1619-1622.

W. Chu and A. Alwan, "A Correlation-Maximization Denoising Filter Used as an Enhancement Frontend for Noise Robust Bird Call Classification," InterSpeech 2009, pp. 2831-2834. [slides]

V. Mitra, B. Borgstrom, C. Espy-Wilson, and A. Alwan, "A Noise-type and level-dependent MPO-based speech enhancement architecture," InterSpeech 2009, pp. 2751-2754.

B. J. Borgstrom and A. Alwan, "Missing Feature Imputation of Log-Spectral Data For Noise Robust ASR," to appear, Workshop on DSP in Mobile and Vehicular Systems, 2009.

B. J. Borgstrom and A. Alwan, "Utilizing Compressibility in Reconstructing Spectrographic Data, with Applications to Noise Robust ASR," IEEE Signal Processing Letters, Vol. 16, Issue 5, pp. 398-401, 2009.

W. Chu and A. Alwan, "Reducing F0 Frame Error of F0 Tracking Algorithms Under Noisy Conditions with an Unvoiced/Voiced Classification Frontend," ICASSP 2009, pp. 3969-3972.

S. Panchapagesan and A. Alwan, "Frequency Warping for VTLN and Speaker Adaptation by Linear Transformation of Standard MFCC," Computer Speech and Language, Vol. 23, Issue 1, pp. 42-64, Jan. 2009.


2008 ↑Top

YL Shue and M Iseli, "The role of voice source measures on automatic gender classification," ICASSP, 2008.

A. Alwan, "Dealing with Limited and Noisy Data in ASR: A Hybrid Knowledge-Based and Statistical Approach," Keynote Speech at Interspeech 2008, pp. 11-15.

S. Panchapagesan and A. Alwan, "Vocal Tract Inversion by Cepstral Analysis-by-Synthesis using Chain Matrices," Interspeech 2008, pp. 2857-2860.

S. Wang, S.M. Lulich, and A. Alwan, "A reliable technique for detecting the second subglottal resonance and its use in cross-language speaker adaptation," Interspeech 2008, pp. 1717-1720.

Y. Shue, S. Shattuck-Hufnagel, M. Iseli, S. Jun, N. Veilleux, and A. Alwan, "Effects of Intonational Phrase Boundaries on Pitch-Accented Syllables in American English," Interspeech 2008, pp. 873-876. The Best Student Paper Award.

B. J. Borgstrom and A. Alwan, "HMM-Based Estimation of Unreliable Spectral Components for Noise Robust Speech Recognition," Interspeech 2008, pp. 1769-1772.

B. J. Borgstrom, A. Bernard, and A. Alwan, "Error Recovery - Channel Coding and Packetization," Chapter 8 in Automatic Speech Recognition on Mobile Devices and over Communication Networks, Springer-Verlag. Editors: Z.-H. Tan and B. Lindberg, pp. 163-185, 2008.

S. Wang, A. Alwan, and S. Lulich, "Speaker Normalization Based on Subglottal Resonances," ICASSP 2008, pp. 4277-4280.

B. J. Borgstrom and A. Alwan, "An Efficient Approximation of the Forward-Backward Algorithm to Deal With Packet Loss, With Applications to Remote Speech Recognition," ICASSP 2008, pp. 4425-4428.

B. J. Borgstrom and A. Alwan, "A Low Complexity Parabolic Lip Contour Model With Speaker Normalization For High-Level Feature Extraction in Noise Robust Audio-Visual Speech Recognition", IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, Vol. 38, No. 6, pp. 1273-1280, 2008.


2007 ↑Top

J. Tepperman, M. Black, S. Lee, A. Kazemzadeh, M. Gerosa, M. Heritage, A. Alwan, and S. Narayanan, "A Bayesian Network Classifier for Word-level Reading Assessment," InterSpeech 2007, pp. 2185-2188.

R. Scarborough, P. Keating, M. Baroni, T. Cho, S. Mattys, A. Alwan, E. Auer, L.E. Bernstein, "Optical Cues to the Visual Perception of Lexical and Phrasal Stress in English," UCLA Working Papers in Phonetics, no. 105, p.118-124.

S. Wang, P. Price, M. Heritage and A. Alwan, "Automatic Evaluation of Children's Performance on an English Syllable Blending Task", SLaTE workshop 2007. pp 120-123

S. Wang, X. Cui, and A. Alwan, "Speaker Adaptation with Limited Data using Regression-Tree based Spectral Peak Alignment", IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 8, pp. 2454-2464, Nov. 2007.

A. Alwan, Y. Bai, M. Black, L. Casey, M. Gerosa, M. Heritage, M. Iseli, B. Jones, A. Kazemzadeh, S. Lee, S. Narayanan, P. Price, J. Tepperman, and S. Wang, "A System for Technology Based Assessment of Language and Literacy in Young Children: the Role of Multiple Information Sources", IEEE Multimedia Signal Processing Workshop, Oct. 2007, pp. 26-30.

Y. Shue, M. Iseli, N. Veilleux, and A. Alwan, "Pitch Accent versus Lexical Stress: Quantifying Acoustic Measures Related to the Voice Source", Proceedings of Interspeech 2007, pp. 2625-2628, Belgium.

B. J. Borgstrom and A. Alwan, "A Packetization and Variable Bitrate Interframe Compression Scheme For Vector Quantizer-Based Distributed Speech Recognition," Proceedings of Interspeech 2007, pp. 578-581, Belgium.

J. Jiang, A. Alwan, E. Auer, P. Keating, and L. Bernstein, "Similarity structure in visual speech perception and optical phonetic signals", Perception and Psychophysics, Vol. 69, No. 7, pp. 1070-1083, October 2007.

X. Cui and A. Alwan, "Robust Speaker Adaptation by Weighted Model Averaging Based on the Minimum Description Length Criterion," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 2, pp. 652-660, Feb. 2007.

B. J. Borgstrom, M. van der Schaar, A. Alwan, "Rate Allocation for Non-Collaborative Multi-User Speech Communication Systems Based On Bargaining Theory", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 4, pp. 1156-1166, May 2007.

M. Iseli, Y.-L. Shue, A. Alwan, "Age, sex, and vowel dependencies of acoustical measures related to the voice source", Journal of the Acoustical Society of America, Vol. 121, Issue 4, pp. 2283-2295, April 2007.

H. You, A. Alwan, "A Statistical Acoustic Confusability Metric for Hidden Markov Models," IEEE ICASSP Proceedings, vol. 4, pp. 745-748, 2007.


2006 ↑Top

R. Scarborough, P. Keating, M. Baroni, T. Cho, S. Mattys, A. Alwan, E. Auer, L.E. Bernstein, "Optical Cues to the Visual Perception of Lexical and Phrasal Stress in English," Proceedings of the 3rd International Conference on Speech Prosody, May 2006.

B. J. Borgstrom, M. van der Schaar, A. Alwan, "Bargaining-Based Rate Allocation for Non-Collaborative Multi-User Speech Communication Systems", SiMPE workshop, 2006.

M. Iseli, Y.-L. Shue, M. Epstein, P. Keating, A. Alwan, "Voice Source Correlates of Prosodic Features in American English: a Pilot Study", Proceedings of ICSLP 2006, pp. 2226-2229.

Shizhen Wang, Xiaodong Cui and Abeer Alwan, "Rapid Speaker Adaptation Using Regression-Tree Based Spectral Peak Alignment", Proceedings of ICSLP 2006, 1479-1482.

S. Panchapagesan, "Frequency Warping by Linear Transformation of Standard MFCC", Proceedings of ICSLP 2006, pp. 397-400.

J. Tepperman, J. Silva, A. Kazemzadeh, H. You, S. Lee, A. Alwan, and S. Narayanan, "Pronunciation Verification of Children's Speech for Automatic Literacy Assessment," Proceedings of ICSLP 2006.

A. Kazemzadeh, J. Tepperman, J. Silva, H. You, S. Lee, A. Alwan, and S. Narayanan, "Automatic Detection of Voice Onset Time Contrasts for Use in Pronunciation Assessment," Proceedings of ICSLP 2006.

J. Xue, B. J. Borgstrom, J. Jiang, L. Bernstein, A. Alwan, "Acoustically-driven Talking Face Synthesis Using Dynamic Bayesian Networks", Proceedings of IEEE ICME 2006, pp. 1165-1168.

S. Panchapagesan and A. Alwan, "Multi-parameter Frequency Warping for VTLN by Gradient Search", IEEE ICASSP Proceedings, May 2006. pp 1181-1184

M. Iseli, Y. Shue, and A. Alwan, "Age- and Gender-Dependent Analysis of Voice Source Characteristics", IEEE ICASSP Proceedings, May 2006, pp. 389-392.

Li Deng, X. Cui, R. Pruvenok, J. Huang, S. Momen, Y. Chen, and A. Alwan, "A Database of Vocal Tract Resonance Trajectories for Research in Speech Processing", IEEE ICASSP Proceedings, May 2006, pp. 369-372.

X. Hu, M. Bergsneider, E. Rubinstein and A. Alwan, "Reduction of Compartment Compliance Increases Venous Flow Pulsatility and Lowers Apparent Vascular Compliance: Implications for Cerebral Blood Flow Hemodynamics," Medical Engineering and Physics, Vol. 28, Issue 4, May 2006, pp. 304-314.

X. Cui and A. Alwan, "Adaptation of Children's Speech with Limited Data Based on Formant-like Peak Alignment," Computer Speech and Language, Vol. 20, Issue 4, pp. 400-419, October 2006.

Jintao Jiang, Marcia Chen, Abeer Alwan, "On the perception of voicing in syllable-initial plosives in noise", Journal of the Acoustical Society of America, Volume 119, Issue 2, pp. 1092-1105, February 2006.


2005 ↑Top

J. Xue, J. Jiang, A. Alwan and L. Bernstein, "Consonant confusion structure based on machine classification of visual features in continuous speech," Audio-Visual Speech Processing Workshop 2005, Vancouver Island, Canada, pp. 103-108.

H. You, A. Alwan, A. Kazemzadeh and S. Narayanan, "Pronunciation Variation of Spanish-accented English Spoken by Young Children," Eurospeech 2005, pp. 749-752.

A. Kazemzadeh, H. You, M. Iseli, B. Jones, X. Cui, M. Heritage, P. Price, E. Anderson, S. Narayanan and A. Alwan, "TBALL Data Collection: the Making of a Young Children's Speech Corpus," Eurospeech 2005, pp. 1581-1584.

X. Cui and A. Alwan, "MLLR-Like Speaker Adaptation Based on Linearization of VTLN with MFCC features," Eurospeech 2005, pp. 273-276.

X. Cui and A. Alwan, "Noise Robust Speech Recognition Using Feature Compensation Based on Polynomial Regression of Utterance SNR," IEEE Transactions on Speech and Audio Processing, Vol. 13, Number 6, pp. 1161-1172, November 2005.


2004 ↑Top

S. Narayanan and A. Alwan, "Text to Speech Synthesis: New Paradigms and Advances," Pearson Education, Prentice Hall, August 2004.

H. You, Q. Zhu, and A. Alwan, "Entropy-based Variable Frame Rate Analysis of Speech Signals and Its Application to ASR," in Proc. ICASSP, pp. 549-552, Montreal, Canada, May 2004.

M. Iseli and A. Alwan, "An Improved Correction Formula for The Estimation of Harmonic Magnitudes and Its Application to Open Quotient Estimation," in Proc. ICASSP, pp. 669-672, Montreal, Canada, May 2004.

X. Cui and A. Alwan, "Combining Feature Compensation and Weighted Viterbi Decoding for Noise Robust Speech Recognition With Limited Adaptation Data," in Proc. ICASSP, pp. 969-972, Montreal, Canada, May 2004.


2003 ↑Top

P. Keating, M. Baroni, S. Mattys, R. Scarborough, A. Alwan, E. Auer, and L. Bernstein, "Optical Phonetics and Visual Perception of Lexical and Phrasal Stress in English," Proc. 15th International Congress of Phonetic Sciences, pp. 2071-2074, 2003.

M. Hasegawa-Johnson, S. Pizza, A. Alwan, J.S. Cha and K. Haker, "Vowel Category Dependence of the Relationship Between Palate Height, Tongue Height, and Oral Area," Journal of Speech, Language, and Hearing Research, Vol. 46, Issue 3, June 2003, pp. 738-739.

M.O. Rosa, J.C. Pereira, M. Grellet and A. Alwan, "A Contribution to Simulating a Three-dimensional Larynx Model Using the Finite Element Method," in JASA, Vol. 114, Issue 5, Nov. 2003, pp. 2893-2905.

Q. Zhu and A. Alwan, "Non-linear feature extraction for robust recognition in stationary and non-stationary noise," Computer Speech and Language, 17(4): 381-402, Oct. 2003.

X. Cui, Al. Bernard, and A. Alwan, "A Noise-Robust ASR Back-end Technique Based on Weighted Viterbi Recognition," in Proc. EUROSPEECH, Switzerland, pp. 2169-2172, Sept. 2003.

Z. AlBawab, I. Locher, J. Xue, and A. Alwan, "Speech Recognition over Bluetooth Wireless Channels," in Proc. EUROSPEECH, Switzerland, pp. 1233-1236, Sept. 2003.

W. Chen and A. Alwan, "Perception of Place of Articulation Feature for Plosives and Fricatives in Noise," in Proc. ICPhS, Barcelona, August 2003, pp. 195-209.

James J. Hant and Abeer Alwan, "A Psychoacoustic-Masking Model to Predict the Perception of Speech-Like Stimuli in Noise," Speech Communication, Vol. 40, May 2003, pp. 291-313.

H.F. Chi, S.X. Gao, S.D. Soli, and A. Alwan, "Band-limited Feedback Cancellation with a Modified Filtered-X LMS Algorithm for Hearing Aids," special issue of Speech Communication on Signal Processing for Hearing Aids, Vol. 39, Issues 1-2, Jan. 2003, pp. 147-161.


2002 ↑Top

A. Bernard and A. Alwan, "Low-bitrate Distributed Speech Recognition for Packet-based and Wireless Communication", IEEE Transactions on Speech and Audio Processing, Vol. 10, Number 8, pp. 570-580, Nov. 2002.

M. Hasegawa-Johnson and A. Alwan, "Speech Coding: Fundamentals and Applications," a chapter in the Wiley Encyclopedia of Telecommunications, Wiley, Editor: Prof. John Proakis, December 2002, Vol. 5, pp. 2340-2359.

J. Jiang, A. Alwan, P.A. Keating, E.T. Auer, and L.E. Bernstein, "On the relationship between face movements, tongue movements, and speech acoustics," special issue of EURASIP Journal on Applied Signal Processing on joint audio-visual speech processing, Nov. 2002, pp. 1174-1188.

X. Hu, A. Alwan, V.I. Nenov, E.H. Rubinstein, and M. Bergsneider, "Estimating brain compliance based on a novel model of intracranial cerebrospinal fluid dynamics," EMBS-BMES Proceedings, Houston, Texas, Oct. 2002.

Qifeng Zhu and Abeer Alwan, "The Effect of Additive Noise on Speech Amplitude Spectra: a Quantitative Approach," IEEE Signal Processing Letters, Vol. 9, Issue 9, Sept. 2002, pp. 275-277.

Brian Gabelman and Abeer Alwan, "Analysis and Synthesis of Amplitude Modulation Components in Pathological Voices," Proc. IEEE 2002 Workshop on Speech Synthesis, Santa Monica.

A. Bernard and A. Alwan, "CHANNEL NOISE ROBUSTNESS FOR LOW-BITRATE REMOTE SPEECH RECOGNITION," ICSLP Proceedings, Denver, Colorado, Sep. 2002, Vol. 3, pp. 2213-2216.

X. Cui, M. Iseli, Q. Zhu, and A. Alwan, "EVALUATION OF NOISE ROBUST FEATURES ON THE AURORA DATABASES," ICSLP Proceedings, Denver, Colorado, Sep. 2002, Vol. 1, pp. 481-484.

A. Bernard, X. Liu, R. Wesel and A. Alwan, "Speech transmission using rate-compatible trellis codes and embedded source coding," IEEE Transactions on Communications, vol. 50, (no. 2), IEEE, Feb. 2002, pp. 309-320.

J. Jiang, A. Alwan, L.E. Bernstein, E.T. Auer, and P.A. Keating, "PREDICTING FACE MOVEMENTS FROM SPEECH ACOUSTICS USING SPECTRAL DYNAMICS," Proc. ICME 2002, Lausanne, Switzerland, pp. 181-184.

J. Jiang, A. Alwan, L. Bernstein, E. Auer, and P. Keating, "Similarity structure in perceptual and physical measures for visual consonants across talkers," Proc. IEEE ICASSP, 2002, Orlando, pp. 441-444.

X. Cui and A. Alwan, "Efficient Adaptation Text Design Based On The Kullback-Leibler Measure," Proc. IEEE ICASSP, 2002, Orlando, pp. 613-616.

B. Gabelman and A. Alwan, "ANALYSIS BY SYNTHESIS OF FM MODULATION AND ASPIRATION NOISE COMPONENTS IN PATHOLOGICAL VOICES," Proc. IEEE ICASSP, 2002, Orlando, pp. 449-452.


2001 ↑Top

A. Alwan, Q. Zhu, and J. Lo, "Human and Machine Recognition of Speech Sounds and Noise," Invited paper, Proc. of the World Multiconference on Systems, Cybernetics, and Information, Vol. XIII, pp. 218-223, Florida, Aug. 2001.

Brian Strope and Abeer Alwan, "Modeling the Perception of Pitch-Rate Amplitude Modulation in Noise", in "Computational Models of Auditory Function", a book edited by Steve Greenberg and Malcolm Slaney, pp. 315-327, IOS Press, NATO Science Series, Netherlands, 2001.

A. Bernard and A. Alwan, "Joint channel decoding - Viterbi recognition for wireless applications," Proc. EUROSPEECH 2001, Aalborg, Denmark, Vol. 4, pp. 2703-2706.

Q. Zhu, X. Cui, M. Iseli and A. Alwan, "Noise Robust Feature Extraction for ASR using the Aurora 2 Database," Proc. EUROSPEECH 2001, Aalborg, Denmark, Vol. 1, pp. 185-188.

M. Chen and A. Alwan, "On the Perception of Voicing for Plosives in Noise," Proc. EUROSPEECH 2001, Aalborg, Denmark, Vol. 1, pp. 175-178.

J. Jiang, A. Alwan, E. Auer, and L. Bernstein, "Predicting visual consonant perception from physical measures," Proc. EUROSPEECH 2001, Aalborg, Denmark, Vol. 1, pp. 179-182.

L. Bernstein, J. Jiang, A. Alwan, and E. Auer, "Visual phonetics and optical phonetics," Proc. AVSP 2001, Scheelsminde, Denmark, pp. 104-109.

A. Bernard and A. Alwan, "Source and channel coding for remote speech recognition over error-prone channel," Proc. of ICASSP 2001, Vol. 4, pp. 2613-2616.

Q. Zhu and A. Alwan, "An efficient and scalable 2D DCT-based feature coding scheme for remote speech recognition," Proc. ICASSP 2001, Vol. 1, pp. 113-116.


2000 ↑Top

Q. Zhu and A. Alwan, "Amplitude Demodulation of Speech Spectra and its Application to Noise Robust Speech Recognition," 6th International Conference on Spoken Language Processing, ICSLP 2000, Vol. 1, pp. 341-344.

W. Chen and A. Alwan, "Place of Articulation Cues for Voiced and Voiceless Plosives and Fricatives in Syllable-Initial Position," 6th International Conference on Spoken Language Processing, ICSLP 2000, Vol. 4, pp. 113-116.

J. Hant and A. Alwan, "Predicting the Perceptual Confusion of Synthetic Stop Consonants in Noise," 6th International Conference on Spoken Language Processing, ICSLP 2000, Vol. 3, pp. 941-944.

J. Jiang, A. Alwan, L. Bernstein, P. Keating, and E. Auer, "On the Correlation between Facial Movements, Tongue Movements and Speech Acoustics," 6th International Conference on Spoken Language Processing, ICSLP 2000, Vol. 1, pp. 42-45.

M. Iseli and A. Alwan, "Inter- and Intra-speaker Variability of Glottal Flow Derivative using the LF Model," 6th International Conference on Spoken Language Processing, ICSLP 2000, Vol. 1, pp. 477-480.

M. Siqueira and A. Alwan, "Steady-state analysis of continuous adaptation in acoustic feedback reduction systems for hearing aids," IEEE Transactions on Speech and Audio Processing, Vol. 8, No. 4, pp. 443-453, July 2000.

Espy-Wilson, C.Y.; Boyce, S.E.; Jackson, M.; Narayanan, S.; and A. Alwan, "Acoustic modeling of American English /r/," Journal of the Acoustical Society of America (JASA), July 2000, Vol. 108, (no. 1): 343-56.

Srinivasamurthy, N.; Ortega, A.; Zhu, Q.; Alwan, A. "Towards efficient and scalable speech compression schemes for robust speech recognition applications," 2000 IEEE International Conference on Multimedia and Expo (ICME) Proceedings. Latest Advances in the Fast Changing World of Multimedia, NY, 30 July-2 Aug. 2000. IEEE Press, Vol. 1, pp. 249-52.

Q. Zhu and A. Alwan, "On the use of variable frame rate analysis in speech recognition," Proc. IEEE ICASSP, Istanbul, Turkey, Vol. III, pp. 1783-1786, June 2000.

S. Narayanan and A. Alwan, "Noise Source models for fricative consonants," IEEE Transactions on Speech and Audio Processing, Vol. 8, No. 3, pp. 328-344, May 2000.


1999 ↑Top

A. Alwan, "Modeling speech production and perception mechanisms and their applications to synthesis, recognition, and coding," Fifth International Symposium on Signal Processing and its Applications, Proceedings, Brisbane, Qld, Australia, 1999, Vol. 1, pp 7

J. Hant and A. Alwan, "Modeling the masking of Formant Transitions in Noise," Proc. of Eurospeech 99, Budapest, Hungary, Vol. 4, pp. 1895-1898. This paper was one of three papers nominated for the best student paper award in Speech Communication at Eurospeech '99.

A. Alwan, P. Bangayan, B. Garrett, J. Kreiman, and C. Long, "Analysis by synthesis of pathological voices," an invited chapter in the book Voice Quality Measurement, R. Kent ed., pp. 307-335, Singular Publishing Group, 1999.

A. Bernard and A. Alwan, "Perceptually Based and Embedded Wideband CELP Coding of Speech," Proc. of Eurospeech 1999, Budapest, Hungary, Vol. 4, pp. 1543-1546.

A. Alwan, S. Narayanan, B. Strope, and A. Shen, "Speech production and perception models and their applications to synthesis, recognition, and coding," an invited chapter in the book Speech Processing, Recognition, and Artificial Neural Networks, Chollet, DiBenedetto, Esposito, and Marinaro ed., pp. 138-161, Springer-Verlag, UK, 1999.

A. Alwan, J. Lo, Q. Zhu, "Human and Machine Recognition of Nasal Consonants in Noise," Proceedings of the 14th International Congress of Phonetic Sciences, Vol. 1, pp. 167-170, August 1999, San Francisco.

A. Bernard, X. Liu, R. Wesel and A. Alwan, "Embedded Joint Source-Channel Coding of Speech using Symbol Puncturing in Trellis Code," Proceedings of ICASSP 99, Vol. 5, pp. 2427-2430, Phoenix, March 1999.

M. Siqueira and A. Alwan, "Bias Analysis in Continuous Adaptation Systems for Hearing Aids," Proceedings of ICASSP 99, Vol. 2, pp. 925-928, Phoenix, AZ, March 1999.


1998 ↑Top

A. Bernard, X. Liu, R. Wesel and A. Alwan, "Channel Adaptive Joint Source-Channel Coding of Speech," Proc. of the 32nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, November 1998, vol. 1, pp. 357-361.

M. Siqueira and A. Alwan, "Steady-State Analysis of Continuous Adaptation Systems for Hearing Aids with a Delayed Cancellation Path," Proc. of the 32nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, November 1998. Piscataway, NJ, USA: IEEE, 1998, vol. 1, pp. 518-22.

J. Hant, B. Strope, and A. Alwan, "Variable-duration notched-noise experiments in a broadband noise context," Journal of the Acoustical Society of America, Oct. 1998, vol. 104, No. 4, pp. 2451-2456.

B. Strope and A. Alwan, "Modeling the perception of pitch-rate amplitude modulation in noise," Proc. of the NATO ASI on Computational Hearing, pp. 117-122, July 1998.

B. Strope and A. Alwan, "Amplitude Modulation Cues for Detecting Voicing Distinctions in Noise," Proceedings of the ICA/ASA Conference, Seattle, pp. 209-210, June 1998.

J. Hant, B. Strope, and A. Alwan, "Variable-Duration Notched-Noise Experiments in a Noise Context," Proceedings of the ICA/ASA Conference, Seattle, pp. 869-870, June 1998.

Amit Rane, Derrick C. Wei, Lisa E. Falkson and A. Alwan, "Modeling the Transitory Behavior of Speech Using a Time-Varying Transmission Line Model," Proceedings of the ICA/ASA Conference, Seattle, pp. 261-262, June 1998.

B. Gabelman, J. Kreiman, B. Gerratt, N. Antonanzas-Barroso, and A. Alwan, "Perceptually motivated modeling of noise in pathological voices," Proceedings of the ICA/ASA Conference, Seattle, pp. 1293-1294, June 1998.

B. Gerratt, J. Kreiman, N. Antonanzas-Barroso, B. Gabelman, and A. Alwan, "Source Modeling of Severely Pathological Voices," Proceedings of the ICA/ASA Conference, Seattle, pp. 1271-1272, June 1998.

B. Strope and A. Alwan, "Robust Word Recognition Using Threaded Spectral Peaks," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seattle, Vol. II, pp. 625-629, May 1998.

Bergsneider M; Alwan AA; Falkson L; Rubinstein EH, "The relationship of pulsatile cerebrospinal fluid flow to cerebral blood flow and intracranial pressure: a new theoretical model," Acta Neurochirurgica, Supplementum, 1998, 71:266-8.


1997 ↑Top

P. Bangayan, C. Long, A. Alwan, J. Kreiman, B. Gerratt, "Analysis by synthesis of pathological voices using the Klatt synthesizer," Speech Communication, Vol. 22, No. 4, 1997, pp. 343-368.

B. Strope and A. Alwan, "Modeling auditory perception to improve robust speech recognition," Proceedings of the 31st Asilomar Conference on Signals, Systems, and Computers, IEEE Comput. Soc, 1997, Vol. 2, pp. 1056-1060.

M. Siqueira, A. Alwan, R. Speece, "Steady-State Analysis of Continuous Adaptation Systems in Hearing Aids," Proceedings of the IEEE workshop on Audio and Electroacoustics, Mohonk, October 1997.

S. Narayanan, A. Alwan, and Y. Song, "New Results in Vowel Production: MRI and EPG data," Proceedings of Eurospeech, Vol. 2, pp. 1007-1009, Patras, Greece, September 1997.

S. Roweis and A. Alwan, "Towards articulatory speech recognition," Proceedings of Eurospeech, Vol. 3, pp. 1227-1230, Patras, Greece, September 1997.

C. Espy-Wilson, S. Narayanan, S. Boyce, and A. Alwan, "Acoustic Modeling of American English /r/," Proceedings of Eurospeech, Vol. 1, pp. 393-396, Patras, Greece, September 1997.

B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 5, pp. 451-464, September 1997.

J. Hant, B. Strope, and A. Alwan, "A psychoacoustic model for the noise masking of plosive bursts," JASA, Vol. 101, No. 5, pp. 2789-2802, May 1997.

B. Tang, A. Shen, A. Alwan, and G. Pottie, "A Perceptually-Based Embedded Subband Speech Coder," IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 2, pp. 131-140, March 1997.

S. Narayanan, A. Alwan, and K. Haker, "Towards articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part I: The laterals," JASA, Vol. 101, No. 2, pp. 1064-1077, February 1997.

A. Alwan, S. Narayanan, and K. Haker, "Towards articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part II: The rhotics," JASA, Vol. 101, No. 2, pp. 1078-1089, February 1997.

M. Siqueira, R. Speece, V. Petsalis, A. Alwan, S. Soli and S. Gao, "Subband Adaptive Filtering Applied to Acoustic Feedback Reduction in Hearing Aids," Proceedings of the 30th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, 3-6 Nov 1996, Vol. 1, pp. 788-792.

Bergsneider, M., Alwan, A., and Rubinstein, E., "Venous blood flow pulsatility and impedance increases as compartment compliance decreases," Proc. of the Tenth International ICP Symposium, 1997, Marmarou et al. ed., Acta Neurochirurgica Supplement Vol. 71, p. 417, Springer-Verlag.


1996 ↑Top

J. Hant, B. Strope, and A. Alwan, "A Psychoacoustic Model for the Noise Masking of Voiceless Plosive Bursts," Proceedings of the Int. Conf. Spoken Lang. Processing (ICSLP), Philadelphia, pp. 570-573, October 1996.

P. Bangayan, A. Alwan, and S. Narayanan, "From MRI and Acoustic Data to Articulatory Synthesis: a Case Study of the Laterals," ICSLP Proc., Philadelphia, pp. 793-796, October 1996.

S. Narayanan, A. Kaun, D. Byrd, P. Ladefoged, and A. Alwan, "Liquids in Tamil," ICSLP Proc., Philadelphia, pp. 797-800, October 1996.

B. Strope and A. Alwan, "A Model of Dynamic Auditory Perception and its Application to Robust Speech Recognition," Proc. of the IEEE Int. Conf. Acous. Speech Sig. Proc., Vol. I, pp. 37-40, Atlanta, May 1996.

S. Narayanan and A. Alwan, "Parametric Hybrid Source Models for Voiced and Voiceless Fricative Consonants," ICASSP 96 Proceedings, Vol. I, pp. 337-340, Atlanta, May 1996.

Alwan, A.; Bagrodia, R.; Bambos, N.; Gerla, M.; and others, "Adaptive Mobile Multimedia Networks," IEEE Personal Communications, vol. 3, (no. 2): 34-51, April 1996.

S. Narayanan and A. Alwan, "Imaging Applications in Speech Production Research," SPIE 96 Medical Imaging Proceedings, 2709, 120-131, Newport Beach, Feb. 96 (Invited).


1995 ↑Top

A. Alwan, S. Narayanan, B. Strope, and A. Shen, "Speech Production and Perception Models and their Applications to Synthesis, Recognition, and Coding," Proc. of the Int. Symp. Sig. Sys. and Elec. (ISSSE), pp. 367-372, October 1995 (Invited).

S. Narayanan, A. Alwan, and K. Haker, "An Articulatory Study of Fricative Consonants using MRI," JASA, Vol. 98(3), pp. 1325-1347, September 1995.

S. Narayanan, A. Alwan, and K. Haker, "An Articulatory Study of Liquid Consonants in American English," Proc. of the Int. Con. of Phon. Sci. (ICPhS), Stockholm, Sweden, Vol. 3, pp. 576-579, August 1995.

A. Alwan, P. Bangayan, J. Kreiman, and C. Long, "Time and Frequency Synthesis Parameters for Severe Pathological Voice Qualities," Proc. of ICPhS, Stockholm, Sweden, Vol. 2, pp. 250-253, August 1995.

M. Siqueira, A. Alwan, and P. Diniz, "Finite Precision Analysis of the Fast QRD-RLS Lattice Algorithm," Proc. of the IEEE Intl. Symp. Ckts. Sys. (ISCAS), Vol. 3, pp. 1616-1619, Seattle, WA, May 1995.

A. Shen, B. Tang, A. Alwan, and G. Pottie, "A Robust and Variable-Rate Speech Coder," Proc. of the IEEE Int. Conf. Acous. Speech Sig. (ICASSP) 95, Vol. I, pp. 249-252, Detroit, May 1995.

B. Tang, A. Shen, G. Pottie, and A. Alwan, "Spectral Analysis of Subband Filtered Signals," Proc. of the IEEE ICASSP 95, Vol. II, pp. 1324-1327, Detroit, May 1995.

B. Strope and A. Alwan, "A Novel Structure to Compensate for Frequency-Dependent Loudness Recruitment of Sensorineural Hearing Loss," Proc. of the IEEE ICASSP 95, Vol. V, pp. 3539-3542, Detroit, May 1995.

S. Narayanan and A. Alwan, "A Nonlinear Dynamical Systems Analysis of Fricative Consonants," JASA, Vol. 97, No. 4, pp. 2511-2524, April 1995.


1994 ↑Top

S. Narayanan, A. Alwan, and K. Haker, "An MRI Study of Fricative Consonants," Proc. of the Intl. Conf. Spoken Lang. Processing (ICSLP), Japan, Vol. 2, pp. 627-630, September 1994.

M. Siqueira, P. Diniz and A. Alwan, "Infinite Precision Analysis of the Fast QR Decomposition RLS Algorithm," Proc. IEEE ISCAS, London, pp. 293-296, June 1994.

Z. Jiang, A. Alwan, and A. Willson, "High-Performance IIR QMF Banks for Speech Subband Coding," Proc. IEEE ISCAS, London, pp. 493-496, June 1994.

M. Siqueira and A. Alwan, "New Techniques for Adaptive Filtering Applied to Speech Echo Cancelation," Proc. IEEE ICASSP, Australia, pp. 265-268, April 1994.

Before 1994 ↑Top

S. Narayanan and A. Alwan, "Strange Attractors and Chaotic Dynamics in the Production of Voiced and Voiceless Fricatives," Proc. Eurospeech, Vol. I, pp. 77-80, Berlin, September 1993.

A. Alwan, "A Perceptual Metric for Masking," Proc. IEEE ICASSP, Vol. 2, pp. 712-715, April 1993.

A. Alwan, "The role of F3 and F4 in identifying the place of articulation for stop consonants," ICSLP Proc., Vol. 2, pp. 1063-1066, Canada, Oct. 1992.

A. Alwan, "Modeling Speech Perception in Noise: a Case Study of the Place of Articulation Feature," ICPhS Proc., Vol. 2, pp. 78-81, France, August 1991.

A. Alwan, "Perceptual cues for place of articulation for the voiced pharyngeal and uvular consonants," JASA, Vol. 86, No. 2, pp. 549-556, August 1989.

Published Abstracts ↑Top

Kantapon Kaewtip, and Abeer Alwan, "A flexible discriminative approach to automatic phone and broad phonetic group classification", The Journal of the Acoustical Society of America 141, 3468 (2017)

Kantapon Kaewtip, Abeer Alwan, and Charles Taylor, "Robust Hidden Markov Models for limited training data for birdsong phrase classification", The Journal of the Acoustical Society of America 141, 3725 (2017)

Gary Yeung, Steven M. Lulich, Asterios Toutios, Abeer Alwan, and Amber Afshan, "Analysis of children’s high front vowel area function using three-dimensional ultrasound imaging", The Journal of the Acoustical Society of America 140, 3448 (2016)

Abeer Alwan, Steven Lulich, and Harish Arsikere, "The role of subglottal resonances in speech processing algorithms", The Journal of the Acoustical Society of America 137, 2327 (2015)

Jody Kreiman, Patricia A. Keating, Soo Jin Park, Shaghayegh Rastifar, and Abeer Alwan, "Within- and between-talker variability in voice quality in normal speaking situations", The Journal of the Acoustical Society of America 137, 2418 (2015)

Marc Garellek, Gang Chen, Bruce R. Gerratt, Abeer Alwan, and Jody E. Kreiman, "Perceptual importance of time-domain features of the voice source", The Journal of the Acoustical Society of America 135, 2422 (2014)

Marc Garellek, Gang Chen, Bruce R. Gerratt, Abeer Alwan, and Jody Kreiman, "Perceptual differences among models of the voice source: Further evidence", The Journal of the Acoustical Society of America 136, 2295 (2014)

Gang Chen, Soo-Jin Park, Jody Kreiman, and Abeer Alwan, "On transition between voice registers: Data from high-speed laryngeal videoendoscopy", The Journal of the Acoustical Society of America 135, 2290 (2014)

Mitchell Sommers, Abeer Alwan, and Steven Lulich, "The role of subglottal acoustics in speech production and perception", The Journal of the Acoustical Society of America 136, 2259 (2014)

John Morton, Mitchell Sommers, Steven Lulich, Abeer Alwan, and Harish Arsikere, "Acoustic features mediating height estimation from human speech", The Journal of the Acoustical Society of America 134, 4072 (2013)

Gang Chen, Marc Garellek, Jody Kreiman, Bruce R. Gerratt, and Abeer Alwan, "A physiologically and perceptually motivated voice source model", The Journal of the Acoustical Society of America 133, 3521 (2013)

Harish Arsikere, and Abeer Alwan, "Speaker normalization in noisy environments using subglottal resonances", The Journal of the Acoustical Society of America 134, 4075 (2013)

Hitesh A. Gupta, Anirudh Raju, and Abeer Alwan, "The effect of non-linear dimension reduction on Gabor filter bank feature space", The Journal of the Acoustical Society of America 134, 4069 (2013)


Anirudh Raju and A. Alwan, "The effect of speaking rate, vowel context, and speaker intelligibility on the perception of consonant vowel consonants in noise", J. Acoust. Soc. Am. 134, 4031, 2013.

Hong You and Abeer Alwan, "The role of temporal modulation processing in speech/non-speech discrimination tasks," J. Acoust. Soc. Am. Volume 127, Issue 3, pp. 1817-1817, 2010

Markus Iseli, Yen-Liang Shue, and Abeer Alwan, "Analysis of vowel and speaker dependencies of source harmonic magnitudes in consonant-vowel utterances," JASA 117, 2619, 2005

Jianxia Xue, Abeer Alwan, Jintao Jiang, and Lynne E. Bernstein, "Phoneme clustering based on segmental lip configurations in naturally spoken sentences," JASA 117, 2573, 2005

Jianxia Xue, Abeer Alwan, Edward T. Auer, Jr., and Lynne E. Bernstein, "On audio-visual synchronization for viseme-based speech synthesis," J. Acoust. Soc. Am. 116, 2480, 2004

Markus Iseli and Abeer Alwan, "An improved correction formula for the estimation of voice source harmonic magnitudes," JASA 115, 2610, 2004

Jul Setsu Cha and Abeer Alwan, "On the acoustic effects of piriform recesses in speech production," J. Acoust. Soc. Am. 112, 2445, 2002

Brian C. Gabelman, Jody Kreiman, Bruce R. Gerratt, and Abeer Alwan, "Synthesis of nonperiodic features of pathological voices," JASA, May 2001, Vol. 109, Issue 5, p. 2416

Abeer Alwan, "The 'noisy' speech chain," JASA, Dec. 2000, Vol. 108, Issue 5, pp. 2626-2627

J. Jiang, A. Alwan, P. Keating, L. Bernstein, and E. Auer, "On the correlation between articulatory and acoustic data," JASA, Dec. 2000, Vol. 108, Issue 5, p. 2508

Patricia A. Keating, Taehong Cho, Marco Baroni, Sven Mattys, Lynne E. Bernstein, Brian Chaney, and Abeer Alwan, "Articulation of word and sentence stress," JASA, Dec. 2000, Vol. 108, Issue 5, p. 2466

J. Jiang, A. Alwan, P. Keating, and L. Bernstein, "On the correlation between orofacial movements, tongue movements, and speech acoustics," JASA, May 2000, Vol. 107, Issue 5, p. 2904

M. Chen and A. Alwan, "On the perception of voicing for plosives in noise," JASA, May 2000, Vol. 107, Issue 5, p. 2917

Lynne E. Bernstein, Edward T. Auer, Jr., Brian Chaney, Abeer Alwan, and Patricia A. Keating, "Development of a facility for simultaneous recordings of acoustic, optical (3-D motion and video), and physiological speech data," JASA, May 2000, Vol. 107, Issue 5, p. 2887

C. Espy-Wilson, S. Boyce, M. Jackson, A. Alwan and S. Narayanan, "Modeling the subglottal space for American English /r/," JASA, September 1998, Vol. 104, Issue 3, p. 1819

B. Gerratt, J. Kreiman, N. Antonanzas-Barroso, B. Gabelman, and A. Alwan, "Source modeling of severely pathological voices," JASA, May 1998, Vol. 103, Issue 5, p. 2892

B. Gabelman, J. Kreiman, B. Gerratt, N. Antonanzas-Barroso, and A. Alwan, "LF source model adequacy for pathological voices," JASA, Nov. 1997, Vol. 102, Issue 5, p. 32

C. Espy-Wilson, S. Narayanan, A. Alwan, and S. Boyce, "Modeling the acoustics of American English /r/," JASA, May 1997, Vol. 101, Issue 5, p. 3176

B. Strope and A. Alwan, "Dynamic auditory representations and statistical speech recognition: Threading spectral peaks for robust recognition," Proc. of the Acoustical Societies of Amer. and Japan, Vol. 100, No. 4, 2788, Dec. 1996. This paper received the best student paper award in Speech Communication at the ASA meeting.

P. Bangayan, A. Alwan, and S. Narayanan, "A transmission-line model of the lateral approximants", Proc. of the Acous. Societies of Amer. and Japan, Vol. 100, No. 4, Dec. 1996.

J. Hant, B. Strope, and A. Alwan, "Predicting noise-masked thresholds of plosive bursts," 4th Lake Arrowhead Conference on Issues in Advanced Hearing Aid Research, May 1996.

R. Speece, A. Alwan, M. Siqueira, S. Soli, S. Gao. "An Analysis of the Acoustic Feedback Path Transfer Function in Hearing Aids." Lake Arrowhead 4th Conference on Issues in Advanced Hearing Aid Research, May 1996.

B. Gabelman, J. Kreiman, B. Gerratt, and A. Alwan, "Optimization for source waveform synthesis of pathological voices," JASA, April 1996, Vol. 99, Issue 4, p. 2549

J. Hant, B. Strope, and A. Alwan, "Durational Effects on Masked Thresholds in Noise as a Function of Signal Frequency, Bandwidth, and Type," Proc. of the Acous. Soc. of Amer. (ASA), Vol. 98, No. 5, 2908, Nov. 1995.

B. Strope and A. Alwan, "A First-Order Model of Dynamic Auditory Perception," Proc. NIH Hearing Aid Research and Development Workshop, September 1995.

A. Alwan, M. Siqueira, S. Soli, and S. Gao, "An Analysis of the Acoustic Path Transfer Function in Hearing Aids," Proc. NIH Hearing Aid Research and Development Workshop, September 1995.

J. Saade, F. Zeng, J. Wygonski, R. Shannon, S. Soli, and A. Alwan, "Quantitative measures of envelope cues in speech recognition," Proc. ASA, June 1995.

S. Narayanan, A. Alwan, and K. Haker, "Three dimensional tongue shapes of sibilant fricatives," JASA, Vol. 96, (5), 3342 (A), Nov. 1994.

B. Strope and A. Alwan, "Mapping of Constant Loudness Contours with Filter Mixtures in Digital Hearing Aids," Lake Arrowhead Conference on Hearing Aid Research, June 1994.

P. Bangayan, A. Alwan, J. Kreiman, and C. Long, "Synthesis of Severely Pathological Voices," JASA, Vol. 95, No. 5, 1pSP5, May 1994.

C. Long, P. Bangayan, and A. Alwan, "Acoustic Analysis and Synthesis of Pathological Voice Qualities," JASA, Vol. 93, No. 3, Pt. 2, 2aSP9, Oct. 1993. This paper received the best student paper award in Speech Communication at the ASA meeting.


M.S. Theses in Electrical Engineering ↑Top

Yunzheng Zhu, "Towards Better Automatic Speech Recognition Systems for Children," February 2023

Anirudh Raju, "The effect of speaking rate and vowel context on the perception of consonants in babble noise"

Julien van Hout, "Low Complexity Spectral Imputation for Noise Robust Speech Recognition," May 2012

Yi-Hui Lee, "An exploration study of the effect of voice quality on subglottal resonances," June 2010

Sankaran Panchapagesan, "Modeling the Production of /l/ Based on MRI data," March 2003

Ivo Locher, "Design and Implementation of iBadge and its Distributed Speech Processing Capability", September 2002

Jul Setsu Cha, "Articulatory Speech Synthesis of Female and Male Talkers," December 2001.

Vladimir Teplitsky, "A Noise Robust Speech Enhancement Algorithm for Cochlear Hearing Loss," September 2001.

Marcia Chen, "Perception of Voicing for Syllable-Initial Plosives in Noise," June 2001.

Willa Chen, "Perception of Place of Articulation for Syllable-Initial Consonants in Noise," June 2001.

Steve Chen, "Segregated and Redundant Hidden-Markov Models for Alphabet Recognition in Cars," September 1999.

Alexis Bernard, "Source-Channel Coding of Speech," December 1998.

Lisa Falkson, "A circuit model for studying the dynamics of the intracranial compartment," July 1998.

Jeff Lo, "Perception and recognition of nasal consonants in quiet and in noise," July 1998.

Amit Rane, "Forward and Inverse Mapping of the Vocal Tract," July 1998.

Vaggelis Petsalis, "Automatic speech recognition of isolated digits in noise," January 1997.

Wayne Bayever, "Design and implementation of a formant vocoder," September 1996.

Philbert Bangayan, "A transmission-line model of /l/ based on MRI-derived data," September 1996.

James Hant, "A psychoacoustic model to predict the noise masking of plosive bursts," June 1996.

Yong Song, "Finite time-difference simulations of speech production," July 1995.

Brian Strope, "A model of dynamic auditory perception and its application to robust speech recognition," June 1995.

Albert Shen, "Perceptually-based subband coding of speech signals," June 1994


Ph.D. Dissertations in Electrical Engineering ↑Top

Jinhan Wang, "Towards Better and Privacy-Preserving Speech Modeling for Depression Detection", June 2024.

Ruchao Fan, "Improving the Accuracy and Inference Efficiency for Low-resource Automatic Speech Recognition", March 2024.

Alexander Johnson, "Towards Inclusive Low-Resource Speech Technologies: A Case Study of Educational Systems for African American English-Speaking Children", January 2024.

Gary Yeung, "Speech Normalization and Data Augmentation Techniques Based on Acoustical and Physiological Constraints and Their Applications to Child Speech Recognition", August 2021.

Amber Afshan, "Speaking Style Variability in Speaker Discrimination by Humans and Machines", March 2022.

Jinxi Guo, "Neural network based representation learning and modeling for speech and speaker recognition", June 2019.

Soo Jin Park, "Towards Understanding Voice Discrimination Abilities of Humans and Machines", March 2019.

Kantapon Kaewtip, "Robust Automatic Recognition of Birdsongs and Human Speech: a Template-Based Approach", 2017.

Harish Arsikere, "On the role of subglottal acoustics in height estimation, and speech and speaker recognition", June 2014.

Lee Ngee Tan, "Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations", June 2014.

Gang Chen, "The Voice Source in Speech Production: from Models to Applications," May 2014.

Wei Chu, "Noise Robust Signal Processing for Human Pitch Tracking and Bird Song Classification and Detection," December 2011.

Jonas Borgstrom, "Inference of Missing or Degraded Data for Noise Robust Speech Processing," June 2010.

Yen-Liang Shue, "The Voice Source in Speech Production: Data, Analysis and Models," March 2010.

Shizhen Wang, "Rapid Speaker Normalization and Adaptation with Applications to Automatic Evaluation of Children's Language Learning Skills," March 2010.

Hong You, "Robust Automatic Speech Recognition Algorithms for Dealing with Noise and Accent," August 2009.

Jianxia Xue, "Acoustically-Driven Talking Face Animations Using Dynamic Bayesian Network," December 2008.

Sankaran Panchapagesan, "Frequency Warping by Linear Transformation, and Vocal Tract Inversion for Speaker Normalization in Automatic Speech Recognition," June 2008.

Markus Iseli, "Dependencies of voice source measures on age, sex, vowel context, and prosodic features," June 2007.

Xiaodong Cui, "Environmental and Speaker Robustness in Automatic Speech Recognition with Limited Learning Data," August 2005.

Jintao Jiang, "Relating Optical Speech to Speech Acoustics and Visual Speech Perception," October 2003.

Brian Gabelman, "Analysis and Synthesis of Pathological Vowels," August 2003.

Alexis Bernard, "Source and Channel Coding for Speech Transmission and Remote Speech Recognition", March 2002.

Qifeng Zhu, "Noise Robust Front-End Processing for Automatic Speech Recognition", December 2001.

James Hant, "A Computational Model to Predict Human Perception of Speech in Noise", June 2000.

Fred Chi, "Adaptive Feedback Cancellation for Hearing Aids: Theories, Algorithms, Computations, and Systems", November 1999.

Marcio Siqueira, "Adaptive filtering algorithms in acoustic echo cancellation and feedback reduction", September 1998.

Brian Strope, "Modeling auditory perception for robust speech recognition", August 1998.

Shrikanth Narayanan, "Fricative consonants: an articulatory, acoustic, and systems study", June 1995.
