Children's ASR System Development
End-to-end models achieve lower word error rates (WER) than hybrid models.
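For reference, WER is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the project's scoring tool; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()  # assumption: words are separated by whitespace
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.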
Techniques developed:
(1) Data augmentation
- Using F0-based normalization as data augmentation to increase data variety.
- LPC-based augmentation method.
(2) Self-supervised learning
- Bidirectional autoregressive predictive coding for better speech pretraining.
- DRAFT framework to reduce the mismatch between pretraining and finetuning.
(3) Non-autoregressive models
- CTC alignment-based non-autoregressive transformer for faster inference, suited to on-device deployment.
- CASS-NAT series.
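As background for the CTC alignment-based approach in (3): a CTC alignment labels every acoustic frame with either a token or a blank, and collapsing it gives the number of output tokens and the frame span of each one. A non-autoregressive decoder can use that information to emit all tokens in parallel. The sketch below shows only this generic collapsing step; it is not the CASS-NAT implementation, and the blank index `BLANK = 0` and the function name `collapse_ctc_alignment` are assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Tuple

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def collapse_ctc_alignment(
    frames: List[int], blank: int = BLANK
) -> List[Tuple[int, int, int]]:
    """Return (token, start_frame, end_frame) for each token in a frame-level alignment.

    Consecutive identical labels are merged into one segment, and blank
    segments are dropped, following the usual CTC collapsing rule.
    """
    segments = []
    position = 0
    for label, run in groupby(frames):
        length = len(list(run))
        if label != blank:
            segments.append((label, position, position + length - 1))
        position += length
    return segments


if __name__ == "__main__":
    alignment = [0, 3, 3, 0, 0, 5, 5, 5, 3, 0]  # toy frame-level labels
    for token, start, end in collapse_ctc_alignment(alignment):
        print(token, start, end)
```

The number of returned segments is the predicted output length, and each frame span marks the acoustic region that a token-level representation could be pooled from.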
Keywords: Children's ASR, Data Augmentation, Self-supervised Learning, Non-autoregressive Models
Prof. Abeer Alwan, SPAPL, UCLA
Alexander Johnson, SPAPL, UCLA
Ruchao Fan, SPAPL, UCLA
Abeer Alwan
(alwan@ee.ucla.edu)