The auditory system is also constantly adapting to a changing acoustic environment. Auditory perception is therefore dynamic: how we hear sounds now is largely a function of the sounds we just heard. The goal of this work is to derive a computational model of dynamic auditory perception that predicts aspects of this dynamic sensitivity.
The structure of the model is similar to that of a multi-channel compression hearing aid: an additive adaptation stage follows each output of a critical-band-like filter bank. The filter bank models auditory frequency selectivity by separating sounds into the appropriate bands. Following each filter, an adaptation stage adjusts as a function of the changing signal level in that band, providing an increasing offset for quiet sounds and a decreasing offset for louder sounds.
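The per-channel adaptation described above can be sketched as a leaky additive offset that tracks the gap between the band's current level and a quiet reference. This is only an illustration of the idea, not the model itself; the reference level and the two rate constants below are assumed values (the actual parameters come from the masking experiments described next).

```python
import numpy as np

def adaptation_stage(band_env_db, quiet_ref_db=40.0, attack=0.5, release=0.05):
    """Additive adaptation for one filter-bank channel (illustrative sketch).

    The offset grows toward a positive value for quiet sounds and toward a
    negative value for loud sounds.  Downward adaptation (attack) is faster
    than upward adaptation (release); all constants here are assumptions.
    """
    offset = 0.0
    out = np.empty_like(band_env_db)
    for i, level in enumerate(band_env_db):
        target = quiet_ref_db - level                   # positive when quiet, negative when loud
        rate = attack if target < offset else release   # fast attack, slow recovery
        offset += rate * (target - offset)              # leaky first-order update
        out[i] = level + offset
    return out
```

One adaptation stage of this form would run on the envelope of each critical band independently, so loud energy in one band does not suppress quiet energy in another.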
We use perceptual pure-tone forward-masking data to parameterize the model. Experiments that vary masker level and probe delay yield the recovery (upward adaptation) parameters, while those that vary masker duration yield the attack (downward adaptation) parameters.
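To make the fitting procedure concrete, forward-masking data are often summarized as a threshold shift that decays with probe delay; fitting a decay constant to thresholds measured at several delays then pins down the recovery rate. The exponential form, the time constant, and the growth factor below are illustrative assumptions, not the paper's fitted values:

```python
import numpy as np

def forward_masked_threshold(masker_db, delay_ms, tau_ms=75.0,
                             quiet_db=10.0, growth=0.5):
    """Illustrative forward-masking curve (all parameter values assumed).

    The probe threshold starts elevated just after the masker and decays
    exponentially back to the threshold in quiet as the delay grows.
    """
    shift = growth * (masker_db - quiet_db) * np.exp(-delay_ms / tau_ms)
    return quiet_db + max(shift, 0.0)
```

In this framing, sweeping `delay_ms` at fixed masker level constrains the recovery (release) parameter, while an analogous family of curves over masker duration would constrain the attack parameter.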
After the adaptation stages, we apply a peak isolation algorithm based on raised-sine cepstral liftering to isolate and identify dynamic spectral peaks.
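A minimal sketch of raised-sine cepstral liftering for peak isolation: the log spectrum is taken to the cepstral domain, the zeroth coefficient (overall level) is discarded, and the remaining low-quefrency coefficients are weighted by a raised-sine window before transforming back. The lifter length `L=22` is a conventional assumption, not necessarily the value used in this model:

```python
import numpy as np

def peak_isolate(log_spec, L=22):
    """Peak isolation via raised-sine cepstral liftering (sketch; L assumed).

    Zeroing c0 removes the overall level, and the raised-sine weights
    1 + (L/2) * sin(pi * k / L) emphasize the quefrencies that carry
    spectral peak shape, suppressing slow tilt and fine structure.
    """
    M = 2 * (len(log_spec) - 1)
    c = np.fft.irfft(log_spec, n=M)          # real cepstrum (symmetric sequence)
    w = np.zeros(M)
    k = np.arange(1, L)
    w[k] = 1.0 + (L / 2.0) * np.sin(np.pi * k / L)
    w[M - k] = w[k]                          # mirror so the result stays real
    return np.fft.rfft(c * w).real           # liftered log spectrum
```

Applied frame by frame to the adapted filter-bank output, this leaves a representation dominated by the local spectral peaks rather than by overall level.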
We present Bark-frequency-scale spectrograms of the words "nine six one three" before and after the adaptive processing of the model, along with an overview of the performance improvements obtained using this model for robust speech recognition.