Design and implementation of the ERBlet transform
given at flame12 (08.05.12 11:30)
Time-frequency representations are widely used in audio applications involving sound analysis-synthesis. For such applications, a time-frequency transform that accounts for some aspects of human auditory perception is of high interest. To that end, we exploit the theory of non-stationary Gabor frames to obtain a perception-based, linear, and perfectly invertible time-frequency transform. Our goal is to design a non-stationary Gabor transform (NSGT) whose time-frequency resolution best matches the time-frequency analysis properties of the ear. To a first approximation, the peripheral auditory system can be modeled as a bank of bandpass filters whose bandwidth increases with center frequency. These so-called “auditory filters” are characterized by their equivalent rectangular bandwidths (ERB), which follow the ERB scale. Here, we use an NSGT whose resolution evolves across frequency to mimic the ERB scale, and we name the resulting paradigm the “ERBlet transform”. Preliminary results will be presented. The subsequent discussion will focus on finding the “best” transform settings, that is, those that achieve perfect reconstruction while minimizing redundancy.
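
To make the ERB scale concrete, the following minimal sketch computes ERB-spaced center frequencies and the bandwidth associated with each. The abstract does not state which ERB formula is used. The sketch assumes the common Glasberg and Moore (1990) approximation, ERB(f) = 24.7 (4.37 f / 1000 + 1) with f in Hz. The function names and the `bins_per_erb` parameter are hypothetical, chosen for illustration only, and the code covers only the frequency grid, not the transform itself.

```python
import math

# Assumption: Glasberg & Moore (1990) ERB approximation; the abstract
# does not specify which formula the ERBlet transform relies on.

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth (Hz) of the auditory filter at f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def hz_to_erb_number(f_hz):
    """Map a frequency in Hz to its position on the ERB-number scale."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def erb_spaced_centers(f_min, f_max, bins_per_erb=1):
    """Center frequencies spaced uniformly on the ERB-number scale.

    bins_per_erb is a hypothetical tuning knob: more bins per ERB means
    a denser (more redundant) frequency grid.
    """
    e_min = hz_to_erb_number(f_min)
    e_max = hz_to_erb_number(f_max)
    n = int(math.floor((e_max - e_min) * bins_per_erb)) + 1
    return [erb_number_to_hz(e_min + k / bins_per_erb) for k in range(n)]

if __name__ == "__main__":
    # Print each center frequency with its ERB; bandwidth grows with frequency.
    for fc in erb_spaced_centers(50.0, 8000.0, bins_per_erb=1):
        print(f"fc = {fc:8.1f} Hz, ERB = {erb_bandwidth(fc):7.1f} Hz")
```

In a grid like this, each frequency channel would get a window whose bandwidth is proportional to `erb_bandwidth(fc)`, so spectral resolution is fine at low frequencies and coarse at high frequencies. Raising `bins_per_erb` increases channel overlap and therefore redundancy, which is the trade-off the discussion is meant to address.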