Constructing an Invertible Constant-Q Transform with Nonstationary Gabor Frames



Introduction

This webpage provides resources and experimental results for the accompanying research paper

M. Dörfler, N. Holighaus, T. Grill and G. Velasco, “Constructing an Invertible Constant-Q Transform with Nonstationary Gabor Frames,” to appear in Proceedings of the 14th International Conference on Digital Audio Effects (DAFx 11), Paris, France, 2011

Abstract. An efficient and perfectly invertible signal transform featuring a constant-Q frequency resolution is presented. The proposed approach is based on the idea of the recently introduced nonstationary Gabor frames. Exploiting the properties of the operator corresponding to a family of analysis atoms, this approach overcomes the problems of the classical implementations of constant-Q transforms, in particular, computational intensity and lack of invertibility. Perfect reconstruction is guaranteed by using an easy to calculate dual system in the synthesis step and computation time is kept low by applying FFT-based processing. The proposed method is applied to real-life signals and evaluated in comparison to a related approach, recently introduced specifically for audio signals.

More material on nonstationary Gabor transforms:

Theory and Implementation of nonstationary Gabor Frames

Download the NSGT toolbox here.

Nonstationary Gabor Toolbox V0.01

Experiments on constant-Q nonstationary Gabor transform (CQ-NSGT):

        1. Examples of CQ-NSGT spectrograms

        2. Masking experiments

        3. Transposition experiments








Examples of CQ-NSGT spectrograms

The frequency-side version of the nonstationary Gabor transform allows one to construct a perfectly invertible constant-Q transform. Below are two CQ-NSGT spectrograms of the Glockenspiel signal. The minimum frequency is 200 Hz for both representations, and the resolution is 12 and 48 bins per octave, respectively.
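
To make the two parameters concrete, the following minimal Python sketch computes the geometrically spaced center frequencies of a constant-Q grid and the resulting Q-factor. It is an illustration only and not part of the NSGT toolbox: the function names are hypothetical, the upper frequency limit of 20 kHz is an assumed value that the page does not state, and Q is taken here as the ratio of center frequency to the spacing between adjacent bins.

```python
import numpy as np

def cq_center_frequencies(fmin, fmax, bins_per_octave):
    """Center frequencies f_k = fmin * 2**(k / B) for all k with f_k <= fmax."""
    n_bins = int(np.floor(bins_per_octave * np.log2(fmax / fmin))) + 1
    k = np.arange(n_bins)
    return fmin * 2.0 ** (k / bins_per_octave)

def q_factor(bins_per_octave):
    """Center frequency divided by the distance to the next bin (same for every bin)."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)

for B in (12, 48):
    # fmax = 20 kHz is an assumption made for this illustration
    freqs = cq_center_frequencies(200.0, 20000.0, B)
    print(f"B = {B}: {len(freqs)} bins, Q = {q_factor(B):.2f}, "
          f"one octave above fmin: {freqs[B]:.1f} Hz")
```

Going from 12 to 48 bins per octave quadruples the number of bins per octave and raises Q accordingly, which is why the second spectrogram resolves partials more finely in frequency at the cost of time resolution.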




For comparison, below are two further spectrograms of the Glockenspiel signal: a standard Gabor spectrogram obtained with a Hann window of length 1024 samples and a hop size of 512 samples, and a constant-Q transform spectrogram (from the original formulation of J. Brown) with the minimum frequency set at 200 Hz and 48 bins per octave.
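
As a rough illustration of the standard Gabor spectrogram with these parameters, the sketch below computes a magnitude short-time Fourier transform with NumPy. It is not the code used to produce the figures: the function name, the sampling rate of 44.1 kHz, and the 1 kHz test tone are assumptions made for the example.

```python
import numpy as np

def gabor_spectrogram(x, win_len=1024, hop=512):
    """Magnitude STFT with a Hann window; returns an array of shape (bins, frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

fs = 44100                            # assumed sampling rate
t = np.arange(fs) / fs                # one second of samples
x = np.sin(2 * np.pi * 1000.0 * t)    # 1 kHz test tone (not the Glockenspiel signal)

S = gabor_spectrogram(x)
print(S.shape)                        # (frequency bins, time frames)
print(np.argmax(S[:, 10]) * fs / 1024)  # frequency of the strongest bin in frame 10
```

Every bin of this representation has the same bandwidth (fs / 1024 per bin under the assumed sampling rate), in contrast to the constant-Q grid, where bandwidth grows in proportion to the center frequency.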




The following are additional signals, each with its corresponding CQ-NSGT spectrogram:

CQ-NSGT spectrogram  | Sound file | Min. frequency | Bins per octave
Ligeti spectrogram   | Ligeti     | 130 Hz         | 48
Hancock spectrogram  | Hancock    | 130 Hz         | 48
Tufrial spectrogram  | Tufrial    | 130 Hz         | 48
Fugue spectrogram    | Fugue      | 130 Hz         | 48
Kafziel spectrogram  | Kafziel    | 130 Hz         | 48
Sretil spectrogram   | Sretil     | 130 Hz         | 48
Chirp spectrogram    | Chirp      | 130 Hz         | 48



Masking experiment

The perfect reconstruction property of CQ-NSGT can be used to cut out components from a signal by directly modifying the time-frequency coefficients. The example shows a mask for extracting a portion from the Glockenspiel signal, along with the masked spectrogram, as well as the spectrograms of the synthesized, processed signal and remainder.
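
The principle behind the masking experiment can be sketched in a few lines. The example below deliberately uses a plain FFT as a stand-in for the CQ-NSGT, since any linear transform with perfect reconstruction behaves the same way in this respect; the noise test signal and the masked bin range are hypothetical. It shows that the processed signal and the remainder add up to the original.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)          # hypothetical test signal

# Stand-in analysis step (a plain FFT, NOT the CQ-NSGT)
coeffs = np.fft.rfft(x)

# Binary mask selecting a hypothetical region of coefficients
mask = np.zeros(coeffs.shape)
mask[100:200] = 1.0

# Synthesize the extracted component and what is left over
processed = np.fft.irfft(coeffs * mask, n=len(x))
remainder = np.fft.irfft(coeffs * (1.0 - mask), n=len(x))

# Linearity plus perfect reconstruction: the two parts sum to the original
print(np.allclose(processed + remainder, x))
```

With a transform that is not perfectly invertible, the sum of the two parts would differ from the original by the reconstruction error, so the remainder could not be interpreted as "the signal minus the extracted component".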





Results of the masking experiment: Original signal, Processed signal, Remainder



Transposition experiment

Because the CQ-NSGT is invertible and its frequency bins are geometrically spaced, a complete harmonic structure can be transposed simply by translating the coefficients along the frequency bins. We illustrate this by transposing a chord played on a piano. Note that the onset has been damped, since transient events have to be handled separately to avoid phasing effects.
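
The bin translation can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the sound files: the helper names are hypothetical, the coefficients are assumed to be stored in a rectangular array with one row per frequency bin, and 48 bins per octave is assumed because the page does not state the resolution used in this experiment. With bin k centered at fmin * 2^(k / B), a shift by s bins multiplies every frequency by 2^(s / B), so n semitones correspond to s = n * B / 12 bins.

```python
import numpy as np

def semitones_to_bins(semitones, bins_per_octave):
    """Number of bins corresponding to a transposition by `semitones`."""
    shift, rest = divmod(semitones * bins_per_octave, 12)
    if rest != 0:
        raise ValueError("transposition is not a whole number of bins")
    return shift

def shift_bins(coeffs, shift):
    """Translate rows (frequency bins) by `shift`; vacated bins are set to zero."""
    out = np.zeros_like(coeffs)
    if shift > 0:
        out[shift:] = coeffs[:-shift]
    elif shift < 0:
        out[:shift] = coeffs[-shift:]
    else:
        out[:] = coeffs
    return out

B = 48  # assumed resolution for this illustration
for n in (2, 5):
    print(n, "semitones ->", semitones_to_bins(n, B), "bins")

# Toy coefficient array: 5 bins, 3 time positions, row k filled with the value k
toy = np.arange(5, dtype=float)[:, None] * np.ones((1, 3))
print(shift_bins(toy, 2)[:, 0])
```

The sketch only moves coefficients; it does not address the phase handling around transients mentioned above.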





Sound files: Original signal, 2 semitone transpose, 5 semitone transpose