aud.mfcc
Computes Mel-Frequency Cepstral Coefficients (MFCC), which describe the spectral shape of the signal. First, a Fourier transform is computed using aud.spectrum. The frequency bands are positioned logarithmically (on the Mel scale), which approximates the human auditory system's response more closely than linearly spaced frequency bands. Then, periodicity in the spectrum distribution is evaluated using a discrete cosine transform (DCT), a Fourier-related transform similar to the discrete Fourier transform (DFT) but using only real numbers. The DCT has a strong "energy compaction" property: most of the signal information tends to be concentrated in a few low-frequency components. That is why only the first 13 components are returned by default. By convention, the coefficient of rank zero simply indicates the average energy of the signal.
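The last two steps described above (Mel-spaced bands, then a DCT) can be sketched in a few lines of Python. This is only an illustration, not MiningSuite's implementation: the Mel formula is one common variant, the unnormalised DCT-II is assumed, and the band log-energies are made-up values.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_centers(f_min, f_max, n_bands):
    # Centre frequencies (Hz) of n_bands bands equally spaced on the Mel scale.
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_bands)]

def dct_ii(x, ranks):
    # Unnormalised DCT-II of x, evaluated only at the requested ranks.
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * r * (k + 0.5) / n) for k in range(n))
            for r in ranks]

# Hypothetical log-energies of 8 Mel bands (illustrative values only).
log_energies = [4.0, 3.8, 3.5, 3.1, 2.6, 2.0, 1.3, 0.5]

centers = mel_band_centers(0.0, 8000.0, 40)
coeffs = dct_ii(log_energies, range(0, 8))

# With this unnormalised DCT, rank 0 is the sum of the band log-energies,
# i.e. proportional to their average.
rank_zero = coeffs[0]
```

In this sketch the band centres get further apart in Hz as frequency increases, and `rank_zero` carries only the overall level, which is consistent with rank 0 being treated separately from the other coefficients.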
aud.mfcc accepts either:
- sig.Spectrum objects,
- sig.signal objects (same as for sig.spectrum), or
- a file name or the 'Folder' keyword.
aud.mfcc can return several outputs:
- the MFCC coefficients themselves, and
- the spectral representation (output of aud.spectrum), in mel-band and log-scale.
aud.mfcc accepts the following options:
- aud.mfcc(..., 'Frame', ...) first performs a frame decomposition, by default with a frame length of 50 ms and half overlapping. For the specification of other frame configurations using additional parameters, cf. this page.
- aud.mfcc(..., 'Bands', b) indicates the number of bands used in the mel-band spectrum decomposition. By default, b = 40.
- aud.mfcc(..., 'Rank', N) computes the coefficients of rank(s) N. The default value is N = 1:13. Beware that the coefficient related to the average energy is, by convention, of rank 0. This zero value can be included in the array N as well.
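To make the default frame configuration concrete (50 ms frames, half overlapping), the following Python sketch lists the sample ranges of the successive frames. It is not MiningSuite code: the function name `frame_bounds`, the 16 kHz sampling rate and the choice to drop a trailing partial frame are all assumptions made for illustration.

```python
def frame_bounds(n_samples, sample_rate, frame_s=0.05, overlap=0.5):
    # Return (start, end) sample indices of each full frame.
    # Defaults mirror the 50 ms / half-overlap values quoted above.
    # Assumption: a trailing partial frame is simply dropped.
    length = int(round(frame_s * sample_rate))
    hop = max(1, int(round(length * (1.0 - overlap))))
    return [(s, s + length) for s in range(0, n_samples - length + 1, hop)]

# One second of audio at a hypothetical sampling rate of 16 kHz.
frames = frame_bounds(n_samples=16000, sample_rate=16000)
```

Here each frame spans 800 samples and starts 400 samples after the previous one, so one second of signal yields 39 full frames, each of which would receive its own set of coefficients.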
If the output is frame-decomposed, showing the temporal evolution of the MFCC along the successive frames, the temporal differentiation can be computed:
- aud.mfcc(..., 'Delta', d) performs temporal differentiations of order d of the coefficients, also called delta-MFCC (for d = 1) or delta-delta-MFCC (for d = 2). By default, d = 1.
- aud.mfcc(..., 'Radius', r) specifies, for each frame, the number of preceding and following neighbouring frames taken into consideration for the least-squares approximation used for the differentiation. For a given radius r, the Delta operation for each frame i is computed by summing the MFCC coefficients at frames i+j (with j from -r to +r), each coefficient being multiplied by its weight j. Usually the radius is equal to 1 or 2. Default value: r = 2.
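The weighted sum described for the Delta operation can be written out as a short Python sketch. It follows the description above literally and is not taken from MiningSuite: the function name `delta` is hypothetical, frames without a full neighbourhood are skipped (an assumption about edge handling), and no normalisation is applied because none is stated in the text.

```python
def delta(mfcc_frames, radius=2):
    # mfcc_frames: list of frames, each a list of coefficients.
    # For frame i, returns sum over j in [-radius, +radius] of j * c[i + j],
    # computed separately for every coefficient.
    # Assumption: frames lacking a full neighbourhood are left out.
    n = len(mfcc_frames)
    out = []
    for i in range(radius, n - radius):
        n_coeffs = len(mfcc_frames[i])
        out.append([
            sum(j * mfcc_frames[i + j][k] for j in range(-radius, radius + 1))
            for k in range(n_coeffs)
        ])
    return out

# Hypothetical input: two coefficients rising linearly by 1 and 2 per frame.
mfcc_frames = [[float(i), 2.0 * i] for i in range(6)]
d1 = delta(mfcc_frames, radius=2)
```

For a coefficient that rises linearly by a per frame, this unnormalised sum equals a multiplied by the sum of j squared (10 for r = 2). Dividing by that constant would give the least-squares slope itself. Applying the function to its own output gives the order-2 (delta-delta) case.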
Accessible using the get method:
- 'Rank': the ranks associated to each bin (same as 'Xdata'),
- 'Delta': the number of times the delta operation has been performed.