sig.autocor
Analysis of periodicity in signals by looking at local correlation between samples.
If we take a signal x, for instance this trumpet sound:

the autocorrelation function is computed as follows:
For a given lag j, the autocorrelation Rxx(j) is computed by multiplying the signal point by point with a version of itself shifted by j samples, and summing the products. We obtain this curve:
When the lag j corresponds to a period of the signal, the signal is shifted one period ahead and is therefore exactly superposed on the original signal. The summation then gives a very high value, as the two signals are highly correlated.
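The definition above can be sketched in a few lines of plain Python. This is only an illustration of the principle, not the toolbox code: the function name `autocorrelation` and the synthetic sine (standing in for the trumpet sound) are assumptions made for the example.

```python
import math

def autocorrelation(x, max_lag):
    """Return R[j] = sum_n x[n] * x[n + j] for j = 0..max_lag (no normalization)."""
    n = len(x)
    return [sum(x[i] * x[i + j] for i in range(n - j)) for j in range(max_lag + 1)]

# Synthetic stand-in for a periodic sound: a sine with a period of 20 samples.
period = 20
x = [math.sin(2 * math.pi * i / period) for i in range(200)]

r = autocorrelation(x, max_lag=30)

# Skip the shortest lags so that the trivial maximum around lag 0 is not picked.
best_lag = max(range(5, 31), key=lambda j: r[j])
print(best_lag)  # lag (in samples) with the highest correlation
```

The highest value away from lag 0 is found at the lag equal to the period of the sine, where the shifted copy lines up with the original.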
sig.autocor accepts as input data type either:
- sig.Signal objects, where the waveform can be segmented (using sig.segment), decomposed into channels (using sig.filterbank), and/or decomposed into frames (using sig.frame);
- file name(s) or the 'Folder' keyword;
- data in the onset detection curve category (cf. aud.events):
  - sig.Envelope objects, frame-decomposed or not,
  - fluxes (cf. sig.flux), frame-decomposed or not;
- sig.Spectrum objects;
- sig.Autocor objects, for further processing.
sig.autocor(…,'Frame',…) first performs a frame decomposition, with by default a frame length of 50 ms and half overlapping. For the specification of other frame configurations using additional parameters, cf. this page.
- sig.autocor(…,'Min',mi) indicates the lowest delay taken into consideration. Default value: 0 s. The unit can be specified:
  - sig.autocor(…,'Min',mi,'s') (default unit)
  - sig.autocor(…,'Min',mi,'Hz')
- sig.autocor(…,'Max',ma) indicates the highest delay taken into consideration. The unit can be specified as for 'Min'. Default value:
  - if the input is an audio waveform, the highest delay is by default 0.05 s (corresponding to a minimum frequency of 20 Hz);
  - if the input has a resolution lower than 1000 Hz, there is no highest delay by default;
  - if the input is an envelope, the highest delay is by default 2 s.
- sig.autocor(…,n) specifies a normalization option for the cross-correlation ('biased', 'unbiased', 'coeff', 'none', 'coeffXchannels'). The first four correspond exactly to the normalization options of the Matlab xcorr function, as sig.autocor actually calls xcorr for the computation. The default value is 'coeff', corresponding to a normalization such that the autocorrelation at zero lag is identically 1. The additional option 'coeffXchannels' behaves like 'coeff' except that if the data is multi-channel, the normalization is such that the sum over channels at zero lag becomes identically 1. Note however that the 'coeff' routine is not used when the compression ('Compres') factor k is not equal to 2 (see below).
- sig.autocor(…,'Freq') represents the autocorrelation function in the frequency domain: the periods are expressed in Hz instead of seconds (see the last curve in the figure below for an illustration).
- sig.autocor(…,'NormalWindow') divides the autocorrelation by the autocorrelation of the window. Boersma (1993) shows that the plain autocorrelation function gives higher coefficients for small lags, since the summation is done on more samples. Dividing by the autocorrelation of the window normalizes all coefficients so that this bias is compensated. At first sight, the window could simply be rectangular, but Boersma (1993) shows that it is better to use a 'hanning' window in particular, in order to obtain a better harmonic-to-noise ratio.
  - sig.autocor(…,'NormalWindow',w) specifies the window to be used, which can be any window available in the Signal Processing Toolbox. In addition, w = 'rectangular' will not perform any particular windowing (corresponding to a rectangular, “invisible” window), but the normalization of the autocorrelation by the autocorrelation of the invisible window will be performed nonetheless. The default value is w = 'hanning', except when the input is an envelope or if the resolution is below 1000 Hz, in which case there is no windowing.
  - sig.autocor(…,'NormalWindow','off') toggles off this normalization (which is 'on' by default).
- sig.autocor(…,'Halfwave') performs a half-wave rectification on the result, in order to show only the positive autocorrelation coefficients.
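As a rough illustration of the normalization, window-normalization and half-wave options described above, here is a minimal Python sketch. It is not the toolbox implementation (which relies on Matlab's xcorr): all function names are hypothetical, the scalings follow the usual xcorr conventions ('biased' divides by N, 'unbiased' by N minus the lag), 'coeffXchannels' is not covered, and the window normalization follows the idea attributed to Boersma (1993) of dividing by the window's own normalized autocorrelation.

```python
import math

def raw_autocor(x):
    """Raw autocorrelation for lags 0..len(x)-1 (the 'none' option)."""
    n = len(x)
    return [sum(x[i] * x[i + j] for i in range(n - j)) for j in range(n)]

def normalize(r, option):
    """Scale raw values r with 'none', 'biased', 'unbiased' or 'coeff'."""
    n = len(r)
    if option == 'none':
        return list(r)
    if option == 'biased':
        return [v / n for v in r]
    if option == 'unbiased':
        return [v / (n - j) for j, v in enumerate(r)]
    if option == 'coeff':
        return [v / r[0] for v in r]      # value at zero lag becomes 1
    raise ValueError(option)

def hann(n):
    """Symmetric Hann window of n points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def window_normalized_autocor(x, max_lag):
    """Autocorrelation of the windowed signal divided by that of the window.

    max_lag should stay well below len(x): the window autocorrelation
    tends to zero at the longest lags, which makes the ratio unreliable.
    """
    w = hann(len(x))
    windowed = [a * b for a, b in zip(x, w)]
    rx = normalize(raw_autocor(windowed), 'coeff')
    rw = normalize(raw_autocor(w), 'coeff')
    return [rx[j] / rw[j] for j in range(max_lag + 1)]

def halfwave(r):
    """Keep only the positive autocorrelation coefficients."""
    return [max(v, 0.0) for v in r]

# Sine with a period of 20 samples, observed over 200 samples.
x = [math.sin(2 * math.pi * i / 20) for i in range(200)]
plain = normalize(raw_autocor(x), 'coeff')[20]        # attenuated at lag 20
corrected = window_normalized_autocor(x, 40)[20]      # close to 1 at lag 20
rectified = halfwave(window_normalized_autocor(x, 40))
```

With the plain 'coeff' normalization, the coefficient at one period is lower than 1 because fewer samples overlap at that lag; after division by the window autocorrelation it comes back close to 1.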
sig.autocor(…,'Compres',k), or equivalently sig.autocor(…,'Generalized',k), computes the autocorrelation in the frequency domain and includes a magnitude compression of the spectral representation. Indeed, an autocorrelation can be expressed using the Discrete Fourier Transform as:
y = IDFT(|DFT(x)|^2),
which can be generalized as:
y = IDFT(|DFT(x)|^k)
Compression of the autocorrelation (i.e., setting a value of k lower than 2) is recommended in (Tolonen & Karjalainen, 2000) because it decreases the width of the peaks in the autocorrelation curve, at the risk however of increasing the sensitivity to noise. According to this study, a good compromise seems to be achieved using the value k = 0.67. By default, no compression is performed (hence k = 2), whereas if the 'Compres' keyword is used without any value, k = 0.67 is set by default.
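The formula y = IDFT(|DFT(x)|^k) can be tried out with a naive DFT in plain Python. This is a sketch under stated assumptions, not the toolbox code: the helper names are hypothetical, and the zero-padding to twice the signal length (used so that k = 2 reproduces the ordinary, non-circular autocorrelation) is a choice made for this example.

```python
import cmath

def dft(x):
    """Naive O(n^2) Discrete Fourier Transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(spectrum):
    """Naive inverse Discrete Fourier Transform."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]

def generalized_autocor(x, k=2.0):
    """Return y = IDFT(|DFT(x)|^k) for lags 0..len(x)-1.

    The signal is zero-padded to twice its length so that the circular
    correlation implied by the DFT does not wrap around.
    """
    padded = list(x) + [0.0] * len(x)
    magnitudes = [abs(c) ** k for c in dft(padded)]
    return [v.real for v in idft(magnitudes)[:len(x)]]

x = [1.0, 2.0, 3.0]
standard = generalized_autocor(x, k=2.0)     # ordinary autocorrelation
compressed = generalized_autocor(x, k=0.67)  # compressed variant
```

With k = 2 the result matches the time-domain sum of products; lower values of k flatten the magnitude spectrum before the inverse transform.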
In the autocorrelation function, for each periodicity in the signal, peaks will be shown not only at the lag corresponding to that periodicity, but also at all the multiples of that lag. In order to avoid such redundancy of information, techniques have been proposed that automatically remove these harmonics. In the frequency domain, these multiples correspond to sub-harmonics of the peaks.
sig.autocor(…,'Enhanced',a): The original autocorrelation function is half-wave rectified, time-scaled by factor a (which can also be a list of factors), and subtracted from the original clipped function (Tolonen & Karjalainen, 2000). If the 'Enhanced' option is not followed by any value, the default value is a = 2:10, i.e., from 2 to 10.
If the curve does not start from zero at low lags but begins instead with strictly positive values, the initial discontinuity would be propagated throughout all the scaled versions of the curve. In order to avoid this phenomenon, the curve is modified in two successive ways:
- if the curve starts with a descending slope, the whole part before the first local minimum is removed from the curve,
- if the curve starts with an ascending slope, the curve is prolonged to the left following the same slope, which is increased by a factor of 1.1 at each successive bin, until the curve reaches the x-axis.
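The core of the enhancement (rectify, time-scale, subtract, clip) can be sketched as follows. This is an illustrative reading of the description above, not the toolbox code: the names `stretch` and `enhance` are hypothetical, the time-scaling uses simple linear interpolation, each factor is applied to the result of the previous step (an assumption), and the special handling of curves that do not start at zero is left out, so the input is assumed to start at zero.

```python
def stretch(curve, factor):
    """Time-scale a curve by an integer factor using linear interpolation."""
    out = []
    last = len(curve) - 1
    for i in range(len(curve)):
        pos = i / factor                 # position in the original curve
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1 - frac) * curve[lo] + frac * curve[hi])
    return out

def enhance(curve, factors=range(2, 11)):
    """Remove peaks located at multiples of shorter lags."""
    result = [max(v, 0.0) for v in curve]            # half-wave rectification
    for a in factors:
        scaled = stretch(result, a)                  # peaks moved to a times their lag
        result = [max(v - s, 0.0) for v, s in zip(result, scaled)]
    return result

# Toy autocorrelation curve: a peak at lag 10 and its multiples at 20 and 30.
curve = [0.0] * 64
for p in (10, 20, 30):
    curve[p - 1], curve[p], curve[p + 1] = 0.5, 1.0, 0.5

enhanced = enhance(curve)
```

In this toy example the peak at lag 10 is preserved, while the redundant peaks at lags 20 and 30 are cancelled by the versions scaled by factors 2 and 3.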
See the figure below for an example of enhanced autocorrelation when computing the pitch content of a piano Amin3 chord, showing the successive steps of the default enhancement, as used by default in sig.pitch (cf. description of SigPitch).
- Fig 1: Waveform autocorrelation of a piano chord Amaj3 (blue), and scaled autocorrelation of factor 2 (red);
- Fig 2: Subtraction of the autocorrelation by the previous scaled autocorrelation (blue), scaled autocorrelation of factor 3 (red);
- Fig 3: Resulting subtraction (blue), scaled autocorrelation of factor 4 (red);
- Fig 4: Idem for factor 5;
- Fig 5: Idem for factor 6;
- Fig 6: Idem for factor 7;
- Fig 7: Resulting autocorrelation curve in the frequency domain and peak picking
A music-theory representation of the autocorrelation function is available in mus.autocor.
Accessible using the get method:
- To obtain the lag or frequency values associated with each bin, we recommend using 'Xdata', together with 'FreqDomain' to check whether those values are lags ('FreqDomain' = 0) or frequencies ('FreqDomain' = 1);
- 'OfSpectrum': whether the input is a temporal signal (0) or a spectrum (1);
- 'Window': contains the complete envelope signal used for the windowing.