This process allows one to change pitch without changing the length of the
soundfile, or to change length without changing pitch. It does this by
extracting amplitude and phase information for between 8 and 4096 frequency
bands with a bank of filters. If time stretching is desired, these phase and
amplitude envelopes are lengthened (or shortened, for time compression) and
then given to a bank of oscillators whose frequencies correspond to those of
the filters. For pitch shifting, the envelopes are left untouched and given to
a bank of oscillators whose frequencies are scaled by the pitch ratio.
To use the phase vocoder, set the number of Bands to the number of filter-oscillator pairs one would like to use. A large number of bands will give better frequency resolution; a small number of bands will give better time resolution. The Window menu allows one to choose different pre-FFT windows for different filtering characteristics. Only the Hamming, von Hann and Kaiser windows will give good results (the others are there only because I wanted to use a single menu throughout the program for all window selection). The Overlap setting adjusts the size of the filter window (relative to the number of filter bands) for analysis and synthesis, and thus the sharpness of the filter. A large setting (4x) will give the sharpest filter. A sharper filter differentiates better between frequencies that fall between bands, but responds more slowly to amplitude changes. Click the Time Scale button for time scaling, or Pitch Scale for pitch scaling. Type the scale factor in the Scale box. Click on the word Scale (a popup menu) to specify time scaling by the length desired, or pitch scaling by equal-tempered semitones. If one wants the time expansion factor or the pitch transposition factor to change during processing, click the Scaling Function box and then the Draw Function... button. This will bring up the Draw Function Window, which is described later in this document.
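The two alternative ways of entering the Scale value (by desired length, or by equal-tempered semitones) both reduce to a plain scale factor. The sketch below shows the usual arithmetic. The function names are hypothetical, and it assumes that a time scale factor greater than 1 means a longer result; it is not taken from the program itself.

```python
def semitones_to_pitch_ratio(semitones: float) -> float:
    """Equal-tempered transposition: each semitone is a factor of 2**(1/12)."""
    return 2.0 ** (semitones / 12.0)


def desired_length_to_time_scale(original_seconds: float,
                                 desired_seconds: float) -> float:
    """Assumed convention: factor > 1 stretches, factor < 1 compresses."""
    if original_seconds <= 0:
        raise ValueError("original length must be positive")
    return desired_seconds / original_seconds


if __name__ == "__main__":
    # Transposing up a fifth (7 semitones) and stretching 10 s to 15 s.
    print(semitones_to_pitch_ratio(7))
    print(desired_length_to_time_scale(10.0, 15.0))
```

An octave up (12 semitones) gives a ratio of exactly 2, and an octave down (-12 semitones) gives 0.5.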
Resynthesis Gating performs a simple spectral gate which lets only some of the spectral data through. If a band is below the Minimum Amplitude, it is not let through. Threshold Under Max. cuts off all bands that fall more than the threshold amount below the peak band in a given block of samples. So if the peak band has an amplitude of -7 dB and Threshold Under Max. is set to -40 dB, all bands below -47 dB will be cut off.
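The gating rule described above can be sketched in a few lines. This is only an illustration of the arithmetic, not the program's actual code: the function name and the list-of-dB-levels representation are assumptions, and the Threshold Under Max. setting is taken to be a negative dB offset, as in the example.

```python
def resynthesis_gate(band_levels_db, minimum_amplitude_db, threshold_under_max_db):
    """Return one boolean per band: True if the band is let through.

    band_levels_db         -- levels of every band in one block, in dB
    minimum_amplitude_db   -- absolute floor; bands below it are cut
    threshold_under_max_db -- negative offset from the peak band, e.g. -40
    """
    peak_db = max(band_levels_db)
    cutoff_db = peak_db + threshold_under_max_db  # e.g. -7 + (-40) = -47
    return [
        level >= minimum_amplitude_db and level >= cutoff_db
        for level in band_levels_db
    ]


if __name__ == "__main__":
    levels = [-7.0, -30.0, -46.0, -48.0, -90.0]
    # Peak is -7 dB, so with a -40 dB setting the cutoff sits at -47 dB.
    print(resynthesis_gate(levels, minimum_amplitude_db=-80.0,
                           threshold_under_max_db=-40.0))
```

Because the cutoff follows the peak of each block, a quiet block keeps proportionally quieter bands than a loud one, while the Minimum Amplitude floor stays fixed.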
(Command - D) Spectral Dynamics...
This
process performs standard dynamics processing (gating, ducking, expansion,
compression) on each spectral band individually. It has individual threshold
detection for each band, so that one band could have the dynamics process
active, while another is inactive. The process can be limited to affect
only a specific frequency range. One can select whether to affect sounds
which are above the threshold or sounds which are below the threshold.
The threshold level can be set to one value for all bands, or it can be
set to a different value for each band by reading in and analyzing a soundfile.
The amplitude spectrum of this soundfile supplies the threshold for each
band. This is especially useful if there is a sound that one wants to emphasize
or de-emphasize (for example, hiss or hum).
Most controls are self-explanatory. The first popup menu allows you to select the type of process to use: gating/ducking, expansion or compression. The second popup sets the number of filter bands into which the sound is separated. 512 is a good compromise for the number of bands at a 44100 Hz sample rate, as the bands are about 43 Hz apart and the filters used have a (512*2)/44100 or 0.023 second delay. In other words, this gives pretty good frequency resolution (provided no partials are closer than 43 Hz) and not too much time smearing.
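The trade-off quoted for 512 bands can be worked out for any band count. The sketch below assumes, as the (512*2)/44100 figure suggests, that the analysis window is twice the number of bands; the function names are illustrative only.

```python
def band_spacing_hz(num_bands: int, sample_rate: float) -> float:
    """Spacing between adjacent bands, assuming a window of 2 * num_bands samples."""
    return sample_rate / (2 * num_bands)


def filter_delay_seconds(num_bands: int, sample_rate: float) -> float:
    """Duration of one analysis window of 2 * num_bands samples."""
    return (2 * num_bands) / sample_rate


if __name__ == "__main__":
    # More bands: finer frequency spacing, but a longer delay (more smearing).
    for bands in (128, 512, 2048):
        print(bands,
              round(band_spacing_hz(bands, 44100), 2),
              round(filter_delay_seconds(bands, 44100), 4))
```

Doubling the number of bands halves the spacing and doubles the delay, which is why a middle value is a reasonable starting point.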
The Highest Band and Lowest Band boxes allow one to limit the frequency range affected. 3rd Octave Band Grouping causes the bands to be grouped with third-octave spacing, with a threshold trigger for each group (instead of each band). This grouping may give a more natural-sounding dynamics process. The next box is labelled Gain/Reduction, Expand Ratio or Compress Ratio, depending on the process selected. When gating, it allows you to set the amount of gain or reduction for the bands which are past the threshold. For compression and expansion, it allows you to set the gain ratio. When affecting sounds below the threshold, the compressor and expander hold the highest level steady and affect lower levels (also known as "downward" expansion or compression). When the process is set to affect sounds above the threshold, the compressor and expander hold the threshold level steady and compress/expand up from there.
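For the case where the process affects sounds above the threshold, the behaviour can be pictured with the textbook static gain curve, applied to one band at a time. This is a sketch under assumptions: the function name, the dB representation and the exact formula are not taken from the program, and the below-threshold ("downward") case is deliberately left out.

```python
def level_above_threshold_db(level_db: float, threshold_db: float,
                             ratio: float, mode: str) -> float:
    """Output level of one band when only levels above the threshold are affected.

    The threshold level itself is held steady; the amount by which the band
    exceeds it is divided (compress) or multiplied (expand) by the ratio.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if level_db <= threshold_db:
        return level_db  # at or below the threshold: untouched
    excess_db = level_db - threshold_db
    if mode == "compress":
        return threshold_db + excess_db / ratio
    if mode == "expand":
        return threshold_db + excess_db * ratio
    raise ValueError("mode must be 'compress' or 'expand'")


if __name__ == "__main__":
    # A band 20 dB over a -40 dB threshold, with a ratio of 2.
    print(level_above_threshold_db(-20.0, -40.0, 2.0, "compress"))
    print(level_above_threshold_db(-20.0, -40.0, 2.0, "expand"))
```

Since each band has its own threshold detection, a curve like this would be evaluated separately per band (or per group, with 3rd Octave Band Grouping).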
Use Smoothed Amplitude to avoid abrupt gating.
Instead of comparing the input soundfile directly to the threshold, a windowed
average of the input is compared to the threshold. This setting allows
one to reduce the "martian voices" effect (a common problem in spectral
dynamics processing) and is particularly nice for hiss and hum removal.
Attack/Decay Time allows one to set the speed at which each triggered
band opens or closes. The default is the minimum time for the number of
bands used. If this value is set too slow, one loses transients; if it is
set too fast, the process starts to modulate the soundfile (whistling sounds).
Threshold Level is where you set the threshold level! If you check Threshold
Relative to Peak Amp., the threshold is instead set relative to the peak
amplitude for each block of sound processed. For example, if you are using
the spectral gate, and the loudest frequency band for the current block
of sound has an amplitude of -12 dB and the threshold is set at -40 dB,
the gate will be active for sounds below -52 dB.
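To see why comparing a windowed average rather than the raw input gives less abrupt gating, consider one band whose amplitude flickers around the threshold from block to block. The sketch below uses a simple trailing moving average as the "windowed average"; that choice, the function names and the linear amplitude values are assumptions made for illustration.

```python
def smoothed_amplitude(amplitudes, window):
    """Trailing moving average over the last `window` blocks of one band."""
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(len(amplitudes)):
        recent = amplitudes[max(0, i - window + 1): i + 1]
        result.append(sum(recent) / len(recent))
    return result


def band_is_over_threshold(amplitudes, threshold, window=1):
    """Per-block trigger decisions; window=1 compares the raw input."""
    return [a >= threshold for a in smoothed_amplitude(amplitudes, window)]


if __name__ == "__main__":
    band = [0.9, 0.1, 0.9, 0.1, 0.9, 0.1]  # flickers around the threshold
    print(band_is_over_threshold(band, threshold=0.4))            # toggles every block
    print(band_is_over_threshold(band, threshold=0.4, window=2))  # stays steady
```

Without smoothing, the trigger toggles on every block, which is the kind of rapid switching the Smoothed Amplitude option is meant to reduce.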