This function is a high-quality sample rate converter as well as a variable sample rate conversion utility (varispeed). The Varispeed box enables the variable-rate feature. The Varispeed Function... button will bring up the Draw Function Window, giving one control over a 10-octave varispeed. The Quality buttons give one control over the size of the smoothing filter used, and the resultant quality of interpolation/decimation. The Vary by Scale and Vary by Pitch buttons allow one to draw a curve for either scaling factor or pitch.
(Command - X) Spectral Extraction...
This function attempts to separate the stable (pitched) and transient (unpitched) parts of a sound. It does this by measuring the speed of frequency deviation. If the deviation is too quick, the material is marked as unpitched information and output to the transient soundfile. If it is stable enough, it is marked as pitched information and output to the stable soundfile.
You can control the separation with this dialog box. Set the Bands to a high number if the sound being processed is harmonically dense; otherwise keep it around 512. The number of Frames sets the size of the analysis frame (in multiples of FFT frames). Set this higher if you are having difficulty separating the pitched material, lower if you are having difficulty separating the transient material. The two frequency values specify the amount of change allowed during each analysis frame. In this example, if the harmonic deviates by more than 5 hertz in 0.035 seconds, it is put into the transient soundfile; if the harmonic deviates by less than 2 hertz in 0.035 seconds, it is put into the stable soundfile.
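The decision rule described above can be sketched in C. This is only an illustration of the thresholding, not SoundHack's actual code: the names `BandClass` and `classify_band` are hypothetical, and the text does not say what happens to a band whose deviation falls between the two values, so the sketch assumes it is assigned to neither file.

```c
#include <assert.h>

/* Outcome for one band over one analysis frame (hypothetical names). */
typedef enum { BAND_STABLE, BAND_TRANSIENT, BAND_UNASSIGNED } BandClass;

/* Classify a band from how far its frequency moved (in hertz) during one
   analysis frame. transientHz and stableHz stand for the two frequency
   values in the dialog (5 and 2 in the example above). */
BandClass classify_band(float startHz, float endHz,
                        float transientHz, float stableHz)
{
    float deviation = endHz - startHz;

    assert(transientHz >= stableHz);
    if (deviation < 0.0f)
        deviation = -deviation;      /* only the size of the change matters */

    if (deviation > transientHz)
        return BAND_TRANSIENT;       /* changing too quickly: unpitched */
    if (deviation < stableHz)
        return BAND_STABLE;          /* steady enough: pitched */
    return BAND_UNASSIGNED;          /* assumption: in-between goes to neither file */
}
```

With the example settings, a harmonic that drifts from 440 Hz to 446 Hz within one analysis frame would be classed as transient, and one that drifts from 440 Hz to 441 Hz as stable.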
This function draws heavily from the work of Zack Settel and Cort Lippe on the ISPW workstation using Max-DSP. Thank you Zack and Cort for sharing a great idea.
(Command - -) Spectral Analysis/Resynthesis...
This
function will create a Csound or SoundHack format spectral data file from
a soundfile (analysis) or create a soundfile from either format spectral
data file (resynthesis). With this, you can create files to be used with
Csound's pvoc unit generator or for you to use with your own programs to
process the spectral data directly. The format of the spectral data file
(a limited version of the Csound pvanal format) is a header, followed by
multiple frames of spectral data.
Here is a C structure describing the header:
typedef struct
{
    long magic;         // 517730 for Csound files,
                        // 'Erbe' for SoundHack files
    long headBsize;     // byte offset from start to data
                        // (usually sizeof(SpectHeader))
    long dataBsize;     // number of bytes of data, not including
                        // the header
    long dataFormat;    // (short) format specifier,
                        // always 36 for floating point
    float samplingRate;
    long channels;
    long frameSize;     // number of points in FFT
                        // (number of bands * 2)
    long frameIncr;     // number of new samples each frame
                        // (frames overlap)
    long frameBsize;    // bytes in each file frame:
                        // frameBsize = sizeof(float)
                        //     * ((frameSize >> 1) + 1) << 1;
    long frameFormat;   // this is either 3 for SoundHack
                        // files (amplitude & phase) or 7 for
                        // Csound files (amplitude & frequency)
    float minFreq;      // 0.0
    float maxFreq;      // maxFreq = samplingRate/2.0;
    long freqFormat;    // flag for log/lin frequency
                        // (always 1 for linear)
    char info[4];       // extendible byte area
} SpectHeader;
The subsequent frames of spectral data are organized as follows:
// frameFormat == 3, amplitude and phase pairs, SoundHack file
typedef struct
{
float amplitude; // from 0.0 to 1.0
float phase; // from 0.0 to (2.0 * pi)
} band;
band spectralFrame[(frameSize >> 1) + 1];
// frameFormat == 7, amplitude and frequency pairs, Csound file
typedef struct
{
float amplitude; // from 0.0 to 1.0
float frequency; // from 0.0 to samplingRate/2.0
} band;
band spectralFrame[(frameSize >> 1) + 1];
If the spectral file is stereo, the frames are interleaved, first left then right. Included with SoundHack is the source code for a simple spectral data processor (Spectral Assistant) which should illustrate how to read and write this format.
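As a rough sketch of how the sizes in this format fit together, the helpers below compute the number of bands per frame, the byte size of one frame, and the file offset of a given frame. The names `Band`, `bands_per_frame`, `frame_bytes` and `frame_offset` are hypothetical and are not taken from the Spectral Assistant source. The sketch assumes 4-byte floats and that stereo frames simply alternate left, right, left, right. Note also that the header above is declared with `long` fields, which were 32 bits wide on the Macintosh compilers of the time; on a modern 64-bit compiler you would probably need fixed-width types and explicit byte-order handling to read the header itself.

```c
#include <assert.h>

/* One pair of values for a band, as in the frame layouts above.
   The second member is phase (frameFormat 3) or frequency (frameFormat 7). */
typedef struct {
    float amplitude;
    float second;
} Band;

/* Number of bands stored in each frame: frameSize/2 + 1. */
long bands_per_frame(long frameSize)
{
    assert(frameSize > 0 && frameSize % 2 == 0);
    return (frameSize >> 1) + 1;
}

/* Bytes in one single-channel frame; this should match frameBsize. */
long frame_bytes(long frameSize)
{
    return (long)sizeof(Band) * bands_per_frame(frameSize);
}

/* Byte offset of frame n of the given channel (0 = left, 1 = right),
   counted from the start of the file. Assumes interleaved frames. */
long frame_offset(long headBsize, long frameSize, long channels,
                  long n, long channel)
{
    assert(channel >= 0 && channel < channels);
    return headBsize + (n * channels + channel) * frame_bytes(frameSize);
}
```

For a 1024-point FFT this gives 513 bands and 4104 bytes per frame.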
This function, the QT Coder, will convert your soundfile into a QuickTime™ movie and will convert the QuickTime™ movie back into the same sound. You can also turn any QuickTime™ movie into a soundfile.
The QuickTime™ movie will contain a series of sonograms representing your sound. The sonograms have linear frequency on the vertical axis and time on the horizontal axis. Color saturation is used to represent amplitude and hue is used to represent delta phase.
There are very few settings in this function. The number of bands one chooses will affect the size of the movie frame: 256 bands will require a 343x257 frame (a 4:3 aspect ratio). Since the QT Coder uses 32-bit uncompressed images, it consumes a lot of memory, and the memory needed grows rapidly as the number of bands increases (roughly with the square of the band count, since both frame dimensions scale with it). For example, a 512-band, monaural QT Coding will require about 3 megabytes allocated to SoundHack.
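The frame arithmetic can be sketched as follows. The rule used here (height = bands + 1, width = 4/3 of the height rounded up) is inferred from the single 256-band example above and is an assumption, as are the names `SonogramFrame` and `sonogram_frame`. The byte count covers one uncompressed 32-bit image only; the total memory SoundHack needs is larger than this.

```c
#include <assert.h>

/* Geometry of one sonogram frame (hypothetical helper type). */
typedef struct {
    long width;
    long height;
    long bytes;   /* size of one uncompressed 32-bit image */
} SonogramFrame;

SonogramFrame sonogram_frame(long bands)
{
    SonogramFrame f;

    assert(bands > 0);
    f.height = bands + 1;                 /* 256 bands -> 257 rows */
    f.width  = (4 * f.height + 2) / 3;    /* 4:3 aspect ratio, rounded up */
    f.bytes  = f.width * f.height * 4;    /* 4 bytes per 32-bit pixel */
    return f;
}
```

Under these assumptions 256 bands gives a 343x257 frame of 352,604 bytes, and doubling to 512 bands gives a 684x513 frame of 1,403,568 bytes, about four times as much.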
Window: selects the FFT windowing function. Kaiser is best for band separation, Hamming is best for smooth band transition. The choice of window is probably not critical to most uses of this function.
ΔPhase Center Color: selects which color will represent a Δ phase change of zero degrees. This color usually becomes the predominant color in the sonogram.
Amplitude Range (dB): determines which sounds are encoded in the QuickTime™ movie. For a high-fidelity encoding, 120 dB seems sufficient.
Color Inversion: inverts the RGB color values.
After the QT Coder creates the QuickTime™ movie, you will be able to open it in any QuickTime™ application and view, edit or modify the sonogram. After doing this, you will be able to open the modified movie in SoundHack and convert it back into sound using the QT Coder again.
One could also use the QT Coder to convert any QuickTime™ movie into a soundfile. The resultant sound will tend to be repetitive. It will also tend to be biased toward the high frequencies, since the frequency in the frame is interpreted in a linear fashion.
This function is rather experimental and I am not quite sure what it
will be good for yet.
For future versions I am working on an exponential frequency encoding scheme as well as color schemes which follow visual perception models. (name that tune?)
This does a simple, no-questions-asked normalization of the selected soundfile.