TUTORIAL: 7 points of departure

In this section I will take you step by step through some common sound processing operations. For all of these you will need a source soundfile to process and enough disk space to hold the result.

1. Spatialization

SoundHack provides spatialization of monaural soundfiles with its binaural filter. This filter, known as the head-related transfer function (HRTF), emulates the filtering action of your head and outer ear for any position around your head. The following example will spin a sound twice around your head.

Type command - O to open the soundfile to be processed. This should be a monaural file with a sample rate of 44100 as the filters are tuned for this sample rate. The open dialog box will show both the sample rate and number of channels for the selected file.

Note that some sounds spatialize better than others. One of the main cues for spatialization is the inter-aural time difference or ITD. This is the delay from the moment the sound is heard at the near ear to the time the sound is heard at the far ear. Because the ITD is so important, sounds with little temporal variation will not spatialize well. For example, a snare drum is much easier to locate than a constant flute tone. In addition to the ITD, keep in mind that most of the spectral differences between the near and far ear are above 1500 Hz, as frequencies below this point will diffract around your head. Sounds with little high frequency energy will therefore spatialize poorly.
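To get a feel for how small the ITD is, here is a minimal sketch using the spherical-head (Woodworth) approximation. This formula is not taken from SoundHack, and the head radius (8.75 cm), the speed of sound (343 m/s) and the function name are my assumptions. It is only meant for azimuths between 0 and 90 degrees.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate inter-aural time difference in seconds.

    Spherical-head approximation, intended for azimuths from 0 (straight
    ahead) to 90 degrees (directly to one side). The default head radius
    and speed of sound are assumed typical values.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this approximation is meant for 0 to 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Print the delay in microseconds and in samples at a 44100 sample rate.
for angle in (0, 30, 60, 90):
    itd = woodworth_itd(angle)
    print(angle, round(itd * 1e6), "microseconds,", round(itd * 44100, 1), "samples")
```

Even with the sound directly to one side, the delay stays well under a millisecond, which is why a sound needs sharp temporal detail before the ear can use it.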

Once the soundfile is open, type command - B to bring up the binaural filter dialog. Here you can set the various parameters for a binaural spatialization. For instance, you could set a specific azimuth angle by typing in the Angle: box or by clicking on one of the radio buttons. You can set the height to be above, below or at ear level (though the detection of height is usually dependent on head movement, so it is not well simulated with a static filter).

To spin the sound around your head, click the box called Moving Angle. When you do this, the Angle Function button will appear. Click on this button to bring up the Draw Function Window.

This window allows you to draw a curve to control the angle of the sound as it moves around your head. You will see two legends on the bottom of this window: Time:, which indicates the current time in the input soundfile, and Degrees:, which is the azimuth (0 degrees is straight ahead). You will now use one of the presets to smoothly spin a sound twice around your head. First, type "2.0" in the Cycles box, then click on the ramp function icon (see the mouse arrow in the picture to the left). This creates two cycles of a ramp function, which will control the azimuth angle during processing. Now click the Done button in this dialog (the Draw Function Window will disappear), and then click the Process button in the binaural function dialog.
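As a rough illustration of what two cycles of a ramp function mean for the azimuth, the sketch below maps a time in the input soundfile to an angle. The function name, and the assumption that each ramp rises linearly from 0 to 360 degrees, are mine and do not describe SoundHack's internals.

```python
def ramp_azimuth(time_s, duration_s, cycles=2.0):
    """Azimuth in degrees at time_s for a ramp repeated `cycles` times.

    Assumes each ramp rises linearly from 0 to 360 degrees over
    duration_s / cycles seconds, then starts again from 0.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    position = (time_s / duration_s) * cycles  # number of ramps completed so far
    return (position % 1.0) * 360.0            # fraction of the current ramp

# A 4 second soundfile spun twice: one full turn every 2 seconds.
for t in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(t, ramp_azimuth(t, 4.0))
```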
 

 
The "Save Soundfile as: " dialog will now appear, which has options to set the soundfile type and format. You will probably just want to set the soundfile type to Audio IFF and 16 Bit Linear, as this is the format most commonly used by sound editors. After you click Save, SoundHack will start processing. All you need to do now is wait. On a 33MHz 68040, the compute time to real time ratio is 55 to 1 (that is, it takes 55 seconds to compute 1 second of sound). While processing, the status/playback window (shown below) will show the current positions in the input and output soundfiles as well as a progress bar. At any time, you can pause the processing by typing command - , and listen to the output soundfile by pressing the spacebar. After it is finished playing, another command - , will start the processing again.
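If you want to know how long you will be waiting, the compute time to real time ratio gives a quick estimate. This is a small sketch with helper names of my own; the ratio quoted above applies to a 33MHz 68040 only, so substitute whatever ratio you measure on your own machine.

```python
def estimated_compute_seconds(sound_seconds, ratio):
    """Seconds of computing needed for sound_seconds of audio at a given ratio."""
    return sound_seconds * ratio

def format_hms(total_seconds):
    """Format a duration in seconds as h:mm:ss."""
    total = int(round(total_seconds))
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

# A 30 second soundfile at the 55 to 1 ratio quoted for binaural filtering.
print(format_hms(estimated_compute_seconds(30, 55)))
```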
 
 

2. Time distortion nr. 1

One of the main features of SoundHack is the ability to distort the timebase and/or pitch of the soundfile. In this example, we will use a simple varispeed (variable sample rate conversion) to modulate both the pitch and timebase.

Open a soundfile as before (command - O). Bring up the Varispeed dialog by typing command - V. To enable varispeed (rather than simple sample rate conversion), click on the Varispeed box. Now, bring up the Draw Function Window by clicking on the Varispeed Function... button (see mouse arrow in picture above).
 

When the Draw Function Window appears, draw a varispeed function by click/dragging the mouse in the function drawing area (see mouse arrow at left). This varispeed, unlike a tape varispeed, is capable of a 12 octave range. That is equivalent to a tape machine that has speeds from 0.5 to 1,920 inches per second, so take advantage of this broad range. Click Done in the Draw Function Window and Process in the Varispeed dialog to start processing. On a 33MHz 68040, with the varispeed quality set to Medium, the average process time to real time ratio is 43 to 1. Slowing the soundfile takes more processing time, as more sound is created.
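The idea behind varispeed can be sketched in a few lines: step through the input at a speed that changes over time and interpolate between neighbouring samples. This is an illustration only. It uses plain linear interpolation, which is my simplification, and the function names are hypothetical; SoundHack's quality settings presumably select better interpolation than this.

```python
def octaves_to_speed_ratio(octaves):
    """Speed ratio between the bottom and top of a range this many octaves wide."""
    return 2.0 ** octaves

def varispeed(samples, speed_at):
    """Resample `samples` at a time-varying speed using linear interpolation.

    speed_at(fraction) receives the current read position as a fraction of
    the input (0.0 to 1.0) and returns the playback speed, where 1.0 is
    normal, 2.0 is an octave up and 0.5 is an octave down.
    """
    out = []
    position = 0.0
    last = len(samples) - 1
    while position < last:
        index = int(position)
        frac = position - index
        # Blend the two input samples on either side of the read position.
        out.append(samples[index] * (1.0 - frac) + samples[index + 1] * frac)
        speed = speed_at(position / last)
        if speed <= 0:
            raise ValueError("speed must be positive")
        position += speed
    return out

ramp = [float(n) for n in range(9)]
print(len(varispeed(ramp, lambda f: 2.0)))  # fewer output samples: faster and higher
print(len(varispeed(ramp, lambda f: 0.5)))  # more output samples: slower and lower
print(octaves_to_speed_ratio(12))           # speed ratio spanned by 12 octaves
```

Note how the half speed case produces twice as many output samples as the input has, which is why slowing a soundfile down takes longer to process.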

3. Pitch shifting

SoundHack provides pitch shifting without time scaling with a technique known as the phase vocoder. In this technique, the sound to be shifted is sent into a bank of band-pass filters which are evenly spaced from 0 Hz to half the sample rate. SoundHack measures the amplitude and phase for each frequency at the output of this filter bank. These amplitudes, phases and frequencies are used to control a bank of oscillators. Pitch shifting simply involves multiplying each frequency by a factor.
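The resynthesis half of this idea can be sketched with a toy oscillator bank. Here I assume the analysis has already produced a fixed list of (amplitude, frequency, phase) triples, which is a simplification: a real phase vocoder updates these values many times per second. The function names are hypothetical.

```python
import math

def shift_partials(partials, factor):
    """Multiply the frequency of every (amplitude, frequency_hz, phase) partial."""
    return [(amp, freq * factor, phase) for amp, freq, phase in partials]

def oscillator_bank(partials, sample_rate, num_samples):
    """Sum one sine oscillator per partial (static partials only)."""
    out = []
    for n in range(num_samples):
        t = n / sample_rate
        out.append(sum(amp * math.sin(2.0 * math.pi * freq * t + phase)
                       for amp, freq, phase in partials))
    return out

# Two partials shifted up one octave (a factor of 2), then resynthesized.
analysed = [(1.0, 220.0, 0.0), (0.5, 440.0, 0.0)]
shifted = shift_partials(analysed, 2.0)
print(shifted)
print(len(oscillator_bank(shifted, 44100, 441)))
```

The amplitudes and phases are left alone, so the duration is unchanged and only the pitch moves.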

In this example, you will shift a soundfile up one octave. You should open a soundfile as you did in the previous example (command - O). Pitch shifting is not limited by sample rate or number of channels; any type of soundfile will work. However, this effect sounds best with sounds that are not harmonically dense. If more than one partial appears in any band of the filter bank, intermodulation distortion will result in that band.

Once you have opened the soundfile to be processed, type command - P and the Phase Vocoder dialog will appear (see box at left). Here there are many options; most are related to particulars of the analysis filter bank, and most can remain at their default setting. The phase vocoder can be set for a one octave pitch shift with just a few steps. First, click on the Pitch Scale button. Second, click on the Scaling:/Semitone Shift: popup menu and select Semitone Shift:. Third, enter "12.0" in the box to the right of the Semitone Shift: popup menu. At this point you could click on Process and start processing, but there is another parameter you might want to adjust first. Bands: sets the number of bands in the filter bank as well as the number of oscillators for resynthesis. This will allow you to work with sounds of varying degrees of harmonic density. You might think that a large number of bands is always preferable, but more bands require tighter filters which in turn causes more phase distortion. So it is a trade-off. I have found 512, 1024 and 2048 to be the most useful settings. You should also try the Scaling Function option. This will bring up the Draw Function Window and allow you to have variable pitch shifting with a 12 octave range.
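Two bits of arithmetic sit behind these settings. The sketch below converts a semitone shift to a frequency factor and estimates how far apart the bands are, assuming (as described earlier) that the bands are evenly spaced from 0 Hz to half the sample rate. The function names are mine, and the exact relationship between Bands: and SoundHack's internal analysis size is an assumption.

```python
def semitones_to_ratio(semitones):
    """Frequency factor for an equal-tempered shift of this many semitones."""
    return 2.0 ** (semitones / 12.0)

def band_spacing_hz(sample_rate, bands):
    """Spacing between bands spread evenly from 0 Hz to half the sample rate."""
    if bands <= 0:
        raise ValueError("bands must be positive")
    return (sample_rate / 2.0) / bands

print(semitones_to_ratio(12.0))  # a 12 semitone shift doubles every frequency
for bands in (512, 1024, 2048):
    print(bands, "bands:", round(band_spacing_hz(44100, bands), 2), "Hz apart")
```

Two partials that lie closer together than the band spacing will land in the same band, which is the situation that causes intermodulation distortion.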

Pitch shifting is one of the slowest processes in SoundHack. With a 33MHz 68040, default settings, and a monaural source file with a 44100 sample rate, the process time to real time ratio is 290 to 1.
 

4. Time distortion nr. 2

Just as the phase vocoder technique allows pitch shifting without time scaling, it also allows time scaling without pitch shifting. In this example, you will stretch a soundfile to twice its length.

Open a soundfile (command - O). As in the previous example (Pitch shifting), the results will be better for sounds which are not harmonically dense. Type command - P to bring up the Phase Vocoder dialog. This time, click the Time Scale button, and select the Scaling: popup menu. Type "2.0" in the box to the right of the Scaling: popup menu, and click Process to start processing. For an interesting special effect, set Threshold Under Max. (dB): to "-6" before processing. This will allow only a few loud partials to be resynthesized. Phase vocoder time scaling is relatively fast. With a 33MHz 68040, the process time to real time ratio is 69 to 1.
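To see why a threshold of -6 dB leaves only a few partials, here is a sketch of one plausible reading of this setting: for a single analysis frame, keep the bands whose amplitude is within 6 dB of the loudest band and silence the rest. Whether SoundHack applies the threshold exactly this way is my assumption, and the function name is hypothetical.

```python
def keep_loud_partials(amplitudes, threshold_db=-6.0):
    """Zero every band more than |threshold_db| below the loudest band.

    amplitudes holds the linear amplitudes of one analysis frame.
    """
    peak = max(amplitudes, default=0.0)
    if peak <= 0.0:
        return [0.0] * len(amplitudes)
    cutoff = peak * 10.0 ** (threshold_db / 20.0)  # convert dB to a linear factor
    return [a if a >= cutoff else 0.0 for a in amplitudes]

frame = [1.0, 0.6, 0.4, 0.05]
print(keep_loud_partials(frame))         # only bands within 6 dB of the peak survive
print(keep_loud_partials(frame, -40.0))  # a deeper threshold keeps everything here
```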

5. Cross-synthesis nr. 1

Cross-synthesis is the combining of two sounds to create a new sound. This is done by analyzing and extracting significant characteristics from the two source soundfiles, then combining these characteristics in the synthesis of the new soundfile. SoundHack has two functions which perform cross-synthesis: convolution and mutation. In this example you will try convolution.

Convolution takes two sounds, analyses them for spectral content, then multiplies the two spectra to create a new sound. This emphasizes frequencies which are held in common and greatly reduces frequencies which are not. For an interesting convolution it is good to start with sounds that have spectra which are neither too similar nor too dissimilar.
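The link between multiplying spectra and convolving sounds can be checked on a tiny example. The sketch below uses a slow textbook DFT so that it needs nothing beyond the standard library; it is not how SoundHack computes its spectra, and all of the function names are mine.

```python
import cmath

def dft(x):
    """Discrete Fourier transform (slow textbook form, fine for short lists)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * k * m / n)
                for m in range(n)).real / n
            for k in range(n)]

def spectral_convolve(a, b):
    """Convolve two sounds by multiplying their spectra."""
    n = len(a) + len(b) - 1              # zero-pad so the result does not wrap around
    pa = list(a) + [0.0] * (n - len(a))
    pb = list(b) + [0.0] * (n - len(b))
    product = [x * y for x, y in zip(dft(pa), dft(pb))]
    return idft(product)

def direct_convolve(a, b):
    """The same result computed sample by sample in the time domain."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

sound = [1.0, 2.0, 3.0]
impulse = [0.0, 1.0, 0.5]
print([round(v, 6) for v in spectral_convolve(sound, impulse)])
print(direct_convolve(sound, impulse))
```

Both routes give the same samples. A frequency that is weak in either spectrum is weak in the product, which is why only the frequencies the two sounds share come through strongly.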

Open one of the two source soundfiles by typing command - O. Type command - C to bring up the Convolve dialog. Open the second of the source soundfiles by clicking on the Pick Impulse button. Click Moving to have convolution move through the second soundfile; otherwise this process will convolve against a fixed segment of the second soundfile. Click Brighten to avoid over-attenuation of high frequencies. Select Triangle in the Window: popup menu to smooth the convolution. If you prefer an unsmoothed convolution, select Rectangle.

 
Now set the Length Used: number. This is used to designate how much sound gets processed in each block. You can treat this number like the decay time of a reverb, because convolution has a sound which is like a very complex, tuned reverb. A length of 0.1 seconds (100 milliseconds) is a good place to start. For higher settings of Length Used:, the memory requirement increases exponentially.

Now that everything is set, click Process. Because the gain in convolution is unpredictable, save the new soundfile in a format with a large dynamic range. When the Save Soundfile as: dialog appears, select a File Type: of NeXT (.snd) and a File Format: of 32 Bit Floating Point. Click Save and wait.

Once your convolution is done, you will need to convert the new soundfile from a floating point format to something usable (usually 16 bit linear). You will also need to adjust the optimum gain for this conversion. To do this, select the new soundfile, then type command - G. The Gain Change dialog will appear. Click Analyze to find the peak amplitude, then click Change Gain to create a new, normalized soundfile. In the Save Soundfile as: dialog, set the File Format: to 16 Bit Linear.
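What Analyze and Change Gain accomplish together can be sketched as follows: find the peak amplitude, then scale so that the peak just reaches full scale in 16 bit linear. This is a generic normalization written with hypothetical function names, not SoundHack's own code, and it ignores refinements such as dither.

```python
def analyze_peak(samples):
    """Largest absolute sample value in a floating point soundfile."""
    return max((abs(s) for s in samples), default=0.0)

def normalize_to_16bit(samples):
    """Scale so the peak reaches full scale, then convert to 16 bit integers."""
    peak = analyze_peak(samples)
    if peak == 0.0:
        return [0] * len(samples)        # silence stays silence
    converted = []
    for s in samples:
        value = int(round(s / peak * 32767))
        converted.append(max(-32768, min(32767, value)))  # guard the 16 bit range
    return converted

# Convolution output can be far outside the usual -1.0 to 1.0 range.
floating = [0.0, 2.5, -5.0, 1.25]
print(analyze_peak(floating))
print(normalize_to_16bit(floating))
```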

Convolution is the fastest of SoundHack's functions. For a moving convolution, with a 33MHz 68040 and monaural soundfiles at a 44100 sample rate, the process time to real time ratio is 14 to 1.

6. Noise reduction

As you have seen in the previous examples, much of SoundHack's processing involves the spectral analysis of an input soundfile and subsequent resynthesis of an output soundfile. One very practical use of spectral analysis/resynthesis is noise reduction, where noise components are identified and reduced before resynthesis. A very simple scheme for identifying noise components is to assume that any spectral component with an amplitude below a certain threshold is noise. This works well when the noise is fairly constant, and not too loud (tape hiss, for instance).

Open the soundfile as usual (command - O). Type command - D to bring up the Spectral Dynamics dialog. This is similar to a typical analog multiband dynamics processor except that here there are up to 4096 separate bands.

For noise reduction, set Bands: to 2048. Set Type: to Expand. You could also use Gate for noise reduction, but expansion gives a gentler amplitude reduction and sounds more natural. The Lowest Band: and Highest Band: should be set to 0 and 2048 unless you want to limit the noise reduction to a particular frequency area (as you would in 60 Hz hum removal). Set the Expand Ratio anywhere from 1.5 to 3.0. A higher ratio will be more effective in reducing noise, but may cause artifacts. Set the Threshold Level anywhere from -50 to -90 dB, slightly above the hiss level. If you are too close to the hiss level, the dynamics processor will turn on and off too much and this will cause weird modulation noises. It is best to set the Threshold Level about 9 dB above the hiss level. Click the Use Smoothed Amplitude box. This feature is essential for good hiss removal as it will anticipate abrupt sounds by using an averaged amplitude for threshold detection.
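The effect of the Expand Ratio and Threshold Level on a single band can be sketched with the usual downward expander curve: levels above the threshold pass unchanged, and levels below it are pushed further down in proportion to the ratio. This is the textbook form of an expander. SoundHack's exact gain curve is an assumption on my part, and the function name is hypothetical.

```python
def expand_level_db(level_db, threshold_db, ratio):
    """Output level in dB of one band after downward expansion.

    At or above the threshold the band is untouched. Below it, every dB
    under the threshold becomes `ratio` dB under the threshold.
    """
    if ratio < 1.0:
        raise ValueError("an expander needs a ratio of at least 1.0")
    if level_db >= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) * ratio

# Hiss at -70 dB with the threshold about 9 dB above it, at two ratios.
for ratio in (1.5, 3.0):
    print(ratio, expand_level_db(-70.0, -61.0, ratio))
# Programme material above the threshold is left alone.
print(expand_level_db(-30.0, -61.0, 3.0))
```

A gate behaves like an expander with a very large ratio, which is why it sounds more abrupt.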

Now click Process and let it go. After SoundHack creates a few seconds of sound, it is probably a good idea to pause processing and play the output soundfile. The parameters for noise reduction usually take a lot of experimentation.

With a 33MHz 68040 and monaural soundfiles at a 44100 sample rate, the process time to real time ratio for spectral dynamics is 70 to 1.

7. Cross-synthesis nr. 2

In this final example, you will use the second form of cross-synthesis in SoundHack, spectral mutation. Mutation measures the spectral change over time in two soundfiles (called the source and target) and resynthesizes a new soundfile (the mutant) through various strategies of combination.

For mutation, you should choose soundfiles that have similar spectral characteristics. When the soundfiles are too different, the mutant seems to just fluctuate between the source and target sounds, without producing many interesting merged sounds. Open one of the soundfiles (the source) with command - O. Type command - M to bring up the Spectral Mutation Function dialog and open the second soundfile (the target) by clicking on the Pick Target button. Set the mutation Type: to LCM/IUIM. The differences between various mutation types are discussed later in this manual (page 21). Click the * Function box so that you can vary the mutation index. For the LCM/IUIM type, an index of 0.0 produces a mutant of all source and an index of 1.0 produces a mutant of all target. Now click Edit Function... to bring up the Draw Function Window.

To keep the mutation index near the center of its range, set the function limits to 0.7 and 0.3. Set the number of cycles to 4.0 and click the sine wave preset icon (see mouse arrow in picture). This will produce a mutant which oscillates slowly from almost the source to almost the target. Click Done in this window and Process in the Spectral Mutation Function dialog. Mutations are very hard to predict, so a lot of experimentation is required before you will get satisfying results. The main factor in the success or failure of a mutation is the choice of source and target soundfiles.
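The mutation index curve described here can be sketched as a sine wave that swings between the two limits four times over the length of the soundfile. The function name and the assumption that the preset starts at the midpoint and rises first are mine; the preset in the Draw Function Window may start at a different phase.

```python
import math

def mutation_index(time_s, duration_s, low=0.3, high=0.7, cycles=4.0):
    """Mutation index at time_s for a sine swinging between low and high."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    center = (low + high) / 2.0
    depth = (high - low) / 2.0
    phase = 2.0 * math.pi * cycles * time_s / duration_s
    return center + depth * math.sin(phase)

# An 8 second mutation: the index completes one cycle every 2 seconds.
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, round(mutation_index(t, 8.0), 3))
```

Because the index never reaches 0.0 or 1.0, the mutant never becomes purely the source or purely the target.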