1. F0

The first section, F0, contains the parameters for F0 estimation. These matter because VoiceSauce uses the pitch track to estimate harmonic locations.

By default, harmonic locations are estimated using the Straight pitch track. VoiceSauce can also calculate these locations using the Snack, Praat or SHR pitch track, or through another pitch tracking algorithm (Other). This is done by selecting the respective radio button.

The parameters used by the pitch tracking algorithms can be set in the corresponding sub-sections:

For Straight and Snack, the Max F0 (Hz): field specifies the upper F0 estimation limit, and the Min F0 (Hz): field specifies the lower limit. Candidates outside this range are ignored. The Straight section also lets the user alter Max duration (s):, which controls how much of a sound file Straight processes at one time. This is needed because Straight uses a lot of memory, and for extremely long files Matlab can run out of memory. For example, if a file is 60 s long, setting Max duration to 20 s means that the first 20 s are processed, then the next 20 s, and so on. The entire file is still processed, but in three chunks of 20 s.
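The chunking described above can be illustrated with a minimal Python sketch. It only shows how a Max duration value partitions a file into consecutive segments; the function name chunk_bounds is hypothetical, and VoiceSauce's own Matlab code may handle chunk boundaries differently (for example, with overlap).

```python
def chunk_bounds(total_s, max_duration_s):
    """Return (start, end) times in seconds for consecutive chunks.

    Illustration only: the name and the non-overlapping boundaries
    are assumptions, not VoiceSauce's actual implementation.
    """
    if max_duration_s <= 0:
        raise ValueError("max_duration_s must be positive")
    total_s = float(total_s)
    bounds = []
    start = 0.0
    while start < total_s:
        # The last chunk is shortened so it ends at the end of the file.
        end = min(start + max_duration_s, total_s)
        bounds.append((start, end))
        start = end
    return bounds


# A 60 s file with Max duration set to 20 s gives three chunks.
for start, end in chunk_bounds(60, 20):
    print(f"process {start:.0f} s to {end:.0f} s")
```

A file whose length is not a multiple of Max duration simply gets a shorter final chunk.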

Newer versions of VoiceSauce allow for parameter estimation through the Praat pitch track. Access to Praat's parameters is gained by clicking the Settings button. The user is then presented with a window containing numerous parameters that can be changed. These parameters control the recruitment of candidates.

Max F0 (Hz): Upper F0 Estimation Limit
Min F0 (Hz): Lower F0 Estimation Limit
Voice Threshold: Value until which to accept unvoiced candidates. Increasing this value increases the number of unvoiced candidates.
Silence Threshold: Amplitude at which to accept candidates. Values below this threshold will be marked as silent and will be ignored.
Octave Cost: Degree of preference for high-frequency candidates. Increasing this value favors high-frequency candidates more strongly.
Voiced Unvoiced Cost: Tolerance for voiced/unvoiced transitions. Lower value means higher tolerance. Decreasing this value increases the number of voiced/unvoiced transitions.
Kill Octave Jumps: Checking this box changes every pitch jump into one that is smaller than half an octave. This is done by adding or subtracting one or more octaves.
Smooth: Checking this box smooths the F0 track generated by Praat.
Interpolate: Checking this box will allow Praat to interpolate the F0 track.
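As an illustration of the Kill Octave Jumps option described in the list above, the following Python sketch shifts each voiced F0 value by whole octaves until it lies within roughly half an octave of the previous voiced value. This is a simplified reading of the description, not Praat's actual implementation; the function name and the convention that unvoiced frames are marked by 0, None or NaN are assumptions.

```python
import math


def kill_octave_jumps(f0_hz):
    """Shift voiced values by whole octaves to remove octave jumps.

    Assumption: unvoiced frames are marked by 0, None or NaN and are
    passed through unchanged.
    """
    out = []
    prev = None  # last voiced value after correction
    for f in f0_hz:
        if f is None or f <= 0 or math.isnan(f):
            out.append(f)
            continue
        if prev is not None:
            # Nearest whole number of octaves between f and prev.
            octaves = round(math.log2(f / prev))
            # Dividing by 2**octaves moves f down (or up) by that many octaves.
            f = f / (2 ** octaves)
        out.append(f)
        prev = f
    return out


track = [100, 102, 205, 104, 0, 52]
print(kill_octave_jumps(track))
```

In this example the doubled value 205 is moved down one octave, and the halved value 52 is moved up one octave, so the corrected track stays near 100 Hz throughout.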

For a more in-depth look at how these parameters affect the pitch tracking, please refer to Praat's manual.

Please look at the Other section of this manual for a guide to using user-defined algorithms for calculating F0.

2. Formants and Bandwidths

The second section is the Formants and bandwidths section, which controls the parameters used for formant analysis and correction.

As with F0 and harmonics, the user can control which algorithm will be used in formant analysis. Here, there are three options – the Snack algorithm, Praat's formant analysis, or a user-defined algorithm (Other). This can be changed by clicking on the respective radio button.

The Snack sub-section gives the user control over the Pre-emphasis value used by Snack. By default, this is set to 0.96.

The Praat sub-section gives the user control over the frequency range and number of formants to be found in it. (Currently this number must be an integer; the next version of VoiceSauce should include non-integer options.)

The Bandwidth toggle controls which estimate of the formant bandwidths is used in correcting the harmonic amplitudes for formant influences. By default, this is "Use formula values", meaning that the bandwidths are estimated from the formant frequencies by a formula from Hawks & Miller (1995). If toggled to "Use estimated values", the values calculated by VoiceSauce are used instead, either from Snack or from Praat, whichever has been selected for formant analysis.

Please look at the Other section at the bottom of this manual for a guide to using user-defined algorithms for calculating formants.

3. Common

The Common section can be found in the middle, to the right of the Formants section. It contains miscellaneous options that are used by many parts of VoiceSauce.

Window size (ms) controls the window size used by Snack, which affects the F0 and formant estimates.
Frame shift (ms) controls how often a measurement is made. By default, VoiceSauce makes a measurement at every millisecond.
Not a number label replaces Matlab's NaN (Not a Number) value in output files with 0, or with whatever text the user specifies.
Recurse sub-directories commands VoiceSauce to look inside sub-directories when processing a batch of files.
Link mat directories and Link wav directories let VoiceSauce assume all necessary file types are in the same directory.
No. of periods for harmonic estimation
No. of periods for energy, CPP, and HNR estimation
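The effect of the Not a number label option listed above can be sketched in Python as follows. The function name, the tab-separated layout and the three-decimal formatting are assumptions made for illustration; they are not taken from VoiceSauce's output code.

```python
import math


def format_value(value, nan_label="0", precision=3):
    """Format one measurement, substituting nan_label for NaN.

    Hypothetical helper: the default label "0" follows the manual,
    everything else here is an assumption.
    """
    if isinstance(value, float) and math.isnan(value):
        return nan_label
    return f"{value:.{precision}f}"


row = [212.5, float("nan"), 198.0]
print("\t".join(format_value(v) for v in row))
print("\t".join(format_value(v, nan_label="NA") for v in row))
```

Choosing a non-numeric label such as NA can make missing measurements easier to tell apart from true zeros in later analysis.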

4. SHR

This small section contains parameters for subharmonic to harmonic ratio calculations.

Max F0 (Hz) defines the pitch ceiling, up to which candidate data is accepted.
Min F0 (Hz) defines the pitch floor, above which candidate data is accepted.
Threshold

5. Textgrid

The Textgrid section tells VoiceSauce where to look for segment labels in associated textgrids.

Ignore these labels: VoiceSauce will ignore any of the labels specified in this field. Labels that should be ignored should be written between quotes. Omitting labeled intervals which aren't needed, such as silent pauses, can greatly speed up processing.
Tier numbers: Tells VoiceSauce which textgrid tiers to look in for segment labels. Multiple tiers can be specified by separating tier numbers with commas.
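The following Python sketch illustrates how ignoring labels reduces the amount of data to process. It assumes, hypothetically, that the ignore field holds quoted labels separated by commas and that each interval is a (start, end, label) tuple; VoiceSauce's actual field syntax and textgrid handling may differ.

```python
import re


def parse_ignore_labels(field):
    """Extract every label written between double quotes."""
    return set(re.findall(r'"([^"]*)"', field))


def keep_intervals(intervals, ignore):
    """Drop intervals whose label is in the ignore set."""
    return [iv for iv in intervals if iv[2] not in ignore]


# Hypothetical field contents: empty label, single space, and SIL.
ignore = parse_ignore_labels('"", " ", "SIL"')

# Hypothetical intervals from one textgrid tier: (start s, end s, label).
tier = [
    (0.00, 0.35, "SIL"),
    (0.35, 0.62, "a"),
    (0.62, 0.80, ""),
    (0.80, 1.10, "i"),
]

for start, end, label in keep_intervals(tier, ignore):
    print(f"{label}: {start:.2f} to {end:.2f} s")
```

Only the intervals labeled a and i remain, so the silent and unlabeled stretches would not need to be analyzed.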

6. EGG Data

The EGG Data section lists the options relevant for importing EGG data into VoiceSauce.
VoiceSauce will search through the EGG data files when creating output files.

Headers to search for: specifies the header labels VoiceSauce imports from EGG data.
Time label: tells VoiceSauce the name of the EGG data file column associated with time.

7. Miscellaneous Settings

The Outputs section has only one option – Smoothing window size, which controls the width of the smoothing window.

The Input (wav) files section specifies the file name pattern for input files. VoiceSauce can only analyze wav files, but users on case-sensitive (typically non-PC) systems may need to specify either *.wav or *.WAV, depending on how their files are named.
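Why the pattern matters on a case-sensitive system can be shown with a short Python sketch. It uses fnmatchcase from the standard library, which always compares case-sensitively; the file names are made up for illustration, and this is not how VoiceSauce itself lists files.

```python
from fnmatch import fnmatchcase


def match_files(names, pattern):
    """Return the names that match pattern, comparing case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory contents.
names = ["speaker1.wav", "speaker2.WAV", "notes.txt"]

print(match_files(names, "*.wav"))
print(match_files(names, "*.WAV"))
```

With case-sensitive matching, each pattern finds only the files whose extension is written in exactly that case, so a directory with mixed extensions would need renaming or two separate runs.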

8. Applying, Saving, and Loading Settings

VoiceSauce allows users to save and load modified settings for easier access in the future. This eliminates the need to retype the parameters each time VoiceSauce is started.

To do this, click on the Settings menu button at the top left of the Settings window. To save settings, choose Save... and specify the path and name of the settings file to save. To load settings, choose Load... and specify the path and name of the settings file to load.

To apply the changes to the parameters and to close the Settings window, simply click the OK button at the bottom.

Other

The Other panels of the F0 and Formants sections are typically grayed out. They can be activated (and then selected) if the Enabled box is checked. Selecting Other tells VoiceSauce to calculate data using an external program. These programs can use either the same algorithms that are currently used by VoiceSauce, or new algorithms not found in VoiceSauce.

Using an external program for data analysis requires a command line to be entered in the Command field. For example, suppose you want to use a new F0 program called myF0.exe, which takes a .wav file (myvoice.wav) along with a number of parameters and outputs a results file (output.txt). To run this program, you would usually type:

myF0.exe myvoice.wav -win 25 -frame 10 -parameter1 red output.txt

VoiceSauce operates in batch mode and can process a large number of files at once. Instead of requiring you to type this line over and over again, VoiceSauce plugs the target files into each iteration of the command for you. It therefore needs to know where in the command line to plug in the wav files and results files. This is done by using the placeholders $wavfile and $outfile. The command to type in the Command field in this example would be:

myF0.exe $wavfile -win 25 -frame 10 -parameter1 red $outfile
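The placeholder substitution can be sketched in Python with string.Template, which happens to use the same $name syntax. This is only an illustration of the idea: the function name is hypothetical, and the rule used here for naming each results file (the wav file name with a .txt extension) is an assumption, since the manual does not say how VoiceSauce chooses $outfile.

```python
from pathlib import PurePath
from string import Template


def build_commands(command, wav_files):
    """Fill $wavfile and $outfile into command once per wav file.

    Assumption: each results file is named after its wav file with a
    .txt extension. VoiceSauce's real naming scheme is not documented here.
    """
    template = Template(command)
    commands = []
    for wav in wav_files:
        outfile = str(PurePath(wav).with_suffix(".txt"))
        commands.append(template.substitute(wavfile=wav, outfile=outfile))
    return commands


command = "myF0.exe $wavfile -win 25 -frame 10 -parameter1 red $outfile"
for line in build_commands(command, ["myvoice.wav", "othervoice.wav"]):
    print(line)
```

Each printed line is one complete command of the kind a user would otherwise type by hand for every file in the batch.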

The Offset (ms) field tells VoiceSauce where in the sound file the user-specified algorithm should begin.
