Database of Challenging Musical Sounds for Evaluation
and Refinement of Pitch Estimators
Introduction
Speech researchers have made the most thorough study of the performance
of pitch estimation algorithms. A key to their work is the evaluation of
algorithm performance against standardized databases of speech that have
been "hand" analyzed. Such a database does not exist for musical
signals. As a result, pitch estimation papers in the computer music community
describe algorithms evaluated using short sound examples often chosen to
show new work in the best light. It is thus impossible to predict performance
of published algorithms in real musical situations, and difficult for researchers
to identify fruitful areas for new work. We describe a publicly available
database of musical sound files intended to redress these difficulties.
Musical Sound Database
Sounds in this database can be grouped into two important and hitherto
poorly represented categories:
- Complete musical phrases are used to evaluate the impact of estimation
errors in common and realistic musical contexts.
- Challenging examples are used to identify particular points of weakness
from which an algorithm may suffer. Included are sounds with: pitch-synchronous
and additive noise, room ambiance, cross-talk from adjacent strings, ambiguous
octaves, inharmonicity, missing fundamentals, glissandi, vibrato and trills.
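To make two of the conditions listed above concrete, the following minimal sketch builds the partial frequencies of a tone with a missing fundamental and of an inharmonic tone, then sums them into a short signal. It is an illustration only and does not describe how the database sounds were produced, which are recordings. The function names, the equal partial amplitudes, and the stiff-string model f_n = n * f0 * sqrt(1 + B * n^2) used for inharmonicity are assumptions chosen for the example.

```python
import math


def partial_frequencies(f0, n_partials, inharmonicity=0.0, skip_fundamental=False):
    """Return partial frequencies in Hz (hypothetical helper).

    Uses the stiff-string model f_n = n * f0 * sqrt(1 + B * n^2), where B is
    the inharmonicity coefficient. B = 0 gives an exactly harmonic series.
    """
    first = 2 if skip_fundamental else 1
    return [
        n * f0 * math.sqrt(1.0 + inharmonicity * n * n)
        for n in range(first, n_partials + 1)
    ]


def synthesize(freqs, duration=0.1, sample_rate=44100):
    """Sum equal-amplitude sinusoids, scaled so the peak cannot exceed 1."""
    n_samples = int(round(duration * sample_rate))
    return [
        sum(math.sin(2.0 * math.pi * f * t / sample_rate) for f in freqs) / len(freqs)
        for t in range(n_samples)
    ]


# A 220 Hz tone with no energy at 220 Hz itself.
missing_fundamental = partial_frequencies(220.0, 4, skip_fundamental=True)

# The same tone with slightly stretched partials (B is an arbitrary example value).
stretched = partial_frequencies(220.0, 4, inharmonicity=0.001)
```

A listener would still be expected to hear the first tone near 220 Hz, so an estimator that simply reports the lowest spectral peak would be off by an octave on such a signal.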
Access
The database will be available in early 1998 at http://www.cnmat.berkeley.edu/Research/Pitch.
You will be able to submit your own files to this database by filling in
a form at the site. This form represents a contract that establishes you
as the owner of the rights to the submitted files and grants permission
for their analysis and redistribution.
AIFF is the chosen format for sound file samples, and SDIF
for analyses of these files. The SDIF pitch frame type allows for a weighted
set of pitches, which accommodates virtual pitches for inharmonic sounds and
the management of multiple pitch estimates.
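The idea of a weighted set of pitches per analysis frame can be sketched with a small in-memory structure. This is not the SDIF binary layout or any SDIF library API. The class and field names below are hypothetical and only show how several competing estimates, for example two octave candidates, might be carried together with their weights.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PitchCandidate:
    """One pitch hypothesis and its relative weight (hypothetical type)."""
    frequency_hz: float
    weight: float


@dataclass
class PitchFrame:
    """All pitch hypotheses for one analysis time (hypothetical type)."""
    time_s: float
    candidates: List[PitchCandidate] = field(default_factory=list)

    def normalized(self) -> "PitchFrame":
        """Return a copy whose weights sum to 1; empty if total weight is not positive."""
        total = sum(c.weight for c in self.candidates)
        if total <= 0:
            return PitchFrame(self.time_s, [])
        scaled = [PitchCandidate(c.frequency_hz, c.weight / total) for c in self.candidates]
        return PitchFrame(self.time_s, scaled)

    def strongest(self) -> Optional[PitchCandidate]:
        """Return the highest-weighted candidate, or None for an unpitched frame."""
        return max(self.candidates, key=lambda c: c.weight, default=None)


# An ambiguous-octave frame: two candidates an octave apart.
frame = PitchFrame(0.0, [PitchCandidate(110.0, 1.0), PitchCandidate(220.0, 3.0)])
best = frame.strongest()
```

Keeping every candidate, rather than only the strongest, lets an evaluation count an estimate as acceptable when it matches any sufficiently weighted pitch in the reference frame.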
Database Overview
Wind
- Shakuhachi
- Organ Flue Pipes
- Suling
- Didgeridoo and Stick
- Clarinet
- Bass Clarinet
Singing
- Indian
- Bel Canto
- Western Popular
- Tibetan
String
For these string sounds a wide range of playing techniques was used,
including: open strings, low and high stopped, low and high frequency vibrato,
narrow and wide trill, timbre change, sul ponticello, glissandi, tremolo
near and away from bridge, pizzicato, pizzicato stopped, slow bow change, harmonics,
damped harmonics, hammer-ons and pull-offs, picked, left and right hand damping,
slaps, bottleneck slide and pops.
Brass
Percussion
Analysis
In parallel with the archival activity assembling this database, we are
exploring automatic segmentation and parameter estimation tools to develop
analyses of the sounds against which algorithms may be judged. Early results
using a wavelet technique are very promising. The wavelet method identifies
each pitch period and provides a "voiced/unvoiced" estimate. Combining this
with energy-based techniques yields good estimates for the pitched regions
of a phrase. The estimator is robust to impulsive and continuous noise.
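As a rough illustration of combining a voiced/unvoiced decision with an energy measure, the sketch below gates per-frame voicing flags with frame RMS energy. It does not implement the wavelet method, whose details are not given here. The voicing flags are taken as an input, and the function names, the non-overlapping framing, and the threshold value are assumptions.

```python
import math


def frame_rms(samples, frame_size):
    """RMS energy of consecutive non-overlapping frames (trailing partial frame dropped)."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_size]) / frame_size)
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def pitched_regions(voiced_flags, rms_values, rms_threshold):
    """Mark a frame as pitched only if it is flagged voiced AND loud enough."""
    return [
        bool(voiced) and rms >= rms_threshold
        for voiced, rms in zip(voiced_flags, rms_values)
    ]


# Four silent samples followed by four full-scale samples, in frames of four.
samples = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
energies = frame_rms(samples, 4)

# Suppose some other detector flagged both frames as voiced (assumed input).
mask = pitched_regions([True, True], energies, rms_threshold=0.1)
```

The energy gate rejects the silent frame even though it was flagged voiced, which is the kind of complementary check the combination is meant to provide.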
Future Work
- A set of artificially synthesized test signals
- Psychoacoustic experimental harness to develop perceptually solid pitch
estimates
- Objective measures of pitch estimate accuracy
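One possible shape for an objective accuracy measure is sketched below: the signed error in cents between an estimate and a reference pitch, and the fraction of frames whose error exceeds a tolerance. This is an assumption about what such a measure could look like, not a measure defined by this work. The function names and the 50-cent default tolerance are chosen only for illustration.

```python
import math


def cents_error(estimate_hz, reference_hz):
    """Signed pitch error in cents; 1200 cents is one octave."""
    return 1200.0 * math.log2(estimate_hz / reference_hz)


def gross_error_rate(estimates_hz, references_hz, tolerance_cents=50.0):
    """Fraction of frames whose absolute error exceeds the tolerance."""
    pairs = list(zip(estimates_hz, references_hz))
    if not pairs:
        raise ValueError("at least one estimate/reference pair is required")
    n_bad = sum(1 for est, ref in pairs if abs(cents_error(est, ref)) > tolerance_cents)
    return n_bad / len(pairs)


# An octave error against a 440 Hz reference.
octave_error = cents_error(880.0, 440.0)

# One correct frame and one octave error.
rate = gross_error_rate([440.0, 880.0], [440.0, 440.0])
```

Measuring in cents rather than hertz keeps the tolerance musically uniform across the pitch range, so a bass clarinet and a suling can be scored on the same scale.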
Acknowledgement
This work assembles materials developed over many years of work with
support from:
- California State Dept. of Commerce
- Zeta Music Inc.
- Gibson Guitar Inc.
- Silicon Graphics Inc.
- Apple Computer Inc.