Sound Sampling, Analysis, and Recognition

Signal Recognition YouTube Demonstration

We've talked a lot about creating sound from sine waves. But can't we just record a sound? YES, we can do that too! I wrote a program that can sample bulk sound data in one recording session.

When we analyze sound data, it helps to switch out of the time domain (energy vs. time) and into the frequency domain (energy vs. frequency). The time domain still carries important information, but the frequency domain arguably carries more. So my program automatically analyzes each separate sample and saves its frequency information for later retrieval.

But what is the frequency domain? Frequency is how many times a wave oscillates back and forth per second, and the frequency domain shows how much energy a sound has at each frequency. A wave can oscillate at many frequencies at the same time. It's kind of like the ocean: the big waves go back and forth, but they have smaller waves rippling on them, and even smaller waves rippling on those!

We transform to the frequency domain because it describes the breaking waves concisely, rather than trying to look at the wave's behavior in time, which often makes no apparent sense.
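To make the ocean picture concrete, here is a minimal Python sketch (not my MATLAB program). It builds a signal out of one big wave and two smaller ripples, then runs a plain discrete Fourier transform on it. The sample rate, the three frequencies, and the name `magnitude_spectrum` are all made up for illustration, and a real program would use a fast FFT routine instead of this slow loop.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Direct (slow) DFT: return the magnitude of bins 0 .. N/2, scaled by 1/N."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = 0j
        for t, x in enumerate(samples):
            total += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(total) / n)
    return spectrum

# Assumed values: 64 samples per second for exactly one second,
# so bin k of the spectrum corresponds to k Hz.
SAMPLE_RATE = 64
N = 64

# A big 4 Hz wave, a smaller 10 Hz ripple, and an even smaller 20 Hz ripple.
signal = [
    1.00 * math.sin(2 * math.pi * 4 * t / SAMPLE_RATE)
    + 0.50 * math.sin(2 * math.pi * 10 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 20 * t / SAMPLE_RATE)
    for t in range(N)
]

spectrum = magnitude_spectrum(signal)

# Keep only the bins that hold a noticeable amount of energy.
peaks = [k for k, m in enumerate(spectrum) if m > 0.1]
print(peaks)  # [4, 10, 20]
```

In time, the 64 samples look like a messy wiggle. In frequency, the whole signal boils down to three peaks, one per wave that went into it.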

And once we are in the concise frequency domain, we can write MATLAB programs to accurately RECOGNIZE sounds. We can distinguish multiple instruments playing the exact same note. We use a few different methods. Common methods of likeness analysis are: dot product score, standard deviation score, and average frequency score. We can score according to all of them combined, or possibly add further layers, like volume attack likeness, although I haven't experimented with more than 3 layers at once. Further research would need to be done on how to layer the scores for the most accurate recognition. Currently I use standard deviation alone.
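Here is one possible reading of the three likeness scores, sketched in Python rather than MATLAB. The exact formulas are my assumptions for illustration: the dot product score is taken as cosine similarity, the standard deviation score as the root-mean-square difference between the two normalized spectra, and the average frequency score as the gap between the energy-weighted average bins. The instrument spectra are invented numbers, not measurements.

```python
import math

def normalize(spectrum):
    """Scale a spectrum to unit length so loudness does not affect the score."""
    norm = math.sqrt(sum(m * m for m in spectrum))
    return [m / norm for m in spectrum] if norm > 0 else list(spectrum)

def dot_product_score(a, b):
    """Assumed form: cosine similarity. Higher means more alike (1.0 = same shape)."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def std_dev_score(a, b):
    """Assumed form: RMS difference of normalized spectra. Lower means more alike."""
    na, nb = normalize(a), normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)) / len(na))

def average_frequency(spectrum):
    """Energy-weighted average bin index of a spectrum."""
    return sum(k * m for k, m in enumerate(spectrum)) / sum(spectrum)

def average_frequency_score(a, b):
    """Assumed form: gap between the two average frequencies. Lower means more alike."""
    return abs(average_frequency(a) - average_frequency(b))

# Hypothetical stored spectra: same note (fundamental in bin 2),
# but different strengths in the harmonics (bins 4 and 6).
references = {
    "flute":  [0, 0, 1.0, 0, 0.2, 0, 0.05, 0],
    "violin": [0, 0, 1.0, 0, 0.8, 0, 0.60, 0],
}
unknown = [0, 0, 0.9, 0, 0.25, 0, 0.05, 0]

# Recognize by standard deviation alone: pick the closest stored spectrum.
best = min(references, key=lambda name: std_dev_score(unknown, references[name]))
print(best)  # flute
```

Both instruments share the same fundamental, so it is the harmonic pattern that tells them apart. Layering the scores would mean combining the three numbers somehow, and how to weight them is the open question.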

Other ways to use the methods:

What if we weren't trying to measure exact likeness? What if we wanted to determine whether a C note exists in some complex musical chord from a Bach piece? We would use the dot product. It rewards us if the C exists in the chord and, better, does NOT punish us for the ways the two signals are unalike. Standard deviation would punish us. If you wanted to make a game where a player needs to whistle a note as in tune as possible, you might run an average frequency score along with a variant of a range score over the player's whole turn to determine how much they sucked at whistling.
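Both ideas can be sketched in a few lines of Python. Everything here is a simplified assumption: the "spectra" are toy 12-bin lists with one energy value per pitch class, the deviation is a plain root-mean-square difference, and the range score is simply highest minus lowest whistled frequency. The function names and the whistled frequencies are made up.

```python
import math

# Toy spectra: index 0 = C, 1 = C#, 2 = D, ... 11 = B.
C_NOTE  = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
C_MAJOR = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]  # C, E, G
D_MINOR = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # D, F, A

def dot_product(a, b):
    """Only bins where BOTH spectra have energy contribute to the sum."""
    return sum(x * y for x, y in zip(a, b))

def rms_deviation(a, b):
    """Every bin where the spectra differ adds to the penalty."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def whistle_scores(frequencies_hz, target_hz):
    """Return (how far the average pitch is from the target, how much the pitch wandered)."""
    average = sum(frequencies_hz) / len(frequencies_hz)
    return abs(average - target_hz), max(frequencies_hz) - min(frequencies_hz)

print(dot_product(C_NOTE, C_MAJOR))        # 1  (the C is in the chord)
print(dot_product(C_NOTE, D_MINOR))        # 0  (no C here)
print(rms_deviation(C_NOTE, C_MAJOR) > 0)  # True (punished for the E and G)

# One player's turn, aiming for 440 Hz.
print(whistle_scores([436.0, 441.0, 444.0, 439.0], 440.0))  # (0.0, 8.0)
```

The extra E and G in the chord leave the dot product untouched but push the deviation above zero, which is exactly the punishment we want to avoid. In the whistling example the player averages dead on target, so it takes the range score to catch the 8 Hz of wobble.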