I'm working on a simple web app that allows the user to tune his/her guitar. I'm a real beginner in signal processing, so don't judge too hard if my question is inappropriate.

So, I managed to get the fundamental frequency using an FFT algorithm, and at this point the application is more or less functional. However, there is room for improvement. Right now I send raw PCM to the FFT algorithm, but I was thinking that there may be some pre- or post-processing algorithms or filters that could improve the detection. Can you suggest any?

My main problem is that when it detects a certain frequency, it shows that frequency for 1-2 seconds, then jumps to other seemingly random frequencies, comes back again, and so on, even if the sound is continuous.

I'm also interested in any other type of optimization if one has experience with such things.


I'm guessing the other frequencies it gets are harmonics of the fundamental? For example, you're playing 100 Hz and it picks out 200 Hz or 300 Hz instead? First, you should limit your search space to the frequencies that a guitar is likely to produce. Find the highest fundamental you're likely to need and ignore everything above it.
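As a rough illustration of limiting the search space, here is a minimal sketch that only evaluates the DFT bins inside a band around the open-string fundamentals. The band edges (70 to 350 Hz), the function names, and the direct per-bin DFT are my own assumptions for a self-contained example; a real tuner would more likely run an FFT and slice the result.

```python
import cmath
import math

# Open-string fundamentals in standard tuning, in Hz.
GUITAR_STRINGS = {"E2": 82.41, "A2": 110.00, "D3": 146.83,
                  "G3": 196.00, "B3": 246.94, "E4": 329.63}

def bin_range(f_lo, f_hi, sample_rate, n):
    """Return (first, last) DFT bin indices that cover f_lo..f_hi."""
    resolution = sample_rate / n  # Hz per bin
    first = max(1, math.floor(f_lo / resolution))
    last = min(n // 2, math.ceil(f_hi / resolution))
    return first, last

def peak_in_band(samples, sample_rate, f_lo=70.0, f_hi=350.0):
    """Frequency (Hz) of the strongest DFT bin inside the band (assumed limits)."""
    n = len(samples)
    first, last = bin_range(f_lo, f_hi, sample_rate, n)
    best_bin, best_mag = first, -1.0
    for k in range(first, last + 1):
        # Direct DFT of a single bin; bins outside the band are never computed.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n
```

With a test signal whose 400 Hz component is twice as strong as its 100 Hz component, an unrestricted peak search would report 400 Hz, while the band-limited search reports 100 Hz because 400 Hz lies outside the band.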

Autocorrelation will work better than picking the FFT peak at finding the fundamental if the fundamental is lower in amplitude than the harmonics (or missing altogether, but that's not an issue with guitar).
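To make the autocorrelation idea concrete, here is a minimal time-domain sketch. It is not the code from this answer; the function name and the 70 to 350 Hz lag limits are assumptions, and the plain O(n × lags) loop is only meant for illustration.

```python
import math

def autocorr_pitch(samples, sample_rate, f_lo=70.0, f_hi=350.0):
    """Estimate the fundamental (Hz) from the strongest autocorrelation lag.

    Only lags whose period falls between f_hi and f_lo (assumed limits)
    are searched, which also restricts the result to the guitar range.
    """
    n = len(samples)
    min_lag = int(sample_rate / f_hi)
    max_lag = min(int(sample_rate / f_lo), n - 1)
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        val = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag
```

For a signal containing only 200, 300 and 400 Hz (the harmonics of a missing 100 Hz fundamental), the waveform still repeats every 1/100 s, so the strongest lag corresponds to 100 Hz even though no spectral peak exists there. The resolution is limited to whole-sample lags, so some interpolation between lags would be needed for tuner-grade accuracy.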

You can also try weighting the lower frequencies to emphasize the fundamental and minimize harmonics, or use a peak-picking algorithm like this and then just choose the lowest in frequency.

Also, you should be windowing your signal before applying the FFT. You just multiply it by a window function, which tapers off the beginning and end of the waveform to make the frequency spectrum cleaner. Then you get tall narrow spikes for frequency components instead of broad ones.
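A minimal sketch of the windowing step, using the standard Hamming coefficients (0.54 and 0.46). The function names are hypothetical; the windowed samples would then be passed to whatever FFT routine the app already uses.

```python
import math

def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def apply_window(samples, window):
    """Multiply the signal by the window, sample by sample."""
    return [x * w for x, w in zip(samples, window)]
```

The window tapers to 0.08 at both ends and reaches 1.0 in the middle. The trade-off is lower leakage into distant bins at the cost of a somewhat wider main lobe around each peak.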

You can also use interpolation to get a more accurate peak. Take the log of the spectrum, then fit a parabola to the peak and the two neighboring points, and find the parabola's true peak. You might not need this much accuracy, though.
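The interpolation step can be sketched as follows, assuming `mags` is a list of strictly positive spectrum magnitudes and `k` is the index of the highest bin (not the first or last one). The function names are my own.

```python
import math

def parabolic_peak(mags, k):
    """Refine peak bin k by fitting a parabola to the log magnitudes
    of bins k-1, k and k+1. Returns a fractional bin index."""
    a = math.log(mags[k - 1])
    b = math.log(mags[k])
    c = math.log(mags[k + 1])
    denom = a - 2 * b + c
    if denom == 0:
        return float(k)  # the three points are collinear, nothing to refine
    return k + 0.5 * (a - c) / denom

def bin_to_hz(fractional_bin, sample_rate, n_fft):
    """Convert a (possibly fractional) bin index to a frequency in Hz."""
    return fractional_bin * sample_rate / n_fft
```

The correction is always within half a bin of `k` when `k` really is the local maximum, so the result can be fed straight into `bin_to_hz`.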

Here is my example Python code for all of this.

  • This is what I was looking for, very good answer, thank you! – Valentin Radu Oct 13 '11 at 13:54
  • Multiplying by a window function that is tapered will actually smear out any spectral lines in your signal, thereby making them broader. What it can buy you, though, is dynamic range, allowing you to identify, for example, a very low-power spectral line in the presence of a high-power interfering tone. – Jason R Oct 13 '11 at 14:10
  • @JasonR given that this is designed to work in an environment in which the probability of high-power interfering tones is really low, do you suggest that it is better not to use a Hamming window? – Valentin Radu Oct 13 '11 at 14:32
  • I can confirm that using a Hamming window got me closer to my goal of keeping the readings steady. Right now, when I play an A4 I get 440 Hz most of the time and only very rarely get a reading like 650 Hz or so. I'm guessing those are harmonics? Also, I couldn't help noticing that the app works flawlessly for higher frequencies and starts to fail for lower ones. Probably because I'm using the FFT to detect the peak-magnitude frequency bin, and for lower frequencies that's not always the fundamental? – Valentin Radu Oct 13 '11 at 15:53
  • @mindnoise: 660 Hz is not a harmonic of 440 Hz, but it is a harmonic of 220 Hz, or a perfect fifth above 440. Could be another string resonating or distortion or something? It's a lot easier to figure out issues like this if you can plot the FFT and look at it. Yes, the low frequencies might be filtered and reduced relative to the higher ones, either by mechanical effects or by your analog circuitry. – endolith Oct 13 '11 at 15:56

Pitch is not the same as the peak-magnitude frequency bin of an FFT. Pitch is a human psychoacoustic phenomenon. A pitched sound could have a missing or very weak fundamental (common in some voice, piano and guitar sounds) and/or lots of powerful overtones in its spectrum that overwhelm the pitch frequency, yet still be heard as that pitch by a human. So any FFT peak frequency detector (even one including some windowing and interpolation) will not be a robust method of pitch estimation.

This Stack Overflow question includes a list of some alternative methods of estimating pitch that might produce better results.

ADDED: If you are doing this for guitar sounds, note that the lowest guitar strings may actually produce slightly inharmonic overtones, making pitch estimation even more difficult, as the human ear may hear a pitch frequency more closely related to sub-multiples of the overtones, rather than to the actual fundamental vibration frequency of the string.

ADDED #2: This gets asked so often that I wrote up a longer blog post on the topic: http://www.musingpaw.com/2012/04/musical-pitch-is-not-just-fft-frequency.html

  • just visited (and commented in) the blog that you just referred us to. – robert bristow-johnson Mar 14 '16 at 4:43

I spent many years researching pitch detection on polyphonic music -- like detecting the notes of a guitar solo within an MP3 recording. I also wrote a section on Wikipedia which gives a brief description of the process (look at the "Pitch detection" subsection in the link below).

When a single key is pressed upon a piano, what we hear is not just one frequency of sound vibration, but a composite of multiple sound vibrations occurring at different mathematically related frequencies. The elements of this composite of vibrations at differing frequencies are referred to as harmonics or partials. For instance, if we press the Middle C key on the piano, the individual frequencies of the composite's harmonics will start at 261.6 Hz as the fundamental frequency, 523 Hz would be the 2nd Harmonic, 785 Hz would be the 3rd Harmonic, 1046 Hz would be the 4th Harmonic, etc. The later harmonics are integer multiples of the fundamental frequency, 261.6 Hz ( ex: 2 x 261.6 = 523, 3 x 261.6 = 785, 4 x 261.6 = 1046 ).
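The arithmetic above is easy to reproduce. The sketch below lists the first harmonics of a fundamental and checks whether a measured peak sits close to an integer multiple of it; the function names and the 1% tolerance are assumptions for illustration.

```python
def harmonics(fundamental, count):
    """First `count` harmonics (integer multiples) of a fundamental, in Hz."""
    return [n * fundamental for n in range(1, count + 1)]

def harmonic_number(freq, fundamental, tolerance=0.01):
    """Return n if freq is within `tolerance` (relative) of n * fundamental,
    otherwise None. The 1% default is an arbitrary illustrative choice."""
    n = round(freq / fundamental)
    if n < 1:
        return None
    deviation = abs(freq - n * fundamental) / (n * fundamental)
    return n if deviation <= tolerance else None
```

For Middle C, `harmonics(261.6, 4)` gives 261.6, 523.2, 784.8 and 1046.4 Hz (up to floating-point rounding), which the text above rounds to 523, 785 and 1046.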

I use a modified DFT Logarithmic Transform to first detect the possible harmonics by looking for frequencies with peak levels (see diagram below). Because of the way that I gather data for my modified Log DFT, I do NOT have to apply a windowing function to the signal, nor do I have to overlap and add. And I have created the DFT so that its frequency channels are logarithmically spaced, in order to align directly with the frequencies where harmonics are created by the notes on a guitar, saxophone, etc.
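To give a feel for what logarithmically spaced channels look like, here is a generic sketch. It is not the modified Log DFT described above (which lives in the linked source code); it simply places one channel per semitone and correlates the signal with a complex exponential at each channel frequency. All names and the 12-per-octave spacing are assumptions.

```python
import cmath
import math

def log_channels(f_min, n_channels, per_octave=12):
    """Channel centre frequencies spaced per_octave to the octave from f_min."""
    return [f_min * 2 ** (k / per_octave) for k in range(n_channels)]

def channel_magnitude(samples, sample_rate, freq):
    """Magnitude of the signal's correlation with a complex tone at freq,
    normalized by the number of samples (rectangular window)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / sample_rate)
              for i, x in enumerate(samples))
    return abs(acc) / len(samples)
```

Starting at 110 Hz with 37 channels covers A2 to A5 in semitone steps, so a 440 Hz tone lands exactly on channel 24 rather than between linearly spaced FFT bins.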

Now that I am retired, I have decided to release the source code for my pitch detection engine within a free demonstration app called PitchScope Player. PitchScope Player is available on the web, and you can download the executable for Windows to see my algorithm at work on an MP3 file of your choosing. The GitHub link below will lead you to my full source code, where you can view how I detect the harmonics with a custom Logarithmic DFT transform and then look for partials (harmonics) whose frequencies satisfy the correct integer relationship which defines a 'pitch'.

My Pitch Detection Algorithm is actually a two-stage process:

a) First, the ScalePitch is detected ('ScalePitch' has 12 possible pitch values: {E, F, F#, G, G#, A, A#, B, C, C#, D, D#}).

b) After the ScalePitch is determined, the Octave is calculated by examining all the harmonics for the 4 possible Octave-Candidate notes.

The algorithm is designed to detect the most dominant pitch (a musical note) at any given moment in time within a polyphonic MP3 file. That usually corresponds to the notes of an instrumental solo. Those interested in the C++ source code for my two-stage Pitch Detection algorithm might want to start at the Estimate_ScalePitch() function within the SPitchCalc.cpp file at GitHub.com.
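The split of a note into one of 12 pitch names plus an octave can be illustrated with the usual equal-temperament mapping. This is not the two-stage detector itself, only the naming convention it targets, and it assumes A4 = 440 Hz and scientific pitch notation. The function name is hypothetical; the cents offset is included because it is what a tuner display typically needs.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq, a4=440.0):
    """Return (pitch name, octave, cents offset) of the note nearest to freq.

    Assumes 12-tone equal temperament with A4 = a4 Hz.
    """
    midi = round(69 + 12 * math.log2(freq / a4))  # MIDI note number
    name = NOTE_NAMES[midi % 12]
    octave = midi // 12 - 1
    target = a4 * 2 ** ((midi - 69) / 12)         # exact frequency of that note
    cents = 1200 * math.log2(freq / target)       # positive means sharp
    return name, octave, cents
```

For example, 82.41 Hz maps to E in octave 2 (the low E string) and 261.6 Hz maps to C in octave 4 (Middle C).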

https://github.com/CreativeDetectors/PitchScope_Player

https://en.wikipedia.org/wiki/Transcription_(music)#Pitch_detection

Below is the image of a Logarithmic DFT (created by my C++ software) for 3 seconds of a guitar solo on a polyphonic mp3 recording. It shows how the harmonics appear for individual notes on a guitar, while playing a solo. For each note on this Logarithmic DFT we can see its multiple harmonics extending vertically, because each harmonic will have the same time-width. After the Octave of the note is determined, then we know the frequency of the Fundamental.

[Image: Logarithmic DFT of 3 seconds of a guitar solo]

The diagram below demonstrates the Octave Detection algorithm which I developed to pick the correct Octave-Candidate note (that is, the correct Fundamental), once the ScalePitch for that note has been determined. Those wishing to see that method in C++ should go to the Calc_Best_Octave_Candidate() function inside the file called FundCandidCalcer.cpp, which is contained in my source code at GitHub.

[Image: diagram of the Octave Detection algorithm]

  • James, does your DFT pitch detector detect notes with a missing (or weak) fundamental? – robert bristow-johnson Jul 6 '16 at 0:28
  • Yes, my 2 Stage Pitch Detection algorithm will detect notes, even if the signal has a "missing (or weak) fundamental" -- that is a big strength of this 2 stage process. The Fundamental is determined in the second stage when Octave Detection is performed on the time-widths that you see for notes on the Logarithmic DFT diagram. Since this Pitch Detection function works within the confusion of a polyphonic mp3 signal, it will detect notes that are missing many harmonics, including the Fundamental. I have just added to this Answer a second diagram which explains my Octave Detection algorithm. – James Paul Millard Jul 6 '16 at 1:05
