Simple pitch detection in Python with CREPE

Today’s task is to detect the pitch of a sound sample using CREPE in Python.

According to the documentation, we can install it with pip:

pip install --upgrade tensorflow
pip install crepe

Now let’s test CREPE. It can be used

  • via command line, or
  • from inside a Python script.

Let’s start with the command-line option, which is simpler.

For this, we’ll need a sound for CREPE to analyse, so it can give us its pitch. We’ll use a recording of the A string of a violin being plucked (recording process details).

Now back to CREPE. When we call CREPE from the command line, we expect its default behavior: it analyzes the audio at intervals of 0.01 s and lists the detected pitch for each of these intervals.

The A string was tuned to 440 Hz, and the recording is approximately 0.20 s long, so we expect to see approximately 20 readings with a value near 440 Hz.
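The estimate of 20 readings is just the duration divided by the analysis step. Here is a minimal sketch of that arithmetic. The helper name expected_readings is made up for illustration, and the count is only approximate, since the exact number depends on how CREPE pads and frames the audio.

```python
def expected_readings(duration_s, step_s=0.01):
    """Rough number of pitch readings for a recording of the given length."""
    # Hypothetical helper: one reading per analysis step, rounded to the
    # nearest integer to sidestep floating-point noise in the division.
    return round(duration_s / step_s)


print(expected_readings(0.20))
```

A recording slightly longer than 0.20 s would give a couple of extra readings, so a count of 20 to 22 rows is what to look for.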

Running the command

crepe ~/violin-A-pluck.wav

produces the file ~/violin-A-pluck.f0.csv, whose contents are shown in the table below.

time frequency confidence
0.000 436.514 0.554288
0.010 433.468 0.786607
0.020 434.161 0.873717
0.030 434.659 0.909991
0.040 435.399 0.941885
0.050 435.588 0.929651
0.060 436.597 0.918269
0.070 437.151 0.897561
0.080 438.386 0.858296
0.090 438.583 0.814198
0.100 437.535 0.728422
0.110 435.870 0.792426
0.120 434.972 0.855074
0.130 434.905 0.845543
0.140 434.940 0.876631
0.150 435.538 0.876600
0.160 436.607 0.902731
0.170 438.053 0.897667
0.180 438.470 0.877130
0.190 438.663 0.904547
0.200 438.581 0.902713
0.210 437.814 0.925457
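Reading 22 rows by eye works, but a short script can condense them into one number. The sketch below averages the frequency over the rows CREPE is reasonably sure about. It assumes the CSV header is time,frequency,confidence as in the table above. The function name mean_frequency and the 0.8 confidence threshold are arbitrary choices for illustration, not part of CREPE.

```python
import csv
import io

# A few rows copied from the table above, inlined so the sketch runs on its
# own; in practice one would open ~/violin-A-pluck.f0.csv instead.
SAMPLE = """time,frequency,confidence
0.000,436.514,0.554288
0.010,433.468,0.786607
0.020,434.161,0.873717
0.030,434.659,0.909991
"""


def mean_frequency(csv_file, min_confidence=0.8):
    """Average the frequency over rows at or above a confidence threshold."""
    kept = [float(row["frequency"])
            for row in csv.DictReader(csv_file)
            if float(row["confidence"]) >= min_confidence]
    if not kept:
        raise ValueError("no readings at or above the confidence threshold")
    return sum(kept) / len(kept), len(kept)


mean_hz, count = mean_frequency(io.StringIO(SAMPLE))
print(f"{count} readings kept, mean {mean_hz:.3f} Hz")
```

Dropping the low-confidence rows mostly removes the very first readings, where the pluck has barely started.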

We see that the detected frequencies are not exactly 440 Hz. This could be due to several reasons, including

  • the tuner used to tune the string is imperfect
  • the person tuning the string (yours truly) didn’t do a perfect job
  • the recording conditions were not perfect
  • the broadcasting and recording steps introduced imperfections of their own

among several other potential imperfections during the entire process!
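To put "not exactly 440 Hz" on a musical scale, the gap can be expressed in cents, where 100 cents make one equal-tempered semitone. The sketch below uses the standard formula 1200 · log2(f / f_ref). The function name cents_from is made up for illustration.

```python
import math


def cents_from(frequency_hz, reference_hz=440.0):
    """Signed distance from the reference pitch, in cents (100 per semitone)."""
    return 1200.0 * math.log2(frequency_hz / reference_hz)


# Lowest and highest readings from the table above.
for f in (433.468, 438.663):
    print(f"{f:.3f} Hz is {cents_from(f):+.1f} cents from 440 Hz")
```

By this measure the readings sit between roughly 5 and 26 cents flat, that is, within about a quarter of a semitone of the target.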

Either way, this small test shows us that CREPE works on this machine from the command line! That was the goal for this step, so let’s move on…

Now let’s test calling CREPE from inside a Python script. Using the sample code from the documentation, with some minor tweaks, we have:

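import crepe
import numpy
from os.path import expanduser
from scipy.io import wavfile

# Load the recording: sr is the sample rate in Hz, audio holds the samples.
sr, audio = wavfile.read(expanduser('~/violin-A-pluck.wav'))

# Estimate the pitch; viterbi=True asks CREPE to smooth the pitch curve.
time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)

# Save the three columns as CSV, one row per analysis frame.
a = numpy.column_stack((time, frequency, confidence))
numpy.savetxt(expanduser('~/violin-A-pluck.f0-python.csv'), a,
              fmt=['%.3f', '%.3f', '%.6f'],
              header='time,frequency,confidence', delimiter=',')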

This produces a second file, ~/violin-A-pluck.f0-python.csv, with the following contents.

time frequency confidence
0.000 436.514 0.554288
0.010 433.468 0.786607
0.020 434.161 0.873717
0.030 434.659 0.909991
0.040 435.399 0.941885
0.050 435.588 0.929651
0.060 436.597 0.918269
0.070 437.151 0.897561
0.080 438.386 0.858296
0.090 438.583 0.814198
0.100 437.535 0.728422
0.110 435.870 0.792426
0.120 434.972 0.855074
0.130 434.905 0.845543
0.140 434.940 0.876631
0.150 435.538 0.876600
0.160 436.607 0.902731
0.170 438.053 0.897667
0.180 438.470 0.877130
0.190 438.663 0.904547
0.200 438.581 0.902713
0.210 437.814 0.925457

Did we get the same readings as before? To find out, we could go through the two tables cell by cell… or we could subtract one from the other. Since we expect CREPE to do the same thing whether it is called from the command line or from inside Python, both tables should hold the same values. If they are indeed equal, subtracting one from the other will give only zeros. Let’s try that.

from os.path import expanduser
import numpy

# Load both CSV files, skipping the header row of each. Passing the path
# lets numpy open and close the file itself.
a = numpy.loadtxt(expanduser('~/violin-A-pluck.f0.csv'),
                  delimiter=',', skiprows=1)
b = numpy.loadtxt(expanduser('~/violin-A-pluck.f0-python.csv'),
                  delimiter=',', skiprows=1)

# Write the element-wise difference; identical inputs give all zeros.
numpy.savetxt(expanduser('~/violin-A-pluck-difference.csv'), a - b,
              fmt=['%.3f', '%.3f', '%.6f'],
              header='time,frequency,confidence', delimiter=',')

Aaand we get…

time frequency confidence
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000
0.000 0.000 0.000000

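… the all zeros we expected. This suggests that CREPE does the same thing whether it is called from the command line or from inside a Python script. One caveat: the script passes viterbi=True, while the command-line tool applies Viterbi smoothing only when given the --viterbi flag, so for other recordings it is safest to use the same setting on both sides before comparing.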
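Writing the difference to a file and scanning it for non-zero cells is fine for 22 rows, but the same check can be reduced to a single yes or no. The sketch below uses numpy.array_equal and numpy.allclose for that. The two inline rows stand in for the loaded files, and the 1e-6 tolerance is an assumption chosen to match the six decimals written to the CSV.

```python
import numpy

# Two rows copied from the tables above stand in for the loaded CSV files;
# in practice a and b would come from numpy.loadtxt as in the script above.
a = numpy.array([[0.000, 436.514, 0.554288],
                 [0.010, 433.468, 0.786607]])
b = a.copy()

# A shape mismatch (for example, one file with an extra row) would make
# a - b fail or broadcast, so it is worth checking explicitly first.
same_shape = a.shape == b.shape
print(same_shape)

# array_equal demands exact equality; allclose tolerates tiny rounding noise.
print(numpy.array_equal(a, b))
print(numpy.allclose(a, b, rtol=0.0, atol=1e-6))
```

For files written with a fixed number of decimals, exact equality is usually enough; the tolerance version is handy when comparing values that never went through a text file.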

Published by bits4waves

Software in harmony with your melody
