Also try using the wavelet transform instead of the short-time FFT (with overlapping windows).
It is easier to configure (fewer parameters, no need for a window function), offers more flexibility (exponentially spaced frequency bands, e.g. for musical scales), and can reach the Gabor-Heisenberg uncertainty limit without artifacts.
The only downside is that you need to know the entire signal in advance, so it can only be used for recordings.
Shameless self-promo of my implementation: https://github.com/Lichtso/CCWT
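To give an idea of what I mean by exponentially spaced frequency bands, here's a rough JS sketch (the names are mine, not CCWT's API). With 12 bins per octave, each bin lines up with a semitone:

    // Log-spaced center frequencies: `binsPerOctave` bands per octave,
    // starting at `fMin` Hz -- 12 per octave lines up with semitones.
    function logSpacedFrequencies(fMin, nBins, binsPerOctave) {
      const freqs = new Float64Array(nBins);
      for (let k = 0; k < nBins; k++) {
        freqs[k] = fMin * Math.pow(2, k / binsPerOctave);
      }
      return freqs;
    }

    // e.g. 8 octaves above 27.5 Hz (A0), one band per semitone:
    const centers = logSpacedFrequencies(27.5, 8 * 12, 12);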
Your repo and tutorial are really cool! But do you have any sort of interactive "out-of-the-box" demo that doesn't require me to write code to call the library (online or downloadable)?
I'm more familiar with Fourier transforms and have limited experience with wavelets. But if each wavelet intrinsically falls off like a Gaussian curve, cutting it off (possibly with a window) at 3-4 sigmas won't change the wavelet substantially. Maybe for some use cases the wavelet will be narrower at high frequencies (short delay) and wider at low frequencies (high delay). I don't know how you'd perform incremental updates of a plot drawn with non-uniform delays, though...
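To make the truncation concrete, here's the kind of thing I mean (a sketch with my own naming, not from any library): a complex Morlet-style wavelet with a Gaussian envelope, cut off at a few sigmas:

    // Complex Morlet-style wavelet sampled at `sampleRate`, centered on
    // `freq` Hz, Gaussian envelope cut off at +/- `nSigmas` sigma.
    // Higher `freq` => smaller sigma => shorter wavelet (the "short delay").
    function truncatedMorlet(freq, sampleRate, cyclesPerSigma, nSigmas) {
      const sigma = cyclesPerSigma / freq;          // envelope width in seconds
      const half = Math.round(nSigmas * sigma * sampleRate);
      const re = new Float64Array(2 * half + 1);
      const im = new Float64Array(2 * half + 1);
      for (let i = 0; i < re.length; i++) {
        const t = (i - half) / sampleRate;
        const env = Math.exp(-0.5 * (t / sigma) ** 2); // ~ e^-8 at 4 sigma
        re[i] = env * Math.cos(2 * Math.PI * freq * t);
        im[i] = env * Math.sin(2 * Math.PI * freq * t);
      }
      return { re, im };
    }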
The signal will continue over the seam between two windows, meaning you will cut the wave in the signal "in half". Mathematically, waves are always infinite, and cutting one actually introduces overtones (higher frequencies) that model the sharp start / end of the base wave. These then result in artifacts regardless of which method is used for the transformation (Fourier or wavelet).
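You can see this numerically with a naive DFT (illustration only, O(n^2)): a sine that doesn't complete a whole number of cycles in the window smears energy across many bins:

    // Naive DFT magnitude of bin k -- O(n) per bin, illustration only.
    function dftMag(signal, k) {
      let re = 0, im = 0;
      const n = signal.length;
      for (let i = 0; i < n; i++) {
        re += signal[i] * Math.cos((2 * Math.PI * k * i) / n);
        im -= signal[i] * Math.sin((2 * Math.PI * k * i) / n);
      }
      return Math.hypot(re, im);
    }

    const n = 256;
    const cut = Array.from({ length: n }, (_, i) =>
      Math.sin((2 * Math.PI * 10.5 * i) / n)); // 10.5 cycles: wave cut "in half"
    // Energy leaks far beyond bins 10/11 -- those are the overtones:
    for (let k = 8; k <= 14; k++) console.log(k, dftMag(cut, k).toFixed(1));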
> The signal will continue over the seam between two windows, meaning you will cut the wave in the signal "in half".
You can window the wavelet, then slide the finite-duration wavelet by a few samples at a time, even if the wavelet is hundreds to thousands of samples long. This is possible in STFT as well (each part of the original signal shows up in many separate FFTs).
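Roughly what I have in mind (untested sketch, using a finite complex wavelet like the truncated one above): correlate it with the signal every `hop` samples, exactly like the hop size of an STFT:

    // Slide a finite complex wavelet over `signal` in steps of `hop`
    // samples; each step yields one magnitude for that time position.
    function slideWavelet(signal, wavelet, hop) {
      const out = [];
      for (let pos = 0; pos + wavelet.re.length <= signal.length; pos += hop) {
        let re = 0, im = 0;
        for (let i = 0; i < wavelet.re.length; i++) {
          re += signal[pos + i] * wavelet.re[i];
          im += signal[pos + i] * wavelet.im[i];
        }
        out.push(Math.hypot(re, im));
      }
      return out;
    }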
Again, I don't know the implementation details of wavelet transforms. Maybe I'll look into your repo when I have time. What's your asymptotic and practical runtime?
Very strange! Did you get the permission prompt from Firefox after it started spinning? If not, you might have denied microphone access for all sites, which is why the prompt wouldn't come up.
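One quick way to check is from the devtools console on the demo page -- a blocked microphone rejects immediately with "NotAllowedError" instead of showing the prompt:

    navigator.mediaDevices.getUserMedia({ audio: true })
      .then((stream) => {
        console.log('microphone OK');
        stream.getTracks().forEach((t) => t.stop()); // release it again
      })
      .catch((err) => console.log(err.name)); // "NotAllowedError" = blocked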
Here is another spectrogram visualizer, but with a twist: the frequency bins are the notes of a piano, so you can use it to tune instruments or your voice.
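The bin placement is just the equal-temperament formula, e.g. for key n of an 88-key piano (A4 = key 49 = 440 Hz):

    // Equal temperament: frequency of piano key n (1..88), A4 = key 49.
    const keyFreq = (n) => 440 * Math.pow(2, (n - 49) / 12);

    keyFreq(49); // 440     (A4)
    keyFreq(40); // ~261.63 (C4, middle C)
    keyFreq(1);  // 27.5    (A0)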
Little plug for something similar I developed a year ago:
Wisteria: https://gistnoesis.github.io/
It does the real-time spectrogram using tensorflow.js on the GPU, and it also runs some transformer neural networks in real time to transcribe the notes into a piano roll.
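For anyone curious, the core of a GPU spectrogram in tensorflow.js fits in a few lines (a minimal sketch, not Wisteria's actual code; `samples` is assumed to be a Float32Array of audio):

    // Minimal tf.js pipeline: 1024-sample frames, hop of 256 samples.
    const signal = tf.tensor1d(samples);            // `samples`: Float32Array
    const stft = tf.signal.stft(signal, 1024, 256); // complex, [frames, 513]
    const mags = tf.abs(stft);                      // magnitude spectrogram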
Looks really cool. Sounds like a similar approach could be used to render audio waveforms. I wonder why a project like peaks.js decided to use server-side waveform generation instead.
I had a look into peaks.js - it looks like it supports both server and client-side waveform generation these days. Server-side generation still makes sense in some cases imo - like if you have a very long audio file such as a podcast that you don't want users to download in full just to display a waveform.
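Client-side generation is basically decode-then-reduce, which is also why it needs the whole file downloaded first. A sketch, assuming you've already fetched the audio into an ArrayBuffer:

    // Decode an audio file and reduce it to min/max pairs, one per pixel.
    async function waveformPeaks(arrayBuffer, width) {
      const audioCtx = new AudioContext();
      const buf = await audioCtx.decodeAudioData(arrayBuffer);
      const data = buf.getChannelData(0);
      const perPixel = Math.floor(data.length / width);
      const peaks = [];
      for (let x = 0; x < width; x++) {
        let min = 1, max = -1;
        for (let i = x * perPixel; i < (x + 1) * perPixel; i++) {
          if (data[i] < min) min = data[i];
          if (data[i] > max) max = data[i];
        }
        peaks.push([min, max]);
      }
      return peaks;
    }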
Hey, cool demo and article! You clearly have more experience with DSP than me haha
I was considering using an AnalyserNode since it's implemented natively by the browser and is therefore a lot faster than an FFT implementation in JavaScript. My biggest issue with AnalyserNode, though, is that there's no way to control the window function or the amount of overlap between windows. While I'm sure you could make a decent spectrogram with an AnalyserNode (as you've done!), I think implementing the FFT yourself lets you do more fine-tuning.
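For comparison, this is about all the control AnalyserNode gives you -- fftSize and smoothing. The window itself is fixed (the Web Audio spec mandates a Blackman window), and there's no overlap setting:

    // Inside an async function -- fftSize is the only real knob here;
    // the (Blackman) window and the framing are out of your hands.
    const audioCtx = new AudioContext();
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const analyser = audioCtx.createAnalyser();
    analyser.fftSize = 2048;
    analyser.smoothingTimeConstant = 0; // no averaging between frames
    audioCtx.createMediaStreamSource(stream).connect(analyser);

    const column = new Float32Array(analyser.frequencyBinCount); // 1024 bins
    analyser.getFloatFrequencyData(column); // magnitudes in dB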
When I get some time I might make Spectro use a Wasm FFT implementation like PulseFFT (https://github.com/AWSM-WASM/PulseFFT) for better performance. At the moment I'm using jsfft (https://github.com/dntj/jsfft) inside a web worker, which definitely isn't as efficient as a native implementation.
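For reference, the worker round-trip looks roughly like this (a sketch: `computeMagnitudes`, `drawColumn`, and `window0` are placeholders standing in for the jsfft call and the drawing code). Transferring the buffers avoids copying each window twice:

    // fft-worker.js -- sketch. `computeMagnitudes` stands in for whatever
    // FFT you call (jsfft today, maybe PulseFFT later).
    self.onmessage = (e) => {
      const samples = e.data; // Float32Array, one window of audio
      const mags = computeMagnitudes(samples);
      self.postMessage(mags, [mags.buffer]); // transfer back, no copy
    };

    // main.js
    const worker = new Worker('fft-worker.js');
    worker.onmessage = (e) => drawColumn(e.data); // e.data: Float32Array
    worker.postMessage(window0, [window0.buffer]); // transfer, zero-copy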
It's not my code btw, I only found the post on the internet. Your points about the restrictions of AnalyserNode make sense. A Wasm solution is indeed the ideal way to solve it if you want full flexibility.