Building Spectro: a Real-Time WebGL audio spectrogram visualizer (github.com/calebj0seph)
123 points by shakes 7 months ago | 28 comments



Also try using the wavelet transform instead of a short-time FFT (with overlapping windows).

It is easier to configure (fewer parameters, and no need for a window function), offers more flexibility (exponential frequency bands, e.g. for musical scales) and can reach the Gabor-Heisenberg uncertainty limit without artifacts.

The only downside is that you need to know the entire signal in advance, so it can only be used for recordings.

Shameless self-promo of my implementation: https://github.com/Lichtso/CCWT


Shameless cross-promo; ported your implementation to JS https://github.com/grz0zrg/ccwt.js


Your repo and tutorial are really cool! But do you have any sort of interactive "out-of-the-box" demo that doesn't require me to write code to call the library (online or downloadable)?


I made a WASM + WebGL port some time ago, similar to this one here. I can polish it and upload it to GitHub in the next few days.


You don't need to know the entire signal in advance - you can just window the wavelets at some reasonable size.


Yes, but then you are back at time-windows, a window-function, overlap and artifacts, which is defeating the purpose.


I'm more familiar with Fourier transforms and have limited experience with wavelets. But if each wavelet intrinsically falls off along a Gaussian curve, cutting it off (possibly with a window) at 3-4 sigmas won't change the wavelet substantially. Maybe for some use cases, the wavelet will be narrower at high frequencies (short delay), and wider at low frequencies (high delay). I don't know how you'd perform incremental updates of a plot drawn with non-uniform delays though...
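
To put a rough number on the "3-4 sigmas" point, here is a minimal sketch that computes how much of a Gaussian envelope's area survives truncation. The function names and the 100-sample sigma are hypothetical, and this measures the area of the envelope only, not the effect on any particular wavelet implementation.

```python
import math


def gaussian_mass_within(k_sigma: float) -> float:
    """Fraction of a Gaussian's area lying within +/- k_sigma standard deviations."""
    return math.erf(k_sigma / math.sqrt(2.0))


def truncated_support(sigma_samples: float, k_sigma: float) -> int:
    """Samples kept when the envelope is cut at +/- k_sigma (odd length, centered)."""
    return 2 * math.ceil(k_sigma * sigma_samples) + 1


if __name__ == "__main__":
    # sigma_samples=100.0 is an arbitrary illustrative width, not taken from the thread.
    for k in (3.0, 4.0):
        kept = gaussian_mass_within(k)
        length = truncated_support(100.0, k)
        print(f"cut at {k} sigma: keeps {kept:.6f} of the area, {length} samples")
```

Under these assumptions, a cut at 3 sigma already keeps more than 99.7% of the envelope's area, which is the intuition behind the comment above.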


The signal will continue over the seam between two windows, meaning you will cut the wave in the signal "in half". Mathematically, waves are always infinite and to cut them you would actually introduce overtones (higher frequencies) to model the sharp end / start of the base wave. These then result in artifacts regardless of what method is used for the transformation (Fourier or Wavelet).
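
The effect described above (often called spectral leakage) can be sketched with a naive DFT. This is only an illustration under simple assumptions: a pure sine, a rectangular cut, and a 64-sample frame. The helper names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(x):
    """Naive O(n^2) DFT; returns the magnitudes of bins 0..n/2."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def leakage_fraction(cycles, n=64):
    """Share of spectral energy outside the strongest bin.

    `cycles` is how many periods of the sine fit into the n-sample frame.
    A non-integer value means the wave is cut mid-period at the frame edge.
    """
    x = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
    energy = [m * m for m in dft_magnitudes(x)]
    return 1.0 - max(energy) / sum(energy)


if __name__ == "__main__":
    print("whole number of cycles:", leakage_fraction(8))
    print("cut mid-period:        ", leakage_fraction(8.5))
```

With a whole number of cycles the energy sits in a single bin; cutting the wave mid-period spreads a large share of it across neighbouring bins.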


> The signal will continue over the seam between two windows, meaning you will cut the wave in the signal "in half".

You can window the wavelet, then slide the finite-duration wavelet by a few samples at a time, even if the wavelet is hundreds to thousands of samples long. This is possible in STFT as well (each part of the original signal shows up in many separate FFTs).
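
A small sketch of the sliding idea, assuming a fixed frame length and hop (both hypothetical numbers here): it lists the frame start positions and counts how many frames a given sample ends up in.

```python
def frame_starts(num_samples, frame_len, hop):
    """Start indices of every full frame of `frame_len` samples, advancing by `hop`."""
    return list(range(0, num_samples - frame_len + 1, hop))


def frames_covering(sample_index, num_samples, frame_len, hop):
    """Number of frames whose span [start, start + frame_len) contains the sample."""
    return sum(
        1
        for start in frame_starts(num_samples, frame_len, hop)
        if start <= sample_index < start + frame_len
    )


if __name__ == "__main__":
    # Illustrative values only: a 1024-sample kernel slid 4 samples at a time.
    n, frame_len, hop = 10_000, 1024, 4
    print(len(frame_starts(n, frame_len, hop)), "frames")
    print(frames_covering(5000, n, frame_len, hop), "frames contain sample 5000")
```

Away from the edges, each sample appears in `frame_len / hop` frames, so a small hop means heavy redundancy but no hard seam between adjacent outputs.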

Again, I don't know the implementation details of wavelet transforms. Maybe I'll look into your repo when I have time. What's your asymptotic and practical runtime?


O(n log n) for the x axis (time samples) and O(n) for the y axis (frequencies).

But you can downsample the signal in frequency domain, meaning you will pay mostly for the output resolution.


You often want a wavelet with compact support anyway so you don't even need to choose the window size: it's built-in.


the demo is fun to try: https://calebj0seph.github.io/spectro/


The "Record from microphone" option doesn't seem to work in Firefox for some reason; it just spins a loading icon endlessly.


Very strange! Did you get the permission prompt from Firefox after it started spinning? If not, you might have denied microphone access on all sites, which is why the prompt wouldn't come up.


Works for me on mobile and desktop FF


Record from mic works for me. Listening to https://www.youtube.com/watch?v=FATTzbm78cc in one window with mic recording does the expected at the end of the song — https://www.magneticmag.com/2012/08/the-aphex-face-visualizi...


Cyberdemon from the DOOM soundtrack is another fun track to put through a spectrogram.


Here is another spectrogram visualizer but with a twist, the frequency bins are the notes of a piano and hence you can use it to tune instruments or your voice.

The project: https://github.com/aguaviva/GuitarTuner

Online demo: https://aguaviva.github.io/GuitarTuner/GuitarTuner.html
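
For readers wondering what "frequency bins are the notes of a piano" could look like, here is a minimal sketch. It assumes 12-tone equal temperament with A4 = 440 Hz and the usual 88-key numbering (key 49 = A4); these are assumptions for illustration and not necessarily what GuitarTuner does.

```python
import math

A4_KEY = 49        # assumed: standard 88-key numbering, key 49 is A4
A4_FREQ = 440.0    # assumed: concert pitch in Hz


def key_frequency(key: int) -> float:
    """Frequency in Hz of a piano key under equal temperament."""
    return A4_FREQ * 2.0 ** ((key - A4_KEY) / 12.0)


def nearest_key(freq_hz: float) -> int:
    """Piano key whose pitch is closest (on a log scale) to freq_hz."""
    return round(A4_KEY + 12.0 * math.log2(freq_hz / A4_FREQ))


def cents_off(freq_hz: float) -> float:
    """Deviation from the nearest key in cents (100 cents = one semitone)."""
    target = key_frequency(nearest_key(freq_hz))
    return 1200.0 * math.log2(freq_hz / target)


if __name__ == "__main__":
    for f in (27.5, 261.63, 445.0):
        print(f, "Hz -> key", nearest_key(f), f"({cents_off(f):+.1f} cents)")
```

The `cents_off` value is what a tuner would typically display: positive means sharp, negative means flat.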


Just FYI, that's not a spectrogram, it's a frequency histogram. :-)


Little plug for something similar I developed a year ago, Wisteria: https://gistnoesis.github.io/ It does the real-time spectrogram using tensorflow.js on the GPU. It also runs some transformer neural networks in real time to transcribe the notes into a piano roll.


Looks really cool. Sounds like a similar approach could be used to render audio waveforms. I wonder why a project like this one [1] decided to use server side waveform generation instead.

1: https://waveform.prototyping.bbc.co.uk/


I had a look into peaks.js - it looks like it supports both server and client-side waveform generation these days. Server-side generation still makes sense in some cases imo - like if you have a very long audio file such as a podcast that you don't want users to download the entirety of just to display a waveform.
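
To show why precomputed waveform data is so much smaller than the audio itself, here is a rough sketch of one common reduction: a (min, max) pair per display bucket. This is a generic illustration, not a description of how peaks.js or the BBC tooling does it; the function name is hypothetical.

```python
def waveform_peaks(samples, num_buckets):
    """Reduce a sample list to one (min, max) pair per bucket.

    Assumes `samples` is non-empty and num_buckets >= 1.
    """
    n = len(samples)
    peaks = []
    for b in range(num_buckets):
        start = b * n // num_buckets
        # Guarantee at least one sample per bucket, even if num_buckets > n.
        end = max(start + 1, (b + 1) * n // num_buckets)
        chunk = samples[start:end]
        peaks.append((min(chunk), max(chunk)))
    return peaks


if __name__ == "__main__":
    audio = [0, 1, -1, 2, -2, 3, -3, 4]
    print(waveform_peaks(audio, 2))
```

An hour of audio drawn at a few thousand pixels wide needs only a few thousand such pairs, which is the case for generating them once on the server.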


Awesome, I love spectrograms!

Why did you implement your own FFT instead of using WebAudio?

http://arc.id.au/Spectrogram.html


Hey, cool demo and article! You clearly have more experience with DSP than me haha

I was considering using an AnalyserNode since it's implemented natively by the browser and therefore a lot faster than using an FFT implementation in JavaScript. My biggest issue with AnalyserNode though is that there's no way to control the window function or overlap amount between windows. While I'm sure you could make a decent spectrogram with an AnalyserNode (as you've done!), I think implementing the FFT yourself lets you do more fine-tuning.
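
As a sketch of why the window function and the overlap amount interact, the snippet below sums shifted copies of a periodic Hann window and reports the combined gain. The Hann window and the specific sizes are assumptions chosen for illustration; the comment above does not say which window Spectro uses.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def overlap_add_gain(window, hop, num_frames=8):
    """Sum `num_frames` copies of `window`, each shifted by `hop` samples.

    Returns the summed gain with the first and last window length trimmed,
    so that only the fully overlapped region remains.
    """
    n = len(window)
    total = [0.0] * (hop * (num_frames - 1) + n)
    for f in range(num_frames):
        for i, w in enumerate(window):
            total[f * hop + i] += w
    return total[n:len(total) - n]


if __name__ == "__main__":
    w = hann(16)
    print("50% overlap:", overlap_add_gain(w, hop=8)[:4])
    print("75% overlap:", overlap_add_gain(w, hop=4)[:4])
```

With this window, 50% overlap gives a flat combined gain of 1 and 75% overlap gives a flat gain of 2, so every sample is weighted equally across frames. Being able to pick both knobs is the kind of fine-tuning a fixed native analyser does not expose.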

When I get some time I might make Spectro use a Wasm FFT implementation like PulseFFT (https://github.com/AWSM-WASM/PulseFFT) for better performance. At the moment I'm using jsfft (https://github.com/dntj/jsfft) inside a web worker, which definitely isn't as efficient as a native implementation.


It's not my code btw, I only found the post on the internet. Your points about the restrictions of AnalyserNode make sense. A Wasm solution is indeed the ideal way to solve it if you want full flexibility.


Very cool! Are you the author?


I am not. I just found it and thought it was super interesting.

Caleb Joseph is the original author: https://github.com/calebj0seph


Thanks for posting to HN, blown away by how much interest there's been!



