History of Sound Analysis & The Musician’s Workbench

21 March 1768 – 16 May 1830
Jean Baptiste Joseph Fourier shows that it is mathematically possible to transform a time-domain series into a frequency-domain series. He also deduces the greenhouse effect mathematically, way ahead of his time.

1965
J. W. Cooley and J. W. Tukey publish their paper on the FFT, which shows how a Fourier transform can be efficiently executed on a computing device.

November 1983
At a party in fashionable Ealing (a west London suburb) an anonymous reveler asks if it would be possible to print out a music score as an instrument was played. At the time nobody had heard of the internet, PCs were in their infancy, and the idea was duly ridiculed.

However, some technical advances had recently become available. AMD had introduced the Am29500 family of bit-slice devices, aimed specifically at performing FFTs. National Semiconductor had introduced switched-capacitor filters, which would allow an anti-aliasing filter to be implemented with a dynamically variable cutoff frequency. Motorola had introduced the MC68010, a second-generation 16/32-bit microprocessor with a much improved multiply execution time.

1984
The SA-10 hardware is designed.

1985
The SA-10 is built, and the assembly code and microcode are written.

December 1985
The SA-10 prototype is fully operational.

1986
Sound Analysis, the company, is conceived. The SA-10 road show begins its tour of venture capitalists. The dot-com boom was more than 10 years away, and VCs were not jumping at anything technical. The idea was killed by lack of funding.

2013
Thirty years on, there is still no product on the market that can perform polyphonic transcription of any live instrument over 3 octaves in real time. Requests for such a thing appear on newsgroups, and are usually met with the response that it is not possible.

Some technical advances indicated that a pure software solution might be possible. Powerful multi-core PCs were inexpensive, and Microsoft had released the Task Parallel Library as part of the .NET Framework to take advantage of multiple cores. The Microsoft Media Foundation library would allow the kind of low-level audio access required.

November 2013
The Musician’s Workbench is prototyped and easily reproduces the original SA-10 functionality on a 4-core desktop PC.

The SA-10

The SA-10 was a device that could perform polyphonic transcription of any live instrument over 3 octaves in real time. It was built around the Am29500 bit-slice devices, which performed the FFT, and an MC68010, which performed the control and frequency interpretation functions. The device also featured some bespoke analog hardware for audio acquisition. Printout was performed by an Epson dot matrix printer, which allowed direct control of the individual strike pins.

This was doing it the hard way: the FFT was written in microcode (yes – zeros and ones), other code was written in 68000 assembler, the graphics were written one pixel at a time, and the UI design involved sheet metal work.

The SA-10 did establish the feasibility of such an approach, and established the principles that inspired much of the design of the Musician’s Workbench.

The Musician’s Workbench

The Musician’s Workbench is a desktop application that provides three resizable panes into which various applications can be dragged. Applications communicate with each other. In the background, the signal processing pipeline executes requests and signals results. A fourth pane, on the left-hand side, contains the application store, the metronome and an options panel. On closing the workbench, the application and options configurations are saved, to be reinstated when the workbench is reopened. The applications available will increase over time as more are developed.

The basic workbench setup has the ‘Audio Input’ and ‘Score Viewer’ applications open. The user selects an audio input, selects an octave range, and fine-tunes for accuracy (the tuning is saved, so this only happens once). Then, using the metronome, the user selects a time signature and alters the metronome speed to taste. Then just play: the played score will appear in the ‘Score Viewer’.

There are many score editors on the market; similarly, there are good audio analysis tools. The Musician’s Workbench is not intended to reproduce the functionality of either of these types of product, but rather to bridge the gap between them. So a highly functional score editor application capable of producing publishable scores is not on the horizon. There will be applications that allow the import and export of commonly used formats such as MIDI and MusicXML, so that the Musician’s Workbench is compatible with industry standards. A musician will be able to rapidly produce a score on the Musician’s Workbench, then export this score to any of a number of professional score editors for finalizing.

Applications will be developed to fill gaps in the market. For example, there will be a diagnostic tool that allows deep analysis of a played score. The user will be able to select a bar and a beat, then view the sampled audio, the frequency spectrum and the note mapping implemented for that beat.

Some applications will be aimed at enhancing the quality of the results, such as an application that allows the user to teach the Musician’s Workbench the acoustic fingerprint of their instrument. This will greatly reduce the incidence of spurious harmonics during polyphonic transcription.

The Musician’s Workbench improves on the SA-10 in several ways. It is now possible to continuously add new features and applications (upgrades to the SA-10 would have involved physically changing firmware), and there have also been some algorithm improvements. The instrument fingerprinting was mentioned in the previous paragraph. The frequency-to-note mapping is now dynamically adaptive, reducing the incidence of spurious overtones and also making the Musician’s Workbench more tolerant of poor tuning.
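To make the idea of frequency-to-note mapping concrete, here is a minimal sketch in Python. It is not the Musician’s Workbench algorithm, which is adaptive and is not described here. It is only the standard fixed mapping for twelve-tone equal temperament, assuming a reference pitch of A4 = 440 Hz. The names frequency_to_note, reference_hz and NOTE_NAMES are hypothetical and chosen for illustration.

```python
import math

# Assumption: twelve-tone equal temperament with A4 = 440 Hz (MIDI note 69).
A4_HZ = 440.0
A4_MIDI = 69
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz, reference_hz=A4_HZ):
    """Map a frequency to the nearest note.

    Returns (name, cents), where name is e.g. "A4" and cents is how far
    the frequency lies from that note (negative means flat).
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Fractional MIDI note number: 12 semitones per doubling of frequency.
    exact = A4_MIDI + 12 * math.log2(freq_hz / reference_hz)
    midi = round(exact)
    cents = 100 * (exact - midi)
    # MIDI note 60 is C4, so the octave number is midi // 12 - 1.
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents


if __name__ == "__main__":
    # An instrument tuned about 20 cents flat still maps to A4 ...
    print(frequency_to_note(435.0))
    # ... and moving the reference to match removes the deviation entirely.
    print(frequency_to_note(435.0, reference_hz=435.0))
```

In this simplified picture, tolerance of poor tuning amounts to shifting reference_hz towards the pitch the instrument is actually playing at, so that slightly flat or sharp notes sit near the centre of their note band rather than close to a boundary.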

Any instrument can be used, including voice. The Musician’s Workbench will be merciless in transcribing the notes actually sung. It will not transcribe the notes you thought you sang. Sound Analysis is not responsible for bruised egos.