A new control will allow the user to add random noise to the video.

The slider control will vary the amount of noise added, from none at all to noise only.

This control will allow the user to evaluate various noise reduction techniques.

The temporal filter will do a good job at high filter values, but movement blurring gets worse at those same levels. This technique is only applicable to video, not to single images.
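The post doesn't show the temporal filter's internals; one common form (not necessarily the Imaging Whiteboard's) is a running average across frames, sketched here in Python:

```python
def temporal_filter(prev_avg, frame, strength):
    """Blend the incoming frame into a running average.

    strength in [0, 1): higher values smooth more noise but
    also blur motion more -- the trade-off described above.
    """
    return [strength * p + (1.0 - strength) * f
            for p, f in zip(prev_avg, frame)]

# A constant signal with alternating +/-5 noise settles toward
# the true value of 100 as frames accumulate.
frames = [[100.0 + (5.0 if i % 2 else -5.0)] for i in range(50)]
avg = frames[0]
for frame in frames[1:]:
    avg = temporal_filter(avg, frame, 0.9)
```

The filter needs at least two frames, which is why it cannot apply to a single still image.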

The most obvious spatial technique would be a low-pass convolution. This will result in some loss of detail, but is applicable to both video and still images.
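A minimal sketch of such a low-pass convolution (a 3×3 mean filter, the simplest case) in Python:

```python
def box_blur(image):
    """3x3 mean filter: the simplest low-pass convolution.

    Averaging neighbours suppresses pixel-level noise at the
    cost of detail, and works on a single still image.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += image[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out

# A single noisy spike in a flat field is strongly attenuated.
img = [[10.0] * 5 for _ in range(5)]
img[2][2] = 110.0
smoothed = box_blur(img)
```

The spike of 110 is pulled down to roughly 21 while flat regions are untouched, which is exactly the detail-versus-noise trade described above.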

A combination of morphological operations (such as an opening: an erosion followed by a dilation) will do a better job of removing noise, but will result in some image blocking.
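The post doesn't name the exact operations used; a typical noise-removing pair is an opening (erosion, then dilation), sketched here in Python with 3×3 min/max filters:

```python
def morph(image, op):
    """Apply a 3x3 min (erosion) or max (dilation) filter."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [image[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = op(vals)
    return out

# Opening (erode, then dilate) deletes isolated bright noise
# pixels outright, but squares fine structure off to the shape
# of the 3x3 window -- the "blocking" mentioned above.
img = [[10.0] * 5 for _ in range(5)]
img[2][2] = 110.0                      # one speck of noise
opened = morph(morph(img, min), max)
```

Unlike the mean filter, the speck is removed completely rather than smeared into its neighbours.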

The choice of noise reduction technique will depend on the application, the amount of noise anticipated, and personal taste. The Imaging Whiteboard will now allow the user to experiment with various techniques and combinations, varying the parameters and noise level to find the preferred approach.

The Mandelbrot generator has been added to the Imaging Whiteboard. The whole point of the Mandelbrot set is infinite complexity from a simple algorithm. The plot points are generated with only five lines of code:

Complex z = new Complex(0.0, 0.0);
// Map pixel (w, h) to a point c on the complex plane, applying
// the pan (offset) and zoom slider values.
Complex c = new Complex(((double)w - MandelbrotSize / 2 + HOffsetSliderValue) / (ZoomSliderValue * 64),
                        ((double)h - MandelbrotSize / 2 + VOffsetSliderValue) / (ZoomSliderValue * 64));
// Iterate z = z*z + c until z escapes (magnitude >= 2) or the
// iteration limit is reached; the count determines the pixel colour.
uint iteration = 0;
for (; iteration < maxIterations && z.Magnitude < 2; iteration++)
    z = (z * z) + c;

The control allows the user to zoom into the Mandelbrot set and explore its details.

The detail shown above is from the very left tip of the full scan. This detail appears to repeat the full scan, but not exactly. At first glance the Mandelbrot seems to contain much repetition; unlike strictly self-similar fractals, these repetitions are never exact.

The full un-zoomed plot looks like:

If you take a 3-dimensional object and rotate it 180 degrees in each of 3 dimensions in turn, the object will return to its original position.

I don’t know of an elegant mathematical proof of this; it would require the use of 3D complex numbers, which do not exist (see the previous blog post ‘Using complex arithmetic to perform combination warps’). There is an abundance of empirical evidence though.

Using the warp control of the Imaging Whiteboard we can perform this 3D transformation.

Flip Vertical, Flip Horizontal, and rotate 180 degrees. The image will return to its original, unwarped position.
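This is easy to verify on a pixel grid. A quick Python check (hypothetical helper names, not the Imaging Whiteboard's code) that the three operations compose back to the identity:

```python
def flip_vertical(img):
    # Reverse the order of the rows.
    return img[::-1]

def flip_horizontal(img):
    # Reverse each row.
    return [row[::-1] for row in img]

def rotate_180(img):
    # Equivalent to flipping both axes at once.
    return [row[::-1] for row in img[::-1]]

img = [[1, 2],
       [3, 4]]
result = rotate_180(flip_horizontal(flip_vertical(img)))
```

Each pixel ends up exactly where it started, matching the empirical evidence described above.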

So if I am using regular 2D complex numbers to perform the warp, how do I perform 3D warping?

To better explain this I have created a proof of concept that implements full horizontal and vertical rotation. This is not published (I’m not sure that it is useful).

After applying the warp factor:

sourceAddress = targetAddress * WarpFactor;

The result is modified:

sourceAddress.Real = sourceAddress.Real / Math.Cos(fhRadians);  // Horizontal rotation
sourceAddress.Imag = sourceAddress.Imag / Math.Cos(fvRadians);  // Vertical rotation

Manipulating the real and imaginary components directly is not really a correct use of complex numbers; it treats them as a simple pair of coordinates.

Warps are generally backward mapped; that is, for each pixel in the target image, the address of the required pixel in the source image is calculated. Usually co-ordinate geometry is used to calculate the source address.

In the Imaging Whiteboard complex number arithmetic is used. The code to calculate the source pixel becomes:

sourceAddress = targetAddress * WarpFactor;

Yup that is the code!

The trick is to calculate the warp factor.

No warp is (1, 0); here the source address is equal to the target address.

Zoom becomes (zoom, 0).

Rotation becomes (Cos(radians), Sin(radians)).

So zoom and rotate combined becomes (zoom, 0) * (Cos(radians), Sin(radians)).

Once the warp factor is calculated the warp is almost trivial!
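The same recipe can be sketched with Python's built-in complex type (illustrative names, not the Imaging Whiteboard's code):

```python
import math

def warp_factor(zoom, radians):
    # Compose zoom and rotation into one complex factor:
    # (zoom, 0) * (cos(radians), sin(radians)).
    return complex(zoom, 0) * complex(math.cos(radians), math.sin(radians))

def source_address(target_address, factor):
    # Backward mapping: one complex multiply per target pixel.
    return target_address * factor

# No warp: the factor (1, 0) leaves every address unchanged.
identity = warp_factor(1.0, 0.0)
# A 90-degree rotation maps the address (1, 0) to (0, 1).
rotated = source_address(complex(1, 0), warp_factor(1.0, math.pi / 2))
```

All of the geometry is folded into the factor once, so the per-pixel work really is a single multiplication.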


If it’s that easy, why not use a three dimensional complex number to perform a time warp on a captured sequence? That could be fun.

Not so fast: according to the laws of mathematics there is no such thing as a 3-dimensional complex number. 2 dimensions, sure; 4, OK; even 8; but never 3.

So, use 4-dimensional complex numbers (quaternions) and ignore one dimension? It turns out that will not work either; mathematics is not so easily fooled.

So I could use co-ordinate geometry, which would be horribly slow and complicated. Or just introduce a DVR control, which is what anybody else would do. The DVR control solution would be limited in its functionality, i.e. the same time offset would have to be applied to all pixels in a frame before the image warp.

None of this is very practical as the memory requirements would be prohibitive.

If you really want to know how to do a time warp you should watch this video.  TimeWarp

The Imaging Whiteboard Users Guide has a screen shot that shows how the threshold control can be used to separate the color components of an image.

Of course once separated the color components can be processed.

This screen shot shows how the FFT filter can be applied to the green component of the checkerboard test pattern.

Note that the FFT result is also shown as green. (Click image for a better view)



A particularly ugly sound from the synthesizer was used to test profiling.

The following profile was produced. It shows a lot of overtones: one louder than the fundamental, others almost as loud.

The notes played were CDEF.

In polyphonic mode with no profiling, the transcribed score shows that most of the overtones are interpreted as played notes.


With no profiling but in monophonic mode, the overtones are loud enough that the wrong notes are picked.

Using the correlation profile method the result is much better, but one overtone is still shown.

Using the rolling adjustment profile method we get an accurate representation of the notes played.

Note that all of these bars were produced with the same samples reprocessed with different options.

Drilling down to take a closer look at the second note, D:

The captured samples:



The FFT result shows an abundance of overtones. This is consistent with the profile.



The note mapping will interpret these frequencies as notes. We can see that 2 of the overtones have higher scores than the played note (D = 17).
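The note numbers shown (D = 17) are internal to Sound Analysis, but the underlying mapping from a detected frequency to the nearest equal-tempered note is standard; a Python sketch:

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq, a4=440.0):
    """Map a frequency to the nearest equal-tempered note name.

    Uses the standard MIDI convention: A4 = 440 Hz = note 69,
    12 notes per octave.
    """
    midi = round(69 + 12 * math.log2(freq / a4))
    return NOTE_NAMES[midi % 12]

# D4 is ~293.66 Hz; its 3x overtone lands on an A, which is why
# a harmonically rich voice scores several notes at once.
```

Overtones at integer multiples of the fundamental fall on (or near) other note slots, so without profiling they are indistinguishable from played notes.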


The profiler output:

Only after this step does the played note emerge.


It is unlikely that any acoustic instrument will be nearly this difficult. Only by stressing the algorithms during testing can we be confident that we can provide a high degree of accuracy in realistic situations.

A voice with a long sustain was used to test the sustain compensation feature.

Notes played were ECDE.

Without sustain compensation we can see that each note carries over into the next.

With sustain compensation enabled the notes played were accurately scored.

Drilling down to take a closer look at the second note C:

The FFT results show the first peak on C, but also a residual peak on E.

Looking at the note values without sustain compensation enabled, we can see that both C and E score highly enough to be transcribed.

Reprocessing the same samples with sustain compensation enabled we can see that the value given to the residual E (E = 19) is reduced.
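The actual sustain compensation algorithm isn't published; one generic approach is to subtract a decayed fraction of the previous frame's spectrum before scoring, as this Python sketch (assumed decay factor, toy two-bin spectra) illustrates:

```python
def compensate(current, previous, decay=0.6):
    """Subtract the expected decayed remnant of the previous
    frame's spectrum, so a long sustain is not re-scored.

    decay is an assumed per-frame decay factor of the voice,
    not a value taken from Sound Analysis.
    """
    return [max(0.0, c - decay * p) for c, p in zip(current, previous)]

# The previous note E still rings in bin 1 while C sounds in bin 0.
prev_frame = [0.0, 8.0]     # E was loud in the last frame
curr_frame = [10.0, 5.0]    # C is played; a residual E remains
scores = compensate(curr_frame, prev_frame)
```

The newly played C keeps its full score while the residual E is reduced below the transcription threshold, matching the behaviour described above.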

The beta sign-up page on the web site is up. The first of the beta testers are signing up.

Time to think about Beta 2. I’ll continue to stress the algorithms – accuracy is everything. And of course respond to feedback from the Beta 1 testers.

In addition a new tool is under development: a score editor. This will not be a full-blown score editor; there are enough of those on the market. It will allow users to make modifications and corrections to the transcribed score.

Sound Analysis recognizes that in order to obtain the best possible accuracy, extensive algorithm testing is essential.

To this end each pipeline component was rigorously tested using voices that were selected to stress that particular component.

The synthesizer selected to generate the tests was an M-Audio Venom (http://www.m-audio.com/products/en_us/Venom.html). It was selected because it has a wide range of octaves and can produce some really extreme sounds. During the final stages of testing this choice really paid off.

After each algorithm adjustment, regression testing was performed on a more reasonable set of voices. This was to ensure that we were not tuning just for extreme situations, but for all possibilities.

Two specific examples of stress testing included here are for sustain compensation and profiling. We could have cherry-picked some easy targets, but these were a couple of the most extreme.

There is only so much that can be done in the lab. The next stage will be to beta test with as many different instruments as possible. Further tuning will occur as we receive feedback from the beta testers.