
Project 2: ColorSynth

I began Project 2 wanting to explore the connection between visuals and sound, specifically in terms of themes and colors. My first concept was to use an API to extract keywords from an image and play a corresponding audio file that would be altered based on the specifics of the image. With my coding knowledge and experience this proved extremely difficult, so I took a path that was more in scope for me. The project I ended up with is the ColorSynth.

The inspiration for the ColorSynth came from Sean Scully’s “Landline,” which essentially takes images, boils them down to fewer than 20 pixels tall and one pixel wide, and paints the resulting color bands with acrylic on aluminum. I took this simple idea of boiling a picture (whether static or moving) down to a few stripes and playing it. There are many directions this concept could have gone, and this is one of them.

In this iteration of the ColorSynth, there are three modes: Swatch (or Color Picker), Stripes, and Camera. The simplest, Swatch, allows you to select a color. The synthesizer then mixes between three sources: red, green, and blue. Each of the sounds associated with these colors is meant to “feel” similar to that color. There is also a delay effect unit that can be manually controlled in Swatch mode.

When switched to Stripes mode, the camera image appears on the screen, reduced to the stripes described above. At the chosen speed, the synth scrolls through each individual stripe, with some slide affecting the amplitude of each color and the effect section. If “Force Manual” is on, the effects unit ignores incoming information and behaves just as it does in Swatch mode. Finally, there is Camera mode, which is similar to Stripes except that we now see the entire camera image, and the synth information scrolls horizontally and vertically based on the speed. If there is too much gain coming from the synth, the output will clip and be lowered. If it is lowered too much, reset the gain with the button. You can also manually change the camera dimensions.
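The stripe idea described above can be sketched in a few lines of Python. This is only an illustration, not the Max patch itself: the function names `frame_to_stripes` and `color_to_gains` are hypothetical, the frame is assumed to be a plain list of rows of (r, g, b) pixels, and mapping each color channel directly to a gain between 0 and 1 is an assumption about how the three sources might be mixed.

```python
def frame_to_stripes(frame, n_stripes):
    """Average a frame into n_stripes horizontal color bands.

    frame is a list of rows; each row is a non-empty list of
    (r, g, b) tuples with values in 0-255.
    """
    height = len(frame)
    if not 0 < n_stripes <= height:
        raise ValueError("n_stripes must be between 1 and the frame height")
    stripes = []
    for s in range(n_stripes):
        # Integer bounds so that every row lands in exactly one stripe.
        top = s * height // n_stripes
        bottom = (s + 1) * height // n_stripes
        pixels = [px for row in frame[top:bottom] for px in row]
        stripes.append(
            tuple(sum(px[c] for px in pixels) / len(pixels) for c in range(3))
        )
    return stripes


def color_to_gains(color):
    """Map an (r, g, b) color in 0-255 to three gains in 0.0-1.0 (assumed mapping)."""
    return tuple(channel / 255 for channel in color)


# A tiny 4x2 test frame: top half red, bottom half blue.
red, blue = (255, 0, 0), (0, 0, 255)
frame = [[red, red], [red, red], [blue, blue], [blue, blue]]
stripes = frame_to_stripes(frame, 2)
gains = [color_to_gains(c) for c in stripes]
```

Scrolling through `gains` one entry at a time, at whatever rate the speed control sets, would then correspond roughly to what Stripes mode does with the three sound sources.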

Project 1: Baad

The concept for this project was inspired by Rational Acoustics’ Smaart, software built to assist in normalizing loudspeaker systems. From the name, Baad, you might be able to tell how it went. I had the concept from the beginning, but the execution never got there. The main feature I was going for was a spectroscope showing the difference between the signal sent out by the program and the signal coming back in from a microphone after passing through the loudspeaker system. To get there, a few other requirements needed to be met. First, calculating the delay from the moment the audio leaves the software to the moment it returns via the sound system. Second, averaging the amplitudes of individual frequencies over some period of time, both to smooth the information and to calculate the similarity between the values, giving what is known as the “confidence.” Finally, applying octave smoothing to make the amplitudes more readable for the purpose of applying an EQ.
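The first requirement, finding the delay between the sent and returned signal, is commonly approached with cross-correlation. The sketch below is one way that could look in Python, assuming both signals are available as short lists of samples; `estimate_delay` and `max_lag` are hypothetical names, and this is not how Smaart or any of my patches actually does it.

```python
def estimate_delay(reference, measured, max_lag):
    """Return the lag (in samples) at which measured best matches reference."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate reference[n] with measured[n + lag].
        score = sum(r * m for r, m in zip(reference, measured[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# The "system" here is just a 3-sample delay with the level halved.
reference = [0.0, 1.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0]
measured = [0.0, 0.0, 0.0] + [0.5 * x for x in reference]
delay_samples = estimate_delay(reference, measured, max_lag=5)  # 3
```

Dividing `delay_samples` by the sample rate gives the delay in seconds, which is the amount the outgoing signal would need to be held back before comparing it with the microphone signal.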

Here is where the issues came in. In its simplest form, what I am looking for is the spectral analysis of the impulse response of the audio system. I began by putting the direct and received signals into a pfft~, doing a cartopol~, and subtracting the amplitudes, but this did not give me anything close to what I was looking for. After a few variations on this, what gave me what I wanted was dividing the real parts of both signals and the imaginary parts of both signals, and then putting the result into an ifft~ to get the impulse response signal. Great. I’ve got that part. Now I need to plot it. I found the vectral~ object, which is meant to plot FFTs. Great. The issue is that when I plot it, it doesn’t give me the information I want. Looking back, I don’t think I ever fully accomplished the first step: once I got here, I played around with the console’s EQ to see if my plot would react, and it did not. On the way to the vectral~ object I ended up in Jitter world for a while, seeing if I could create a matrix with 512 columns and 83 rows, then use a jit.gen object to average each individual column and plot the resulting list. The issue was getting the information into the matrix, because I found no way to write to a new column fast enough to keep up with every sample. From there I looked at nested loops in JavaScript, but my issue was not processing the lists; it was creating the lists in a form I understood how to process. I thought about the capture~ object, but saving and then reading a text file sounded like it would add a lot of latency to the process, and it sounded hard to update as rapidly as I needed.
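For reference, the textbook way to get an impulse response from a sent and a received signal is to divide their spectra bin by bin as complex numbers and inverse-transform the result. The Python sketch below shows that idea on a tiny example. It uses full complex division per bin, which is not the same operation as dividing the real parts and the imaginary parts separately as described above, so treat it as a swapped-in technique rather than a description of my patch. The names `dft`, `idft`, `impulse_response`, and the `eps` guard are all hypothetical, and the slow pure-Python transform is only meant for short lists.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (fine for short lists)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def impulse_response(sent, received, eps=1e-12):
    """Estimate the impulse response as IDFT(DFT(received) / DFT(sent))."""
    s_bins = dft(sent)
    r_bins = dft(received)
    # Complex division per bin, written as R * conj(S) / (|S|^2 + eps);
    # eps keeps nearly empty bins from blowing up.
    h_bins = [r * s.conjugate() / (abs(s) ** 2 + eps) for r, s in zip(r_bins, s_bins)]
    return [value.real for value in idft(h_bins)]


# The "system" here delays the signal by 2 samples and scales it by 0.8.
sent = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
received = [0.0, 0.0, 0.8, 0.4, 0.2, 0.0, 0.0, 0.0]
h = impulse_response(sent, received)
```

In this example `h` comes out as a single spike of roughly 0.8 at index 2, which matches the delay and gain that were applied. Taking the magnitude of the divided spectrum before the inverse transform would give the frequency response that an EQ change on the console should visibly move.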

All of the research I did online on the issue led back to MATLAB, which I have never used. It also only covered comparing two pieces of recorded audio, not a constant, ongoing sound, and that is an issue.

I did a lot of reverse engineering of a patch Jesse gave me, but it contained so much processing I did not need that, once I tried to use only the pieces I wanted, there was not enough left for it to function. That being said, this is basically where I ended up. I have attached a Google Drive .zip file containing a folder with all of my patchers for every direction I went; I just never ended up in the right place. It was incredibly frustrating because I feel like I understand the concept and what I want it to do fairly well. For most of my iterations I also know why they do not work, in the sense that it just doesn’t work that way and there must be some better way to do it. I saw a lot in Jesse’s file that I liked and that started to make sense, but I didn’t understand enough of it early enough to adapt it or make real headway. The one thing I did accomplish, which was incredibly simple, was to set up bandpass filters with different sample rates to get more FFT information in the lower frequencies and less in the higher frequencies.

Some other features I found along the way that I would like to implement if I get this working in the future (which I hope I do) would be to take the mouse data from the plot and scale it to show the frequency and amplitude at the mouse position, and to have the program recommend EQ changes for either a parametric or graphic EQ. The former would be incredibly easy to implement. The latter, however, would be a different story. It would need a lot of user input and then trial-and-error processing by the software to find which frequencies, gains, and Q values would flatten the response. I would not want it to apply any EQ itself, just provide the information for you to do it.
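The mouse readout mentioned above mostly comes down to scaling pixel coordinates into plot units. Here is a minimal Python sketch, assuming a logarithmic frequency axis from 20 Hz to 20 kHz and a level axis from -30 dB to +30 dB; those ranges, the function name `mouse_to_plot`, and its parameters are all assumptions for illustration, since the plot itself was never finished.

```python
def mouse_to_plot(x, y, width, height,
                  f_min=20.0, f_max=20000.0, db_min=-30.0, db_max=30.0):
    """Convert a mouse position in pixels to (frequency in Hz, level in dB).

    x runs left to right on a logarithmic frequency axis;
    y runs top to bottom, so y = 0 is the top of the plot (db_max).
    """
    freq = f_min * (f_max / f_min) ** (x / width)
    level = db_max - (y / height) * (db_max - db_min)
    return freq, level


top_left = mouse_to_plot(0, 0, 600, 300)          # (20.0, 30.0)
bottom_right = mouse_to_plot(600, 300, 600, 300)  # (20000.0, -30.0)
center = mouse_to_plot(300, 150, 600, 300)        # about 632 Hz at 0 dB
```

The logarithmic mapping is what puts the middle of the plot near 632 Hz instead of 10 kHz, which matches how frequency axes are usually drawn for audio.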


Assignment 4 – Babylon Colors

Drawing inspiration from the in-class examples, I created a dual-bandpass pfft~ that outputs high, mid, and low frequencies. Then, taking the amplitude of each of those outputs, I altered the amplitude of the red, green, and blue of live captured video. In the patch you can change the crossover frequencies, as I show in the video recording below. In Presentation View there are three spectroscope~ objects, and changing the crossover frequencies changes the bounds of each view. The video includes Babylon Sisters by Steely Dan, an ATSC Recording Sample, and Centipede by Knife Party.
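The band-splitting idea can be sketched outside of Max as well. The Python below sums FFT bin magnitudes into low, mid, and high bands using two crossover frequencies, then scales a pixel's color channels by those amplitudes. The function names are hypothetical, and which band drives which color (low to red, mid to green, high to blue) is an assumption made for the example, not something the patch description specifies.

```python
def band_amplitudes(magnitudes, sample_rate, low_cross, high_cross):
    """Sum bin magnitudes into (low, mid, high) using two crossover frequencies.

    magnitudes holds the bins from 0 Hz up to (but not including) sample_rate / 2.
    """
    bin_hz = (sample_rate / 2) / len(magnitudes)
    bands = [0.0, 0.0, 0.0]
    for i, mag in enumerate(magnitudes):
        freq = i * bin_hz
        index = 0 if freq < low_cross else 1 if freq < high_cross else 2
        bands[index] += mag
    return tuple(bands)


def tint_pixel(pixel, low_amp, mid_amp, high_amp):
    """Scale (r, g, b) by the band amplitudes, each clamped to 0.0-1.0.

    Assumed mapping: low -> red, mid -> green, high -> blue.
    """
    amps = (low_amp, mid_amp, high_amp)
    return tuple(
        round(channel * min(max(amp, 0.0), 1.0)) for channel, amp in zip(pixel, amps)
    )


# Four bins at 0, 1000, 2000, and 3000 Hz with crossovers at 500 Hz and 2500 Hz.
low, mid, high = band_amplitudes([1.0, 0.0, 0.5, 0.0], 8000, 500, 2500)
tinted = tint_pixel((200, 100, 50), low, mid, high)  # (200, 50, 0)
```

Moving `low_cross` and `high_cross` changes which bins feed each color, which is the same effect as changing the crossover frequencies in the patch.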

Assignment 3: Modern Electroacoustics

The original sound is a lecture about modern electroacoustics. I recorded balloon pops in the back of the Philip Chosky Theater in its Projection Bay and in a four-story stairwell. I took the second impulse response, added a wah-wah effect in Audacity, and reversed it. Finally, I took a clip from the beginning of Steely Dan’s Babylon Sisters. I really like the flutter from the Projection Bay, and I think Babylon Sisters is interesting because at some points you can hear the song clearly, and at other points you can hear the audio track more clearly, though never perfectly. That back and forth is very interesting to me.

Assignment 2 — Delay Party!

I have created a patch that can work with any audio file or a live input, but this patch is set up for a stereo environment. The left channel goes into four state-variable filters: a low pass at 250 Hz, a bandpass at 1 kHz, a bandpass at 4 kHz, and a high pass at 8 kHz. The output of these filters goes into live.gain~ objects, where the amplitude level at any moment is sent into a sub-patcher that slides the amplitude and then scales it to get a delay time. Through early iterations, I found that while the sound file is playing, the level spends the majority of the time between -30 and -10 dB, so that is what I based the scale object on. Because levels outside that range would give me negative delay times, I take the absolute value of the output of the scale object. This is then sent into a line~ object that smooths it over 50 ms. The only difference between the left and right channels of audio is that for the left channel -30 dB gives a delay time of 10 ms and -10 dB gives a delay of 1000 ms, while for the right channel -30 dB gives 1000 ms of delay and -10 dB gives 10 ms.
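The level-to-delay mapping described above can be written out as plain arithmetic. This Python sketch assumes the scale object does a straight linear mapping without clipping, which is what allows negative results for levels outside the -30 to -10 dB range; the function names `scale` and `delay_ms` are just for illustration, and the slide and line~ smoothing stages are left out.

```python
def scale(x, in_low, in_high, out_low, out_high):
    """Linear range mapping with no clipping at the ends."""
    return out_low + (x - in_low) * (out_high - out_low) / (in_high - in_low)


def delay_ms(level_db, channel):
    """Map a level in dB to a delay time in ms; left and right are mirrored."""
    if channel == "left":
        raw = scale(level_db, -30.0, -10.0, 10.0, 1000.0)
    else:
        raw = scale(level_db, -30.0, -10.0, 1000.0, 10.0)
    # Levels outside -30..-10 dB can map below zero, so take the absolute value.
    return abs(raw)


quiet_left = delay_ms(-30.0, "left")     # 10.0
loud_left = delay_ms(-10.0, "left")      # 1000.0
quiet_right = delay_ms(-30.0, "right")   # 1000.0
below_range = delay_ms(-31.0, "left")    # raw value is -39.5, so 39.5
```

The last line shows why the absolute value is needed: a level just 1 dB below the expected range already maps to a negative delay time before it is corrected.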

Assignment 1 – Am I Pretty Yet?

Inspired by another project I am working on, I decided to utilize an online beauty tool which is designed to make your photos “more attractive.”  The best one I found for my purposes is There is an option to alter the face based on the gender binary, and to screw with the site more, I selected female. I provided the site with an image, got an “improved image,” and fed that image back in. Below is the original image followed by 20 iterations.

I am so impressed with the results. I was surprised at how quickly it closed up to my nose. Through the process, it asks whether the face is male or female, and gives a default based on what the site thinks. I had to correct the site to female, but only for the first 3 iterations. After 3 it began recognizing it as female, until iteration 17, when it began seeing it as male again. The other interesting artifact is that the process failed to place the watermark in the same place on every image.