Category Archives: Assignments

Project 1: Baad

The concept for this project was inspired by Rational Acoustics' Smaart, software built to assist in normalizing loudspeaker systems. From the name, Baad, you might be able to tell how it went. From the beginning I had the concept, but the execution just never got there. The main feature I was going for was a spectroscope showing the difference between the output from the program and the input coming back into the program from a microphone after passing through the loudspeaker system. To get there, a few other pieces needed to be in place. First, calculating the delay time from the point the audio leaves the software to the time it returns via the sound system. Second, averaging the amplitudes of individual frequencies over some period of time, both to smooth the information and to calculate the similarity between values, giving what is known as the "confidence." Finally, applying octave smoothing to make the amplitudes more readable for the purpose of applying an EQ to the information. 
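For reference, octave smoothing is just averaging each FFT bin with its neighbors inside a fractional-octave window centered on it. A minimal Python sketch of the idea (my own illustration, not part of the patch):

```python
import math

def octave_smooth(mags, fs, n_fft, fraction=3):
    """Smooth FFT bin magnitudes over 1/fraction-octave windows.

    mags  : list of linear magnitudes for bins 0 .. n_fft//2
    fs    : sample rate in Hz
    n_fft : FFT size that produced the bins
    Returns a smoothed list the same length as mags.
    """
    half_bw = 2 ** (1.0 / (2 * fraction))  # half-bandwidth ratio per side
    out = []
    for k in range(len(mags)):
        fc = k * fs / n_fft                # center frequency of bin k
        if fc == 0:
            out.append(mags[0])            # DC bin: nothing to average
            continue
        lo = max(int(math.floor(fc / half_bw * n_fft / fs)), 0)
        hi = min(int(math.ceil(fc * half_bw * n_fft / fs)), len(mags) - 1)
        window = mags[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out
```

With fraction=3 you get 1/3-octave smoothing, the resolution most graphic EQs use; since the window widens with frequency, the display smooths the highs more than the lows.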

Here is where the issues came in. In its simplest form, what I am looking for is the spectral analysis of the impulse response of the audio system. I began by putting the direct and received signals into a pfft~, doing a cartopol~ and subtracting the amplitudes, but this did not give me anything close to what I was looking for. After a few variations on this, what finally gave me what I wanted was dividing the real parts of the two signals and the imaginary parts of the two signals, then putting that into an ifft~ to give me the impulse response signal. Great. I've got that part. Now I need to plot it. I found the vectral~ object, which is meant to plot FFTs. The issue is that when I plot it, it doesn't give me the information I want. At this point, I don't think I ever fully accomplished the first step: once I got here, I played around with the console's EQ to see if my plot would react, and it did not.

On the way to the vectral~ object I ended up in jitter world for a while, seeing if I could create a matrix with 512 columns and 83 rows, use a jit.gen object to average each individual column, and then plot the resulting list. The problem was getting the information into the matrix: there was no way fast enough to insert information into a new column every sample. From there I looked at nested loops in JavaScript, but my issue was not processing the lists, it was creating the lists in a way that I understood how to process. I thought about the capture~ object, but saving and then reading a text file would add a lot of latency to the process, and it sounded hard to update as rapidly as I needed. 
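For the record, the per-bin division of two spectra (the textbook transfer-function estimate H = Y/X) cross-multiplies the real and imaginary parts rather than dividing them independently, which may be part of why my plots misbehaved. A minimal Python sketch of one bin, mirroring how fft~ in Max exposes a spectrum as separate real and imaginary signals:

```python
def bin_divide(yr, yi, xr, xi, eps=1e-12):
    """Per-bin complex division H = Y / X from separate real/imag parts.

    yr, yi : real and imaginary parts of the measured (received) bin
    xr, xi : real and imaginary parts of the reference (direct) bin
    eps guards against division by zero on silent bins.
    """
    denom = xr * xr + xi * xi + eps      # |X|^2
    hr = (yr * xr + yi * xi) / denom     # Re(H)
    hi = (yi * xr - yr * xi) / denom     # Im(H)
    return hr, hi
```

Running every bin through this and then through an inverse FFT yields the impulse response; an averaged version of the same cross-terms is also what Smaart-style "confidence" (coherence) is built from.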

All of the research I did online on the issue led back to Matlab, which I have never used, and it only covered comparing two pieces of recorded audio, not a continuous signal, and that is an issue. 

I did a lot of reverse engineering of a patch Jesse gave me, but it contained so much processing that was unnecessary for my purposes that when I tried to keep only the pieces I needed, there was not much left for it to function. That being said, this is basically where I ended up. I have attached a Google Drive .zip file containing a folder with all of my patchers, one for every direction I went; I just never ended up in the right place. It was incredibly frustrating, because I feel like I understand the concept and what I want it to do fairly well; for most of my iterations I also know why they do not work, in the sense that it just doesn't work that way and there must be some better way to do it. I saw a lot in Jesse's file that I liked and that started to make sense, but I didn't understand enough of it early enough to make any adaptation or headway in any way, shape, or form. The one thing I did accomplish, which was incredibly simple, was to set up bandpass filters at different sample rates to get more FFT resolution in the lower frequencies and less in the higher frequencies. 

Some other features I found along the way that I would like to implement if I get this working in the future (which I hope I do): taking the mouse position on the plot and scaling it to show the frequency and amplitude under the cursor, and having the program recommend EQ changes for either a parametric or graphic EQ. The former would be incredibly easy to implement. The latter, however, would be a different story: it would need a lot of user input, and then trial-and-error processing by the software to find what frequency, gain, and Q settings would flatten the response. I would not want it to apply any EQ itself, just provide the information for you to do it.


Project 1: KeyboardJohnny

For a while, I've wanted to get some more experience in JavaScript. Whether it's because I see money in web development or because I am drawn to masochistic scoping and global variable practices… I'm not quite sure. Regardless, I saw this project as a good opportunity to flex my fledgling js muscles and make some dots fly around.

The product is an audio visualizer composed of two visual systems: a point cloud (a bunch of dots floating around according to a noise function generator, connecting to one another with line segments whenever they are within a certain distance) and a particle generator, called a particle jet because I want it to be. These systems are, with the exception of the basis function generator and the matrix used to calculate the point cloud positions, entirely contained within JavaScript classes. Heavy thanks to Amazing Max Stuff for teaching me how to make these systems.
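The point cloud's line-drawing rule (connect any two dots closer than a threshold) boils down to a pairwise distance check. Sketched here in Python for brevity, though the patch does this in a JavaScript class:

```python
import math

def connections(points, threshold):
    """Return the index pairs of points closer together than threshold.

    points    : list of (x, y) tuples
    threshold : maximum distance at which a line segment is drawn
    """
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):  # each unordered pair once
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) < threshold:
                pairs.append((i, j))
    return pairs
```

The naive double loop is O(n²), which is fine for a few hundred dots per frame; driving the threshold from the audio analysis is what makes the web of lines pulse with the music.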

Informing these two systems is an amalgamation of concepts we've covered in class. Starting with an audio signal, I used cascade~ objects to filter it into two frequency bands, one roughly representing the bass of the song and the other supposedly representing the vocals but, in reality, just vaguely representing the treble portion of the song. Once separated, I fed the two signals into FFTs, packed the bins into a matrix, and used the average values to calculate the parameters for the point cloud (radius and line-drawing threshold) and the particle jet (rate of movement/emission and color). The point cloud grows whenever there's a bass kick and the particle jet spins in circles around it; it's all quite fun. Here's the gist!
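The bin-averaging and parameter mapping can be sketched like this (a Python illustration; the function names and the example ranges are my own placeholders, not values from the patch):

```python
def band_average(mags, lo, hi):
    """Mean magnitude of FFT bins lo..hi (inclusive)."""
    band = mags[lo:hi + 1]
    return sum(band) / len(band)

def scale(x, in_lo, in_hi, out_lo, out_hi):
    """Clamped linear map from one range to another, in the spirit of
    Max's [scale] object (though [scale] itself does not clamp)."""
    x = min(max(x, in_lo), in_hi)
    return out_lo + (x - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)

# e.g. average bass-band energy -> point cloud radius
radius = scale(band_average([0.0, 0.8, 0.6, 0.1], 0, 1), 0.0, 1.0, 10.0, 100.0)
```

Every visual parameter (radius, threshold, emission rate, color) is just a different `scale` applied to a different band average, which is why a bass kick reads directly as a growing cloud.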

And that's everything! Because there are a bunch of classes and scripts that go along with my patch, I've uploaded the whole thing to a GitHub repository here – but beware!! There are a bunch of values woefully hardcoded to make the visuals match Blood Brother by Zed's Dead, DISKORD, and Reija Lee, and no shiny GUI to change them as of yet. But I did include the audio file in the repository (I hope that's not illegal), so there's that.

And finally, here are my dots dancing to the aforementioned song! Please excuse the audio quality; it's early in the morning.

Project 1: Buy U a Drank

For this project I wanted to dive into something I've been curious about for years now: auto-tune. I was never quite sure how exactly the effect was achieved, so I aimed to mimic it to the best of my abilities and potentially make it more suitable for live work.

That being the case, I traversed a plethora of pitch-detecting and pitch-shifting Max objects (pitch~, sigmund~, gizmo~, pitchshift~, fzero~, etc.) and finally came upon retune~, which I found sufficient for what I was aiming to do. It allowed me to get a decent estimate of incoming frequencies and shift them to match certain scales. To do so, I had to map out intervals in cents and create lists that the program could access easily so that the latency wouldn't get too bad.
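The core of the retuning step is snapping a detected frequency to the nearest note of a chosen scale and reporting the shift in cents. A rough Python sketch of that logic (my own illustration; retune~'s internals are certainly more sophisticated):

```python
import math

A4 = 440.0
MAJOR = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets from the tonic

def snap_to_scale(freq, tonic_midi=60, scale=MAJOR):
    """Snap a detected frequency (Hz) to the nearest note of a scale.

    Returns (corrected_frequency_hz, correction_in_cents).
    tonic_midi=60 with MAJOR means C major.
    """
    midi = 69 + 12 * math.log2(freq / A4)       # frequency -> MIDI pitch
    # candidate scale notes across nearby octaves
    best = min(
        (o * 12 + tonic_midi + s for o in range(-6, 6) for s in scale),
        key=lambda n: abs(n - midi),
    )
    corrected = A4 * 2 ** ((best - 69) / 12)    # MIDI pitch -> frequency
    cents = 1200 * math.log2(corrected / freq)  # how far we had to pull
    return corrected, cents
```

Applying the correction instantly, rather than gliding to it, is exactly what produces the rigid T-Pain character the sonogram makes visible.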

I also decided to add a couple of features to this core idea. I allowed for the autotuning of a file or of the microphone input on the computer. In addition, I plugged in my assignment 4 patch to let me record and manipulate some newly autotuned audio. Finally, I added a sonogram to highlight the rigidity of the processed audio, and two nsliders displaying the original note of the signal next to the corrected note after processing.

Below is a recorded demonstration of the autotuning effects (the manipulation of recorded data within the patch will be demonstrated in class). Apologies for not actually attempting to sing Buy U a Drank; I figured even with auto-tune I can't really be T-Pain…

And to go with it, here’s the patch!

Project 1: Particles

For project 1, I decided to create a system of particles that can be moved and manipulated. In a way, it works as an extension of assignment 4, where I created a system that does processing in the frequency domain using [pfft~] to allow the visual elements to react to auditory ones.

In short, the project functions as a world of particles that are constantly in motion. In creating their movements, various elements such as time, position, mass, and velocity were taken into consideration.

Initially, the particles had been designed to be attracted to, and completely dependent on, the position of the mouse. However, having always been fascinated by the degree of control I have, and lack, in the process of creation, I decided to construct a project that not only reacts to my decisions but also contains an element that is uncontrollable, one with a sense of agency.

While the Y values (vertical movements) are controlled by the mouse position, the X values (horizontal movements) are designed to react to any audio file added, in a patch similar to the one used for assignment 4.
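The motion model described above (mass, velocity, and forces from the mouse and the audio) amounts to one Euler integration step per particle per frame. A minimal sketch with hypothetical names, however the patch computes it internally:

```python
def step(pos, vel, force, mass, dt=1.0):
    """Advance one particle by a single Euler integration step.

    pos, vel, force : (x, y) tuples
    mass            : particle mass; acceleration = force / mass
    dt              : time step (one frame)
    Returns the new (pos, vel).
    """
    ax, ay = force[0] / mass, force[1] / mass      # Newton's second law
    vx, vy = vel[0] + ax * dt, vel[1] + ay * dt    # integrate velocity
    return (pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy)
```

Here the Y component of `force` would come from attraction toward the mouse, and the X component from the audio analysis, giving the mix of control and agency the project is after. Heavier particles (larger `mass`) respond more sluggishly to the same force.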

Here are some of the still shots that were created.



Project 1: These sounds look nice.

For my first project, I created a Max patch which can take an input video and synthesize a corresponding audio file whose waveform representation looks like the video.

To convert a video to a waveform representation, a few things must happen. First, the video needs to be simplified; I use edge detection for that. The edge-detected video then needs to be converted to matrices of sine values: a scope will take these values and plot them as x and y coordinates in a signal. Since our video is edge-detected, we need only look for "visible points" and then determine their sine values, which correspond to their positions in the matrix. More concretely, the top-right of the screen is x=1, y=1 and the bottom-left is x=-1, y=-1. Once we have our list of values, we need to order them so that the output looks "correct." This is done in Python, as it allows for more complex manipulations. The ordered matrix is then written to a jxf file for later use in Max. That matrix represents one frame of our video, as audio.
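The coordinate mapping from pixel space to scope space can be written down directly. A small Python sketch (my own illustration), assuming the stated convention that the top-right of the screen is (1, 1) and the bottom-left is (-1, -1):

```python
def pixel_to_scope(x, y, width, height):
    """Map a pixel position to scope coordinates.

    Pixel (0, 0) is the top-left corner and maps to (-1, 1);
    pixel (width-1, height-1) is the bottom-right and maps to (1, -1),
    so top-right is (1, 1) and bottom-left is (-1, -1) as required.
    """
    sx = 2 * x / (width - 1) - 1   # left..right  ->  -1..1
    sy = 1 - 2 * y / (height - 1)  # top..bottom  ->   1..-1 (flip Y)
    return sx, sy
```

Each visible edge pixel becomes one (x, y) sample pair; ordering those samples well is the hard part, since a scope draws them in sequence and a bad ordering scribbles lines across the image.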

Interfacing Max with Python required a bit of creativity. I wound up using OSC to send messages between Python and Max, with most real data being sent in the form of matrices saved to jxf files. The exception to this is the patch's playback function, which has Python sending many read values to Max under very strict timing.
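Under the hood, an OSC message is just null-padded strings plus big-endian arguments, which is why it works so cleanly between Python and Max's udpreceive. A stdlib-only sketch of the wire format (the project itself uses the Python-OSC library, which handles all of this for you):

```python
import struct

def osc_pad(s):
    """Null-terminate a string and pad it to a 4-byte boundary (OSC rule)."""
    b = s.encode() + b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def osc_message(address, *floats):
    """Encode an OSC message whose arguments are all float32.

    Layout: padded address, padded type-tag string (',' plus one 'f'
    per argument), then each float as 4 big-endian bytes.
    """
    tags = "," + "f" * len(floats)
    return (osc_pad(address)
            + osc_pad(tags)
            + b"".join(struct.pack(">f", f) for f in floats))
```

Sending `osc_message("/frame", 1.0)` over a UDP socket to the port a [udpreceive] object listens on is all the transport layer there is; the strict-timing playback case is just many such small packets in a row.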

Rendering the video to audio takes a long time: around one second of 24 fps video takes one minute to render. I've included a short video and its corresponding audio and representation. My project zip also includes a scope patch to allow for dependency-free viewing on any computer with Max.

The patch itself has several requirements, with the primary ones being Python-OSC, Python 3, and xray.jit.sift. These are either included or else explained in the README of my project.



My project can be downloaded here:


Project 1 – Speech Analysis

For Project 1 I wanted to do something similar to the speaking piano on YouTube by deconstructing and reconstructing speech. In particular, I thought it would be interesting to deconstruct speech in different languages since different languages just sound so fundamentally different. I found a Max object that would do most of the technical work for this and went ahead and tried it out on different vowel sounds and audio clips of singing/speech, using different types of waves for the reconstruction (sine, triangle, square, sawtooth).
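Reduced to its core, the reconstruction technique is additive resynthesis: estimate the strongest partials of a slice of speech and replay them on a bank of oscillators. A minimal Python sketch of the resynthesis half (my own illustration; the Max object I used handles the analysis too):

```python
import math

def resynthesize(partials, sr=8000, n=64, wave=math.sin):
    """Additive resynthesis: sum a bank of oscillators, one per partial.

    partials : list of (frequency_hz, amplitude) pairs from analysis
    sr       : sample rate in Hz
    n        : number of output samples to generate
    wave     : periodic function of phase (swap in a triangle, square,
               or sawtooth function to change the timbre, as I did)
    """
    return [
        sum(a * wave(2 * math.pi * f * t / sr) for f, a in partials)
        for t in range(n)
    ]
```

Swapping `wave` is exactly what changes the character of the reconstructed speech: sine partials sound hollow and bell-like, while square or sawtooth partials smear extra harmonics over every detected one.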

Then I loaded them all up in Audacity, time stretched some of them, and pieced together a short clip that includes some of the outputted audio:

My goal here wasn’t to reconstruct the speech into something comprehensible, but rather to explore the sinusoidal makeup of speech.


Main patch:


Child Patch 1:


Child Patch 2:

Project 1: Video Sonification

For this project, I decided to take camera input and use different characteristics of the video to generate sound. I split the input into 4 matrices, took different characteristics of each of the 4 matrices, and mapped them to 4 different sounds. The characteristics I used included the hue, saturation, and lightness values as well as the RGB values of the images. I also wanted to add a visual component, so I used jit.sobel for edge detection on all 4 parts: 2 of the parts remain as the raw output of the edge detection, while the other 2 have the edge-detection output blended with the video, with the brightness and saturation varying based on the original video input. I then stitched the parts back together to get a cool distorted image to go along with the eerie sound that is generated.
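As an illustration of the kind of characteristic-to-sound mapping involved, here is one plausible way to turn a normalized brightness value into a pitch. The exponential curve and the frequency range are my own choices, not necessarily what the patch does:

```python
def brightness_to_freq(b, f_lo=110.0, f_hi=880.0):
    """Map normalized brightness (0..1) to a frequency in Hz.

    The mapping is exponential so that equal brightness steps land on
    equal pitch intervals, which tends to sound more natural than a
    linear frequency ramp.
    """
    b = min(max(b, 0.0), 1.0)          # clamp out-of-range readings
    return f_lo * (f_hi / f_lo) ** b   # geometric interpolation
```

The same pattern applies to the other characteristics: hue, saturation, or a mean RGB channel each reduce to one number per frame that gets remapped onto some synthesis parameter.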

Here is my patch:

Here is a video that shows the patch working:

Project 1 – Do you like my car?

For my project I looked into how Max can benefit the Unity game engine as a separate sound engine. I did this by using an OSC (Open Sound Control) plugin in Max and a corresponding script in Unity.

The goal of this project was to show that input in Unity's game world could control audio output in Max. The game is a demo in which the player drives a car. I'll detail each of the 3 control methods below…


1: Speed

The speed of the vehicle directly correlates to the playback rate of the audio. This is done in Max with the groove~ object.


2: Pan

Turning the vehicle pans the audio in the respective direction. The horizontal input from the controller is passed directly to Max and then handled with the pan2S object.
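Stereo panning is usually done equal-power so the perceived loudness stays constant as the sound moves across the field. A small Python sketch of that idea (my own illustration; I don't know the internals of pan2S):

```python
import math

def pan(x):
    """Equal-power pan law.

    x : pan position in [-1, 1], where -1 is hard left and 1 is hard right.
    Returns (left_gain, right_gain) with left^2 + right^2 == 1, so total
    acoustic power is constant at every position.
    """
    theta = (x + 1) * math.pi / 4   # map [-1, 1] onto [0, pi/2]
    return math.cos(theta), math.sin(theta)
```

At center (`x = 0`) both channels sit at about 0.707 (−3 dB) rather than 0.5, which is what prevents the "hole in the middle" a naive linear crossfade produces.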

3: Song Select

By clicking on the vehicle, the user can switch between songs. This translates to a gate~ in Max that changes the route of the signal produced by Speed. With the current system, Max remembers where you left off in a song after you switch away from it. I find it a fun comedic effect.

Here’s an image showing where Unity is sending information to Max:

Here is a link to a playable demo (download the whole folder):

And here is the accompanying patch:

<pre><code> ———-begin_max5_patcher———- 2091.3oc6a08aiiaD+4j+JDD5id0xY32G5Cs.scQQQ+.WVzWNbXghshWs0Vx UVdub8vs+s2gTVN1wxxz1JduEHODiHRZMjyueyvYFR+K2dS78kOlsLN56h9g nat4Wt8la7M4Z3l0OeS77zGGOKcoeXwEqleeVU7nltnmJWUOKq12Itt07I9g Vd+mdiT2NzEoUoyypyp9PVQ58yxbCgstul2Q8OuHqYpDGOJJ99zhowQ+3lud 83OlWL8CUYiqaFkPkvzJKH0BlUozV8nHjyRTRKminVxsLEJGEIYILpK5yMuN ZhmWzNuAWa+5s259XTn5grehVesKt5rG8Sp3oUkkeN6KQ2UVLUD8GlUVtHBB VagazV6oQVlOsHclSur9+5Sy.BYhzoZrRoToDjZfV+FchEPfY4ZRo.ftesBe ciKS+b1jOPSO50+gz55p76WU2PZtYixgTWoKyGWupH2gZeWjPrFbuI9gxYyJ +ooyJuOcVc17EkaA8tdqlmVTOtrxM+yKK1oWmBLqXRyxxCiwyWtYs2z+x5zp 5CNh4kS75vlIXbaykU4SyI83rrho0eb6ucc93+y1uf1QtYpCNcVauK9X5xr9 95KxI34.qt+6pzY40+bGSu574YKqqxnu65WdzO56xSSeAHq7SmrpFHxJfIBf gBsggZtvyN4fHAMRMqk+9Ji8UF6tLV7zYr3vvX01DTvkFNsmCXsHXIBqDRzR CpYnhZGLu5i8UFqSjqd3grp1.Bd26dWx7E7fYrf4fL1GlUlVGTbRbIEM.nDZ qf3pdhIx0IFh7JYswHQ86ophCFjz5FGWNqrpELzbMJEVFXULsQLhZRhVAw6I wwjFC5ZB0BIyZHYgfTQFKvVB49o67F4FAGd5sNJvl19M9PYQ8x7+mWQQaivd Y.Sdz+He5GqK+KmFfpGB.kh784.JS8b.k7R8JfdB.JF8mx9T5+d0Igm3KCbB ZdhA4JtwBJFoiaPT8qH5FDcb474YE06Ao+927ln+9e7u8mi9quO5t2+O+WcC kr8gRgM9vXjUzjq4NITYSTZFmSMJrREEppKIKiGirGCi1VC31q6Lz.ztWKSm lsmFfkz8ZF5XMeX2Q8RYQtWcfbNo.T.GMcqNTdePhCpMvNzFb0PZguL58qpJ BlDvM8PBTBQhhwLbFYIAn2sKj.RtgaAvoE7M5V.8VtgtLSwsMVXFuMIYIZTs VlMVtvNVyqMpFNsUUOZqNnOb04QeTfIADjWHAmKaJGApfDEHEfVPqUk2KHW1 uhbPoIKos.BmlH6glflFeEfEUnxJ4.YBvIjxXXj2X9lkm4pUVppnueRY3vp3 7f0NKHWmVH7qHvdvUdW3JuOyelLAUj4H8OqgvtWcxugs+uawjv4I34wSHS+8 2MEXTHO.GLTCRo2AKoJwqHQ4fK8tHJPODEoRlX4BPvYaJdU2LEw2tLER+jE8 16VjkMI5see5j7xn2517H+PtQEcnEYw8cB.a9qOxTWJaPSJ6cnRtF4GoXLhe 6psWMYAshyx+bFs2oBB1BEsmmEZmJUxpkb8Ivm1f1feyxemlVm8kHQvTUzDd sCGch0Sry8M6HdHfBaR1m9FGxcBNb.QcQzjmIQyvddjfcU5+iER8.tOvhzB7 tvy+WLLkTtK8fniJJeLeXmCC3gYkz6H7kLefNRWeUo8qe0okpvtGj4U7zcIf 7KgaR.GiZbhoH0kgApuZK9YkoSbkbfj1oTggyzw.zUEW5LFQ84DDksoZXsaE MJvl5opXWxa7k2nUfWSi1tAO2YN7bv6JZ7l83hpne2CPzaoOwvSxQetI4n5P GnY6GXp5quOc3p5SuaUiU5b3Yjn1vQFGveC6aGtHe6ct9Qju+MW5J5deR53u Db5up3SKTFxGsE4LlXqCnhyFdZe.Wa.3ju1.CTHdB89Q4yoPYd8VC75sFnuy 
kDhpxqsrS6XleYt1.ctENHe8flO6MdkW28cwuhgkQIVmMK7Br12owJkXBPtQ A.2bZJ.sehVQydsP.aUGP2QQxsTfEZWBitBIAz9SzGfjCVCJPv1Ous0OU6yj huAeH21stansXRm.anh6Rkc7NfXza0rqI0Ns.6REcuObSIyFs0GaOpEUkKJq 138JQ9hiKBVO3Bm6z2LoBj35KxoykLYswInRaQ65iMAnXcHJmg.Htwsumo4F HYUZJFglK7vPCJlmpvH4NoQaBamNlrylNHnn1+Kq1VBf2eC7bpvWY3CNQyJT pbtGPtyPowPC8maIHL9abDYC0djlCMhI22lIjlFVyH+fhmkW77emDdEnq8cA mkkqpF25nt8m.QzSxZBE8.EzRqv9gmJubzgpDPvRJ.Aw0CffDgrjb4Tt0flm OYQYdQ8ZsHG8+BE14Fj.TTNjSZi0RbPZ6HMW3BunieKCcLxKcIEhxyUEksFz Ka8eBdligB6vkJoPDjrOTmblHedNPBzW.OJLD4d2ZhlfJunIM+T3pWljTgXn OHRJDjPNDV57PHWxgPPTbEQ3wHWlgRRvwjDenjzQMXfAPRty483RhMDRxDhe xgRP7iYMoGJIcLpGpFJIcLpGhCgjzWruZ2Fu6ksHsw0gbV665Rm1pKeZiTNH 68yISbv8X7ccoS6PBd.GDqhP77qFJAAAHnKMnBjGhtaPbRhWL4xXRz68K+Ra VysXa+v9Ay.1KmnAgRztTXABAVfg.V.Hz7i1cM0jXV5hEeNqZ45Q6EBke7mZ hK2Lx+XdQyi9jLiqx9bd6384hFmVQYBWSoAuppofzOpZp4uOm0phU4qq4Es7 HQ5y8tHcd1xEoiWWAaJE8a+0a++.YFK0w ———–end_max5_patcher———– </code></pre>

Project 1: Twisted Rhythm Game

Hello! For my first project this semester, I decided to make a spooky, Halloween-themed rhythm game. I built it using LeapMotion. The main object of the game is to score as few or as many points as possible. The player accumulates points by positioning the pumpkins above the skulls right before the skulls "pop" and fade into the background. The pumpkins are controlled by a LeapMotion hand sensor, which in this context tracks the palm position of each hand. The player can use both hands to guide the pumpkins separately. There are a couple of easter eggs hidden in the game that trigger during certain sections of the streamed music, and when the player exceeds a certain number of points. Below is a video in which I demo the game:

Below are the two main Max patches I wrote to create the game:

Here is the second patch:

Below is a link to a zip file with all the relevant files for this project:

Project 1 – Brightness

For this project, I took the brightness data from the camera and split it into four different sections. This patch is going to be used as an installation in a gallery: it takes the natural light from outside and translates it into audio and video. I used the data to trigger or control different things, such as modifying the speed of the music, turning the music on and off, changing the color, etc.

The camera's brightness analysis gives me numbers, which I separated into four sections in order to control different parts of the project. The first section generates sound with different harmonies. When the numbers move into the second section, the color changes based on the numerical modifications. The third section triggers the electronic music, and its speed is modified based on the numbers obtained from the brightness.
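The "four sections" logic is just range-bucketing of a single brightness number. A minimal sketch of the idea (the threshold values here are my own placeholders, not the patch's actual split points):

```python
def section(brightness, thresholds=(0.25, 0.5, 0.75)):
    """Return which of four sections a normalized brightness falls in.

    brightness : value in 0..1 from the camera analysis
    thresholds : three ascending split points dividing 0..1 into
                 four ranges; returns 1, 2, 3, or 4.
    """
    for i, t in enumerate(thresholds):
        if brightness < t:
            return i + 1
    return len(thresholds) + 1
```

Each returned section number would then gate a different behavior (harmony generation, color change, music trigger and speed), so as the natural light shifts over the day the installation drifts between modes on its own.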