Author Archives: ykang1@andrew.cmu.edu

Project 2 – Sound Spatialization

For Project 2 I wanted to take advantage of the Media Lab’s 8-channel sound system to create an immersive experience for a listener. Using the HOA Library, I generate Lissajous patterns in sonic space and also let the user control the exact placement of one of the three sounds.
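The Lissajous trajectories themselves are just two sinusoids at different frequency ratios driving the x and y coordinates. The actual patch does this in Max; here is a minimal Python sketch of the idea (the function name, the 3:2 ratio, and the phase offset are my own illustrative choices, not values from the patch):

```python
import math

def lissajous_position(t, a=3, b=2, radius=1.0):
    """Point on a Lissajous curve at time t: two sinusoids with
    frequency ratio a:b drive the x and y coordinates."""
    x = radius * math.sin(a * t)
    y = radius * math.sin(b * t + math.pi / 2)
    return x, y
```

Sweeping t continuously moves the sound source along the figure; different a:b ratios trace different lobed patterns around the listener.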

In order to emphasize the movement of the sounds, I also have uplighting for the loudspeakers: each sound corresponds to red, green, or blue, and the moving-average amplitude of the signal coming out of a speaker dictates the color values for that speaker’s light.
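The amplitude-to-color mapping is conceptually simple. A hedged Python sketch (the linear scaling and 0–255 range are assumptions of mine; the real mapping lives in the Max patch):

```python
def amplitude_to_level(samples, max_level=255):
    """Map the moving average of a block's absolute sample values
    to a light channel value in [0, max_level]."""
    if not samples:
        return 0
    avg = sum(abs(s) for s in samples) / len(samples)
    return min(max_level, int(avg * max_level))
```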

The sounds that are played include sounds made from granular synthesis using parameters based on an accelerometer sent through a Raspberry Pi as well as other effects applied to audio files controlled by a Seaboard (done by Ramin Akhavijou and Rob Keller).

This Google Drive folder includes all of the Max Patches for our project.
https://drive.google.com/open?id=1WZH1nr-ARBmZOF9gPrks3_Oh1Q5mJTS8
The top-level patch that incorporates everything with a (somewhat organized) presentation view is main.maxpat. Most of my work (aside from putting together main.maxpat) is in ambisonics.maxpat, which in turn has several subpatches and sub-subpatches and so on.
ambisonics.maxpat receives the sounds and sound position coordinates, and outputs data to the speakers and lights. The positioning of an individual sound is handled in poly voicecontrol. placement.maxpat calculates the position for a sound (using coord.maxpat), and spatialize.maxpat contains calls to the HOA Library to calculate the signal that should come out of each speaker channel. These signals are sent to poly light, which calculates each light channel value and writes it into the appropriate cell of a global matrix. The global matrix is read back in ambisonics.maxpat and sent to the lights.
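For reference, the core of what the HOA Library computes inside spatialize.maxpat can be sketched in Python: 2D ambisonic encoding of a source azimuth, then a basic sampling decoder for an evenly spaced ring of eight speakers. This is a simplified illustration of the math, not the library’s actual implementation:

```python
import math

def encode_2d(azimuth, order=3):
    """2D ambisonic encoding coefficients for a source at `azimuth`
    (radians): [1, cos th, sin th, cos 2th, sin 2th, ...] up to `order`."""
    coeffs = [1.0]
    for m in range(1, order + 1):
        coeffs.append(math.cos(m * azimuth))
        coeffs.append(math.sin(m * azimuth))
    return coeffs

def decode_to_ring(coeffs, speakers=8):
    """Basic sampling decoder for speakers spaced evenly on a ring:
    project the encoded coefficients onto each speaker direction."""
    order = (len(coeffs) - 1) // 2
    gains = []
    for k in range(speakers):
        az = 2 * math.pi * k / speakers
        g = coeffs[0]
        for m in range(1, order + 1):
            g += 2 * (coeffs[2 * m - 1] * math.cos(m * az)
                      + coeffs[2 * m] * math.sin(m * az))
        gains.append(g / speakers)
    return gains
```

A source placed exactly at a speaker’s azimuth yields the largest gain at that speaker, with the total gain across the ring summing to the source’s level.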

 

Here’s a rough video of our project in action:

Project 1 – Speech Analysis

For Project 1 I wanted to do something similar to the “speaking piano” video on YouTube by deconstructing and reconstructing speech. In particular, I thought it would be interesting to deconstruct speech in different languages, since different languages sound so fundamentally different. I found a Max object that would do most of the technical work for this, then tried it out on different vowel sounds and audio clips of singing and speech, using different types of waves for the reconstruction (sine, triangle, square, sawtooth).
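The Max object handles the analysis, but the resynthesis half of the idea is easy to sketch: sum one oscillator per detected partial, using whichever waveform you like. A stripped-down Python version (the function and its parameters are illustrative, not the Max object’s interface):

```python
import math

def resynthesize(partials, duration, sr=44100, wave="sine"):
    """Rebuild a sound from (frequency, amplitude) partials using a
    chosen oscillator shape, as in additive resynthesis of speech."""
    n = int(duration * sr)
    out = [0.0] * n
    for freq, amp in partials:
        for i in range(n):
            phase = 2 * math.pi * freq * i / sr
            if wave == "sine":
                s = math.sin(phase)
            elif wave == "square":
                s = 1.0 if math.sin(phase) >= 0 else -1.0
            elif wave == "sawtooth":
                s = 2.0 * ((freq * i / sr) % 1.0) - 1.0
            else:  # triangle
                s = 2.0 / math.pi * math.asin(math.sin(phase))
            out[i] += amp * s
    return out
```

Swapping the waveform while keeping the same partials is what gives the sine, triangle, square, and sawtooth reconstructions their distinct timbres.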

Then I loaded them all up in Audacity, time stretched some of them, and pieced together a short clip that includes some of the outputted audio:

My goal here wasn’t to reconstruct the speech into something comprehensible, but rather to explore the sinusoidal makeup of speech.

 

Main patch:

 

Child Patch 1:

 

Child Patch 2:

Assignment 4 – audio visualization discs

I found the mesh distortion technique pretty interesting but I was overall unsatisfied with how… messy it looked. I wanted something a little more coherent so I tried something a little different: distorting several copies of the same object. I chose to use “discs” (really flat cylinders) stacked on top of each other. I scaled them evenly in two directions so they’d stay circular but kept their flatness the same.
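The per-disc scaling can be written as a tiny mapping: one amplitude value per disc drives the two radial axes while the flat axis stays fixed. A Python sketch of that mapping (the names and gain constant are mine, not from the patch, and I’m assuming one amplitude band per disc):

```python
def disc_scales(amplitudes, base=1.0, gain=2.0):
    """One (x, y, z) scale per disc: the radial axes (x, z) grow
    with the disc's amplitude value; y (thickness) stays fixed,
    so each disc stays a circle and keeps its flatness."""
    return [(base + gain * a, 1.0, base + gain * a) for a in amplitudes]
```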

Here is a video. (I added the audio separately from the video, so it might not be perfectly in sync. Also, oops: I used copyrighted music, so hopefully this stays on YouTube.)

I will admit I would have liked to figure out a way to make the discs themselves look nicer but we can just enjoy some Moiré patterns instead 🙂

Here is my parent patch:

Here is my child patch:

Assignment 3 (Yijin Kang)

Original Recordings/Sounds

My two impulse responses were recorded in a racquetball court and in the locker room showers in the UC. I had a friend pop the balloon from the corner of the room and I pointed the Zoom recorder toward the center of the room. The racquetball court’s impulse response had a much longer reverb tail but the locker room showers produced a much deeper ring, which I found interesting.

Racquetball Court:

 

Locker room shower:

 

My two non-IR recordings were water from a shower hitting a shower curtain and a recording from freesound.org of glass breaking. For the water recording, I pushed the shower curtain in toward the water stream and recorded it from outside the shower.

Water:

 

Glass:

 

As my source sound, I used the theme from Phantom of the Opera:

Convolution Results
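Each result below is just the dry recording convolved with the corresponding impulse response. As a reference for what that operation does, here is a direct Python sketch (direct convolution is slow; real tools use FFT-based convolution, but the output is the same):

```python
def convolve(dry, ir):
    """Direct convolution of a dry signal with an impulse response.
    Output length is len(dry) + len(ir) - 1: every input sample
    triggers a scaled copy of the IR."""
    out = [0.0] * (len(dry) + len(ir) - 1)
    for i, d in enumerate(dry):
        for j, h in enumerate(ir):
            out[i + j] += d * h
    return out
```

Convolving with a unit impulse returns the IR itself, which is why popping a balloon in a room captures that room’s reverb.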

Racquetball Court:

 

Locker room shower:

The racquetball court reverb muddles the voices and sounds a bit “thin” compared to the reverb from the locker room shower IR.

Water:

I thought the ending of the piece sounded interesting too, so I threw that in there.

Glass:

The instrumental in the second verse sounded pretty interesting, so I kept that in here.

Other interesting discovery 1

While looking for interesting sounds, I also found that this recording of a crackling fire from freesound produces interesting results with speaking voices and single instruments:

 

Convolved with Alvin Lucier:

Sounds pretty creepy!

 

Convolved with the beginning of Liszt’s La Campanella:

(original audio source: https://www.youtube.com/watch?v=H1Dvg2MxQn8)
This piece does get pretty crazy toward the end, but never this crazy.

Other interesting discovery 2

I thought that the convolution with the glass breaking would produce interesting results on other sounds, so I tried convolving that with speaking voice, piano, and percussion.

Convolved with Alvin Lucier:

Sibilants like s’s seem much more pronounced.

Convolved with piano:

The high notes really stand out here.

Convolved with castanets (from https://freesound.org/people/InspectorJ/sounds/411457/)

The castanets almost sound…metallic?

Assignment 2 – Delay with Pitch Shift, Ring Modulation

I started with simple feedback delay on audio signals and then tried to find other effects to apply to the delayed signal (like the pitch-shifted example we saw in class). I tried things like filters and down-sampling, but these weren’t quite as interesting: they’re, for the most part, idempotent, so the 2nd, 3rd, 4th, etc. echoes wouldn’t sound too different from the first. In the end, I went with pitch shifting + ring modulation.
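The structure is a feedback delay line where each echo passes through ring modulation before re-entering the loop, so every successive echo gets transformed again. A minimal per-sample Python sketch (the parameter names and routing are my paraphrase of the patch, not its literal contents, and it shows only the ring-mod half, not the pitch shift):

```python
import math

def delay_with_ring_mod(signal, delay=4410, feedback=0.5,
                        carrier_hz=200.0, sr=44100):
    """Feedback delay where the delayed signal is ring-modulated
    (multiplied by a sine carrier) before being mixed out and
    fed back into the delay line."""
    buf = [0.0] * delay          # circular delay line
    out = []
    for i, x in enumerate(signal):
        delayed = buf[i % delay]
        # ring modulation: multiply the echo by a sine carrier
        modulated = delayed * math.sin(2 * math.pi * carrier_hz * i / sr)
        y = x + modulated
        buf[i % delay] = x + feedback * modulated
        out.append(y)
    return out
```

Because the modulation sits inside the feedback path, the 2nd echo is modulated twice, the 3rd three times, and so on, which is exactly what keeps the repeats from sounding identical.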

Here’s an example on a clip from Spongebob (might have to turn volume up a little):

And here’s the patch:

Assignment 1 – Image Rotation

The other day in a computer science class, I came across a funky-looking function that made my head spin trying to understand it. I wanted to emulate that feeling by taking an image of the formula and using Python’s image rotation to rotate it 41 degrees at a time. I chose 41 degrees because I wanted an angle that does not divide 360 degrees evenly. Pretty soon, we start to see the effects of resampling the image at every rotation. We also see the edges of the image rounding out as corners are clipped off at each rotation.
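My project code uses Python’s built-in rotation, but the nearest-neighbor case is easy to sketch by hand, and doing so shows where the accumulating damage comes from: each destination pixel inverse-maps to a non-integer source position, gets snapped to the nearest pixel, and anything that maps outside the frame is clipped to the background. (The function below is an illustration, not my actual code.)

```python
import math

def rotate_nearest(pixels, degrees):
    """Rotate a square image (2D list of values) about its center,
    resampling with nearest-neighbor. Destination pixels whose
    inverse-mapped source falls outside the image are filled with 0,
    which is what clips the corners off a little on each rotation."""
    n = len(pixels)
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    cx = (n - 1) / 2.0
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # inverse-map the destination pixel into the source image
            sx = c * (x - cx) + s * (y - cx) + cx
            sy = -s * (x - cx) + c * (y - cx) + cx
            ix, iy = round(sx), round(sy)
            if 0 <= ix < n and 0 <= iy < n:
                out[y][x] = pixels[iy][ix]
    return out
```

Because the snapping happens fresh on every iteration, the errors never cancel out; 30 rounds of a 41-degree rotation compound 30 rounds of rounding.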

This video shows 30 iterations using the nearest pixel’s value for resampling. (The image didn’t seem to change much after 30 iterations.)

Out of curiosity, I also tried playing with the other two possibilities for resampling. This is what the image looked like after 100 iterations of bilinear resampling (linear interpolation in a 2×2 environment):

And this is what it looked like after 100 iterations of bicubic resampling (cubic spline interpolation in a 4×4 environment):

My code: