Author Archives: kpreiser@andrew.cmu.edu

Kinect manipulation — sound synthesis (Project 2)

For our project, Dan and I wanted to do something with the Kinect. In particular, we wanted to play a video game using sensor data from the Kinect. When we ran into issues with this, we decided instead to create an instrument driven by Kinect sensor data.

My contribution was the sound synthesis element.

Our setup was as follows: a Kinect connected to a Windows machine with a license for dp.kinect2 sends sensor data over OSC. We use multiple ports for efficiency and simplicity. On my personal laptop, I read this sensor data and perform sound synthesis.

The first step in my sound synthesis was list parsing. For each body part I read (head, left hand, right hand, left foot, right foot), I am given a list representing X, Y, Z, and certainty. I wanted to use the distances between the body parts to create my instrument, so I made a subpatch to calculate the Euclidean distance between two of them.
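The subpatch itself is a Max patch, but the calculation can be sketched in a few lines of Python. The function name and the sample coordinates below are made up for illustration; the only thing taken from the setup above is the (X, Y, Z, certainty) list layout.

```python
import math

def joint_distance(a, b):
    """Euclidean distance between two joints given as [x, y, z, certainty] lists."""
    # Only the first three entries are coordinates; the certainty value is ignored.
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a[:3], b[:3])))

# Made-up sample readings, not real Kinect data.
left_hand = [0.0, 3.0, 0.0, 1.0]
left_foot = [4.0, 0.0, 0.0, 0.9]
print(joint_distance(left_hand, left_foot))  # 5.0
```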

Next, we did some manual testing to find the range of possible distance values (e.g. standing in front of the Kinect in a starfish pose to get the widest distance between the body parts). With that range known, we could scale the potential distances (0 to X, X being the max) into usable numbers for sound synthesis.

I wanted the distance from each hand to the foot on the same side to control the pitch of one of two separate oscillators (left and right), and the distance from each hand to the head to control the loudness of that oscillator. To make the patch more usable, I have the oscillators round to the nearest fifth, instead of just sliding up and down continuously. To do this, I restricted the pitch offsets to integer multiples of 7 (the number of half steps in a fifth).
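Here is a rough Python sketch of the scaling and the fifth-snapping together. The maximum distance, base note, and number of steps are hypothetical values picked for the example, not the ones in the patch; the MIDI-to-frequency formula is the standard equal-temperament one.

```python
def scale(value, in_max, out_max):
    """Linearly map value from [0, in_max] to [0, out_max], clamping out-of-range input."""
    clamped = max(0.0, min(value, in_max))
    return clamped / in_max * out_max

def fifth_pitch(distance, max_distance=2.0, base_note=36, max_steps=5):
    """Snap a distance to a MIDI note a whole number of fifths above base_note."""
    # max_distance, base_note and max_steps are assumptions for this sketch.
    steps = round(scale(distance, max_distance, max_steps))
    return base_note + 7 * steps  # 7 half steps per perfect fifth

def midi_to_hz(note):
    """Standard equal-temperament conversion (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

print(fifth_pitch(0.0))  # 36
print(fifth_pitch(2.0))  # 71
```

Because the step count is rounded before it is multiplied by 7, small wobbles in the distance don't change the pitch until the hand crosses into the next step.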

Jesse also helped me get it so that the motion speed controlled a lowpass filter. To do this, we used the “t” object to store a float. This allows us to compare each new value to the previous one and take the difference. This speed controls the cutoff frequency of the lowpass filter. I then use the distance from the hands to the head to control the resonance of the filter, keeping it between 0.3 and 0.8.
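The same idea in Python, as a minimal sketch: keep the previous value around, take the difference as a "speed", and map things into safe ranges. The 0.3 to 0.8 resonance range is from the patch; the cutoff range, the maximum speed, and all the names are assumptions for illustration.

```python
class SpeedTracker:
    """Remember the previous value and report how far the new one moved."""

    def __init__(self):
        self.previous = None

    def update(self, value):
        # The first reading has nothing to compare against, so report zero speed.
        speed = 0.0 if self.previous is None else abs(value - self.previous)
        self.previous = value
        return speed

def map_clamped(value, in_max, out_low, out_high):
    """Map value from [0, in_max] to [out_low, out_high], clamping the input."""
    t = max(0.0, min(value / in_max, 1.0))
    return out_low + t * (out_high - out_low)

def cutoff_from_speed(speed, max_speed=0.5):
    # Hypothetical cutoff range of 200 Hz to 8000 Hz.
    return map_clamped(speed, max_speed, 200.0, 8000.0)

def resonance_from_distance(distance, max_distance=1.0):
    # Keeps the resonance between 0.3 and 0.8, as in the patch.
    return map_clamped(distance, max_distance, 0.3, 0.8)

tracker = SpeedTracker()
for position in [0.10, 0.15, 0.40]:
    print(cutoff_from_speed(tracker.update(position)))
```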

I then compress the result and add reverb, cuz why not?

We also decided that the potential pitches for the right hand/leg oscillator should be the same pitches as the left. We considered at one point having each side have different ranges of pitches to allow for more playability, but decided that ease of use and understanding was more important.

If we had more time, it would have been nice to implement some way for the contour of the oscillator to change with some other variable. However, I’m pretty proud of the work we did: our tool is interesting and usable.

The patch:

Project 1 — Bird People

The Second Commandment in Jewish tradition bans drawing/art of anything heavenly or earthly.

Thou shalt not make unto thee any graven image, or any likeness [of any thing] that [is] in heaven above, or that [is] in the earth beneath, or that [is] in the water under the earth.

However, early Israelites wanted to make art! Who doesn’t? To not break the above commandment, many artists drew characters that were a mixture of humans and birds. They were not from this earth, and they were not from the heavens, so they were fair game for early orthodox artists. The most prominent of these examples was the Birds’ Head Haggadah (a Haggadah is a book that a family will read from together during Passover). Other Ashkenazi Hebrew texts drew characters that were neither human nor bird.

Reading this passage during my last Passover, I pondered the fact that bird-human hybrids were used as a way to depict human experiences. I was curious if I could do the same sonically: use bird-human noises to depict human experiences.

For my project, I made a patch that downloads bird noises from online and loads them in for granular synthesis. I then used a combination of the source materials and synthesized bird noises to create an ambient piece, with the intention of capturing the wonders of exploration.

I chose to use the Xeno-Canto database, an online database of bird noises. In the research phase, I found that many unedited bird noises sounded really weird, and almost granular on their own. For example, the woodpecker’s call sounds quite a bit like a 50ms-ish grain repeating over and over. Calls with a “vibrato” sound like short grains repeating. It was very cool to play around with this concept. Because of this, the piece also serves as a challenge: listeners may have trouble knowing whether some clips are doctored or not.

Here is my patch: https://gist.github.com/KenTheWhaleGoddess/dcdb026ae2af56d5d316c1886315f53e

On the top are links to bird sounds. The ULDI object downloads the bird sound when sent the message “download (URL)”. It outputs a message with a 1 if the download is successful, a 0 otherwise. Once the file is downloaded, we go to the appropriate path of the downloaded file and start grain sampling.

For the grain sampling, I used the stutter~ object. I built an interface for granular sampling that takes in a grain size and a sample rate and does the math in an expression I wrote with expr.
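The expression itself lives in the patch and isn't reproduced here. As a hedged illustration of the kind of math involved, this is the usual conversion from a grain size in milliseconds to a length in samples; the function name is made up, and this is not necessarily the expression I used.

```python
def grain_samples(grain_ms, sample_rate_hz):
    """Convert a grain length in milliseconds to a whole number of samples."""
    return round(grain_ms * sample_rate_hz / 1000)

print(grain_samples(50, 44100))  # 2205
```

So the 50ms-ish woodpecker grain mentioned earlier is about 2205 samples long at 44.1 kHz.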

I recorded audio from the patch using the sfrecord~ object. Then, I made a ~3 minute ambient piece to demonstrate the capabilities of my patch, and to explore how I could manipulate and compose with my chosen source material and patch.

HW4 — randomly scaled frequencies

For this homework, I wanted to explore the possibility of scaling each frequency bin in a pfft~ instance independently, and I wanted to incorporate camera grabbing. I found our example of using the mean “difference” in camera objects to be an interesting thing to use. To do this, I created a subpatcher that takes in:

- a frequency bin

- a “width” that limits how far each randomly generated scale can deviate.

- a “random number generator speed” to indicate how often this recalculation should be performed.
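A minimal Python sketch of the per-bin idea, under some assumptions: the random scales are centered on 1.0 and drawn uniformly within the width, and the magnitudes are just a plain list. The patch may center or distribute them differently, and the "speed" input (how often new scales are drawn) is not modeled here.

```python
import random

def random_bin_scales(num_bins, width, rng=random):
    """Return one random gain per frequency bin, each within +/- width of 1.0."""
    # Centering on 1.0 is an assumption made for this sketch.
    return [1.0 + rng.uniform(-width, width) for _ in range(num_bins)]

def scale_bins(magnitudes, scales):
    """Multiply each bin's magnitude by its own scale factor."""
    return [m * s for m, s in zip(magnitudes, scales)]

scales = random_bin_scales(4, 0.25)
print(scale_bins([1.0, 0.5, 0.25, 0.0], scales))
```

A larger width lets some bins drop much further or get boosted much more than their neighbors.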

My top level patcher looks like this:

As you can see, the “width” can be generated from the jitter objects, in addition to with a dial. I got these from the Google Drive.

The end result of this patch was quite interesting. Its effect is most audible when humming a single pitch, and hearing it be scaled randomly. When using the patch, you may notice that certain frequencies will randomly be omitted or amplified.

Something I didn’t expect was the absence of pops when the scale jumps between two values. I assumed I would need a line~ object. But we learned today in class that pfft~ accounts for this, so I actually didn’t need to do that.

Here is my patch for anyone interested!

Assignment 3: Everything I do in this class unexpectedly sounds really cool

For this project, I decided to use Randy Pausch’s Last Lecture as the source material. I took the first minute and a half of his lecture, which I thought had lots of interesting material. It had applause, laughter, and some key phrases I thought I could emphasize in my composition (e.g. “CT scan,” “this sucks”).

I used two impulse responses. My first was a recording of a staircase in Doherty. I predicted that it would make his voice sound distant and omnipresent.

My second was the opening few bars of Audemar by Young Thug. In those bars, he has an interesting way of pronouncing “SLATT,” a slang term used by Bloods (“Slime Life All The Time”). I wanted to give Pausch a semi-rhythmic context, and figured that using a few bars as an impulse response would give that kind of output.

Two others I tried that ended up not sounding super interesting or good:

A hit on my drum pad: since it’s basically noise without much decay, it didn’t have as interesting of an effect as I wanted.
A sawtooth wave at the same pitch as the Young Thug sample: it makes sense that without sufficient noise across the frequency spectrum, only the sound of that pitch really comes through.
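For anyone curious what convolving with an impulse response actually does, here is a bare-bones Python sketch of direct convolution. It is only meant to show the mechanics on tiny made-up lists; real audio tools do this far more efficiently, and I did not write anything like this myself for the assignment.

```python
def convolve(signal, impulse_response):
    """Direct convolution: every impulse-response sample adds a delayed, scaled copy of the signal."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# The 0.5 two samples into the impulse response adds a half-volume copy, two samples late.
print(convolve([1, 2, 3], [1, 0, 0.5]))  # [1.0, 2.0, 3.5, 1.0, 1.5]
```

Seen this way, an impulse response with strong rhythmic hits lays copies of the speech at each hit, which is one way to think about why a few bars of music can impose a rhythm.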

Here is how my Audacity mixer ended up. The top track is Randy Pausch’s speech convolved with Young Thug, the second one is with the hallway recording. I faded the second track in and out to emphasize the words in his speech I wanted to come through. The bottom track is just Young Thug’s first “SLATT” slowed down to be unrecognizable, as an opening to the track. I think it sounds dope. The rhythm actually came through, which was unexpected, and there was a static drone throughout the piece, because of the synth in the back of Young Thug’s sample convolving with noise. I thought it ended up with a pretty somber mood, which was my goal.

Assignment 2: Pitch shifts and accidental overtones

Hey all!

For my second assignment, I wanted to create a Max patcher that would create echoes, but with specified pitch shifts. Then, I decided to do some experimentation, which led to me accidentally discovering cool ways to create overtones just by taking the same audio and changing the delay for L and R.

Here’s the high level view of my patch.

The input to my subpatcher is:

A mono input using ezadc~
Delay L (ms)
Delay R (ms)
Decay (0-1)
Pitch shift (by ratio)

The output is an audio file with echoes. Each echo is scaled by the decay and pitch-shifted by the pitch shift ratio. Each later echo is scaled and pitch-shifted again, which means that we hear multiple successive pitches for each note sung.
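To make the compounding concrete, here is a small Python sketch that lists what the first few echoes of a single note would look like if each pass multiplies the amplitude by the decay and the frequency by the pitch ratio. The function name and the example numbers are made up, and it only models one channel.

```python
def echo_series(freq_hz, delay_ms, decay, pitch_ratio, count):
    """List (time_ms, amplitude, frequency_hz) for the first `count` echoes of one note."""
    echoes = []
    for n in range(1, count + 1):
        # The n-th echo has been through the decay and the pitch shift n times.
        echoes.append((n * delay_ms, decay ** n, freq_hz * pitch_ratio ** n))
    return echoes

# A 220 Hz note, 250 ms delay, decay 0.5, shifted up by a ratio of 1.5 on each pass.
for echo in echo_series(220.0, 250, 0.5, 1.5, 3):
    print(echo)
# (250, 0.5, 330.0)
# (500, 0.25, 495.0)
# (750, 0.125, 742.5)
```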

My favorite intervals to play with were the m3, M3, and M2. Setting the decay scaling value high made the later, higher-pitched echoes more audible. With the delay L and R, setting them to be slightly different was a fun way to hear phasing. When phasing, I noticed that noise levels grew while my voice seemed to interfere with itself, which makes sense to me: noise is audible at any time, so it’s understandable that it would compound with time.

Here is the subpatcher:

Here is an audio demo + the patch for anyone interested.

Assignment 1 – Famous Dex likes liquids

For my first assignment, I decided to expand on “I am sitting in a room.” I wanted to compare how a feedback loop would be impacted by placing various drinks between the speaker and mic.

My setup was as follows. I had a cardboard box. In opposite corners, I placed a diaphragm speaker and a simple cardioid mic with a foam filter. I would create my wall of various drinks between the speaker and mic, and close the box. For the input sound, I used a clip of a trap artist named Famous Dex recording ad-libs for a song of his. Why? Cuz it’s funny.

For the liquids, I chose to use Spindrift and almond milk. Why? Cuz I like those drinks. The containers and the way they were set up were also different, which I expected to impact the output.

Here were the results of each after 10 iterations.

SPINDRIFT

Audio:

Spectral analysis:

ALMOND MILK

Audio:

Spectral analysis:

My breakdown

With the Spindrift, you can see many spikes in the frequency analysis. The audio is also somewhat intelligible, and you can hear the original sound to some degree. I think this is because of how I set up the cans of Spindrift: there were three different media through which sound could pass (aluminum, the liquid, and air, since there were gaps between the cans). Because of this, many different frequencies were able to find ways through the barrier and stay inside the system. In fact, I had to turn down the amplitude often to avoid clipping, because the system’s amplitude was not stable.

With the almond milk, there were fewer ways for the sound to stay in the system. Because of this, there is a spike at a particular frequency (about A#7), which, I guess, is the almond milk frequency. You can hear this in the sound as a cool whistling tone. In contrast with the Spindrift, I actually had to turn up the amplitude of the sound, to prevent it from becoming too quiet. I think that may be the reason you can hear some glitches in the sound. Also, since most frequencies are quickly damped by the system, Dexter’s voice is practically unrecognizable.

I found this to be a super cool project; one variable I did not consider was that the amplitude would be unstable. In retrospect, this totally makes sense mathematically: unless the amplitude is exactly unchanged in each iteration, it will grow or decay exponentially. I expect to use these principles in some of my music; feedback systems in live performance, in particular, seem like an area of interest for me.
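A quick Python sketch of that math, assuming the simplest possible model where every pass through the loop multiplies the amplitude by the same gain. The gain values are made up, and the real box was certainly messier and frequency-dependent.

```python
def amplitude_after(gain_per_pass, passes, start=1.0):
    """Amplitude after repeatedly multiplying by the same gain on every pass."""
    return start * gain_per_pass ** passes

# Ten iterations, like the recordings above, with made-up gains just above and below 1.
print(round(amplitude_after(1.1, 10), 3))  # 2.594
print(round(amplitude_after(0.9, 10), 3))  # 0.349
```

Even a 10% gain or loss per pass moves the level by a factor of more than 2.5 over ten iterations, which matches having to keep adjusting the amplitude by hand.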