I built a system that uses two Kinects to track the movements of two dancers, a digital synthesizer that generates sound based solely on the skeleton data, and particle-pattern visuals that change based on both the skeleton data and the sound itself.
For the Kinect part, I use the two users' head heights, the distance between their hands, and their left-right body positions. To keep the performance stable, if any of the data from one of the Kinects stops changing, which indicates that the person might have moved out of the Kinect's range, I reset the values of the sliders that send MIDI data to the synthesizer, so that the filters are not accidentally left at a very low setting.
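The reset logic above lives in the Max patch; as a rough illustration only, here is a minimal Python sketch of the same idea. The class name, the frame threshold, and the default value are all hypothetical and not taken from the patch.

```python
class StaleGuard:
    """Fall back to a default when an input value has stopped changing.

    All names and thresholds here are illustrative assumptions,
    not values from the original patch.
    """

    def __init__(self, default, max_unchanged=30, epsilon=1e-4):
        self.default = default              # slider value to reset to
        self.max_unchanged = max_unchanged  # frames tolerated without change
        self.epsilon = epsilon              # smallest difference counted as movement
        self.last = None
        self.unchanged = 0

    def update(self, value):
        # Count consecutive frames in which the value did not move.
        if self.last is not None and abs(value - self.last) < self.epsilon:
            self.unchanged += 1
        else:
            self.unchanged = 0
        self.last = value
        # Once the input looks frozen, report the default instead.
        if self.unchanged >= self.max_unchanged:
            return self.default
        return value
```

One guard per tracked value (head height, hand distance, left-right position) would let each slider reset independently when its dancer leaves the frame.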
For the synthesizer part, I split the sound into two parts: one is manipulated by the filters and one is not, which decreases the chance that the sound is completely turned off during the performance. The synthesizer has 13 presets that people can choose from as a starting point.
In the particle-pattern visuals, the pattern is distorted by the sound, and the size of the pattern is controlled by one of the dancers. Also, depending on where the two dancers are, the particles move left or right with them.
For this project, I continued my work on the first project and added a tap for BPM and pose recognition using machine learning and the Leap Motion controller.
I kept the same overall layout, with a video feed being split into separate RGB color planes that move against each other, but instead of a single looping video I created a playlist of videos that can be switched by making a fist. I also altered the playback speed of the video using the position of the right palm over the sensor.
Instead of using the problematic beat detection object from the first version, I built a simple tap for BPM using a timer and some zl functions.
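The tap for BPM itself is built from Max objects; the underlying arithmetic can be sketched in Python as follows. This is a minimal illustration that assumes tap timestamps in seconds and a hypothetical `max_gap` cutoff for ignoring long pauses; it is not a transcription of the patch.

```python
def bpm_from_taps(tap_times, max_gap=2.0):
    """Estimate BPM from a list of tap timestamps in seconds.

    max_gap is an assumed cutoff: intervals longer than this are
    treated as a pause between tapping sessions and ignored.
    """
    # Time between each pair of consecutive taps.
    intervals = [b - a for a, b in zip(tap_times, tap_times[1:])]
    intervals = [i for i in intervals if 0 < i <= max_gap]
    if not intervals:
        return None  # not enough taps to estimate a tempo
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval
```

Averaging several intervals rather than using only the last one makes the estimate less sensitive to a single early or late tap.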
If I were to continue this further, I would look into more interesting parameters to tweak and find ways to add more visual diversity.
For this project, I wanted to create a template for use in a performance setting for an upcoming project I am developing which combines electronics and live vocals.
In this case, the patch acts as a template that loads a set of audio files into a polybuffer~ and generates an 8-channel ambisonic audio signal from the imported files. In addition, a series of parameters has been added that allows the output of the patch to be customized both beforehand and live (using a Leap Motion controller).
These parameters include the volume for each of the 8 channels, a biquad filter, an svf~ filter, and the positioning of sound sources within three-dimensional space (using both generative and manually controlled movement).
The primary benefit of this template is that it auto-generates a multi-channel audio playback object and automatically connects it to the objects from the hoa object library, so that the primary focus of any project using this template is on the customization of parameters rather than on building an ambisonic patch from the ground up. In its current form, the patch can generate a sound installation for instant playback using only a handful of audio files (within a particular set of bounds) and various parameters of the sound as it is played live.
Given more time, I hope to further revise this patch so that it is more flexible and allows more complex ambisonic installations to be automatically generated (such as up to the 64 channels currently supported by the Higher Order Ambisonics library).
Patch Available Below (Requires Higher Order Ambisonics Library and Leap Motion Object by Jules Françoise):
For the final project I decided to further explore the connection between motion and sound. I incorporated data from the Myo armband into a music synthesizer that used several techniques I have learned from this class.
The synthesizer is composed of two main parts: the motion data reading section and the music control section. I used an online myo-osc communication application (https://github.com/samyk/myo-osc) and UDP messaging to read the armband data. I am able to obtain normalized quaternion values as well as several gesture readings. These data laid a solid foundation for a stable translation from motion to sound.
I selected pitch, playback speed, timbre and reverberation as the manipulation parameters. I downloaded music as separate instrument stems so that I could adjust the parameters of individual tracks without interfering with the overall flow of the music. After many trials, I arrived at the following mappings:
- The up/down motion of the arm changes the pitch of the timpani.
- The left/right motion of the arm changes the playback speed of both the timpani and percussion parts of the music.
- The fist/rest gesture switches between the piano-based and bass-based core melody.
- The rotation of the arm changes the reverberation delay time of the piano melody.
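To relate the quaternion readings to the arm motions listed above, one common approach is to convert the unit quaternion to Euler angles. The sketch below uses the standard conversion formulas; treating pitch as up/down, yaw as left/right, and roll as arm rotation is an assumption about the armband's axis convention, not something confirmed by the patch.

```python
import math


def quaternion_to_euler(w, x, y, z):
    """Convert a unit quaternion to (roll, pitch, yaw) in radians.

    Assumed convention: roll about x, pitch about y, yaw about z.
    The armband's actual axes may differ.
    """
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Clamp so rounding error cannot push asin outside its domain.
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw
```

Each angle could then be rescaled to the range of the parameter it drives, for example pitch angle to a transposition amount.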
I recorded a section of the generated music, which is shown below:
The code for the project is as follows:
For my second project, I decided to continue using the Leap Motion device, but for visual purposes. I decided to create an object generator. The object’s position, size and color are all manipulated through gestures and positions of the hands. I was also able to incorporate topics we learned in class, such as machine learning and OpenGL, in my project.
Here is a short demonstration:
Like project 1, I created my main patch from scratch:
I modified the visual subpatch of the Leap Motion help file:
This is a modified version of the machine learning sam starter and training patch:
For the final project, I decided to further explore self-generating music in Max MSP, a step above what I created for Project 1. The patch has 8 designed sounds: 5 are main sounds, while 3 act as special effects. The patch works almost like a sequencer, with inputs for tempo and ‘beats per bar’. Each bar, a new sound is triggered completely at random. However, both the frequency and the volume of the sound come from analyzing the user’s input through the piano keyboard and other settings. The user can also change the sound design of 5 of the 8 sounds through graphs. The piano keyboard also acts as a slider, as both the frequency and the volume are set based on where the user clicks it. Other sliders for the 7 different sounds indicate the possible octave range. From then on, the 5 main sounds are selected randomly. The 3 FX sounds are also played by chance, but that chance is determined within the subpatch. The sounds are processed through reverb and delay effects. Furthermore, a stuttering effect is available, which splits each bar into 16 distinct ‘steps’ (inspired by Autechre).
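As a small illustration of how a keyboard click can become a frequency and a volume, the sketch below assumes the click yields a MIDI note number and a velocity from 0 to 127, and converts them with the standard equal-temperament formula. The function names are hypothetical and the patch's actual scaling may differ.

```python
def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def velocity_to_gain(velocity):
    """Map a MIDI velocity (0-127) to a linear gain in [0, 1].

    A linear mapping is an assumption made for illustration only.
    """
    return max(0, min(127, velocity)) / 127.0
```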
I originally wanted to do a generative music project based on probability and input from the mic. But after researching online, and especially after finding out about the music group Autechre, I changed my mind. I mainly got my inspiration from their patches. I learned the sound designs from the YouTube channel DeliciousMaxTutorials and from http://sounddesignwithmax.blogspot.com/. The reverb subpatch is taken from https://cycling74.com/forums/reverb-in-max-msp.
Here is a recording sample of the piece being played:
Code as follows:
In this project, I explored the concept of combining semi-generated music with noise. The initial signal in the piece consists of MIDI chords and head position. The patch combines these two signals, which trigger other noises that contribute to the final piece.
Initially, I planned to work with finished MIDI sequences but ended up using only chords, since the goal of the project is to generate both noise and MIDI notes. These generated MIDI notes vary in velocity, which is controlled by the vertical head angle. The facial landmark positions are generated by a pre-processing Python script using dlib; an easy introduction is here (https://www.pyimagesearch.com/2017/04/03/facial-landmarks-dlib-opencv-python/). The output looks like the figure below. The arrays of x, y coordinates are sent via UDP from Python using the OSC library (https://github.com/ptone/pyosc), which makes the process easier (the native socket library in Python does not work very well with Max).
The first part of the patch handles parameter processing and calculations. After the facial landmark identification script is loaded, you can press A in Max to set the reference coordinates on which further processing and calculation will be based. Part of the patch is presented in the figure below. For all parameters, the average over 4 incoming values is used rather than the original value, for more stable behavior. Note that the values and distances calculated are not what you would get by measuring your face with a ruler. The calculations are kept as simple as possible and no normalization is implemented. The output represents relative scales rather than exact values.
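The averaging step is done inside the patch; for illustration, the same smoothing can be sketched in Python. This sketch assumes a rolling window over the 4 most recent values, which is one possible reading of the averaging described above, and the class name is hypothetical.

```python
from collections import deque


class MovingAverage:
    """Average of the most recent `size` values (rolling window)."""

    def __init__(self, size=4):
        # A bounded deque drops the oldest value automatically.
        self.values = deque(maxlen=size)

    def update(self, value):
        self.values.append(value)
        return sum(self.values) / len(self.values)
```

Until the window is full, the average is taken over however many values have arrived, so the output is defined from the first frame.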
The second part is note processing. In the current version, the patch handles 3-note chords. I will keep working on the patch so that it can be more flexible in dealing with chords.
You can control the sound effects using the following parameters:
- Landmark 28, horizontal position: shifts the pitch of the background noise using granular synthesis and triggers a crow-noise effect. I implemented a manual fade-in and fade-out for the sound sample for better ambiance:
- Landmark 28 + 1, horizontal angle: triggers gunfire background noise with granular synthesis; greater variance corresponds to a greater random pitch rate. Since calculating variance can be a little difficult, I used another value of a similar nature (the distance between the first and the last value in the bucket) to stand in for it. (My granular synthesis implementation in this project is relatively naive, since the goal is to produce sound effects.)
- Landmark 28 + 20, Eyebrow-raise, triggers an argument sound effect.
- Landmark 48 + 44, Eye-close, stops the generated notes from playing.
- Landmark 9 + 58, Vertical head angle, changes note velocity. Raising your head produces louder notes, assuming that you start by looking down at the screen.
- Landmark 49 + 55, Smile/Smirk, triggers a wicked laughter. The laughter is transformed into multiple copies at different pitches using phasor so that you can hear the smile from multiple people (witches and wizards?). A custom pan is implemented so that you can pan the laughter by moving your head sideways. In the full demo, I forgot to move my head in the direction that triggers the pitch-shifted voices, so here is a sample of this sound effect alone.
- There is also a background thunder effect, triggered by low-pitch notes generated in the note-processing patch.
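The variance substitute described for the horizontal angle (the distance between the first and last values in the bucket) can be sketched as follows. The function name is hypothetical, and the bucket is assumed to be a plain list of recent angle values.

```python
def spread_proxy(bucket):
    """Cheap stand-in for variance: distance between first and last values.

    `bucket` is assumed to be a list of recent values, oldest first.
    """
    if len(bucket) < 2:
        return 0.0
    return abs(bucket[-1] - bucket[0])
```

One trade-off of this shortcut: a head that swings out and returns to its starting angle within one bucket gives a proxy near zero, even though the true variance over that bucket would be large.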
You can also make changes manually while performing if you are not happy with the sound.
Due to my computer’s inability to process all the data while recording both audio and video, only the audio file is included.
For my final project, I created what I’m calling a Small Production Environment, or SPE. Yes, it’s a bad name.
The SPE consists of three parts. The first is the subtractive synth from my last project, with some quality-of-life and functionality improvements. This serves as the lead of the SPE.
The second is a series of four probabilistic sequencers. This gives the SPE the ability to play four separate samples with probabilities specified for each sixteenth note in a four-beat measure. This serves as the rhythm of the SPE.
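The core idea of a probabilistic sequencer can be sketched in a few lines of Python. This is an illustration only, assuming one probability per sixteenth-note step; the function name and the example probabilities are hypothetical and not taken from the patch.

```python
import random


def generate_pattern(probabilities, rng=None):
    """Decide, step by step, whether a sample fires.

    `probabilities` holds one value in [0, 1] per sixteenth-note step.
    Returns a list of booleans, True where the sample should play.
    """
    rng = rng or random.Random()
    # rng.random() is uniform in [0, 1), so a step with probability p
    # fires with probability p (always for 1.0, never for 0.0).
    return [rng.random() < p for p in probabilities]


# Hypothetical kick pattern: certain on each beat, occasional ghost hits.
kick = [1.0, 0.0, 0.1, 0.0] * 4
pattern = generate_pattern(kick)
```

Calling `generate_pattern` once per measure yields a rhythm that keeps its overall shape while varying from bar to bar.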
Finally, the third part is an automated bass line. This will play a sample at a regular (user-defined) interval. It also detects the key being played in by the lead and shifts the sample accordingly to match.
It also contains equalization equipment for the bass & drum (jointly), as well as for the lead. In addition, many controls are alterable via MIDI keyboard knobs. A demonstration of the SPE is below.
The code for the main section of the patch can be found here. Its pfft~ subpatch is here.
The embedded sequencer can be found here.
The embedded synth can be found here. Its poly~ subpatch is here.
Thanks to V.J. Manzo for the Modal Analysis library, An0va for the bass guitar samples, Roland for making the 808 (and whoever first extracted the samples I downloaded), and Jesse for his help with the probabilistic sequencers.
As a pretty heavy music listener, I have always wondered if it would be possible to mix a few songs together and create a mashup of my own. After eagerly surfing the web for an app that would let me do just that, I quickly realized that a mouse and keyboard are not the proper interface for working with music. This is exactly why DJs use expensive instruments with knobs and dials, so that they can quickly achieve the effect they are going for. For my final project, I made an Air-DJ application in Max so that you can convolve your music in a variety of ways using your hands, without ever touching the mouse or keyboard. Using a Leap Motion sensor, I mapped various gestures to different aspects of a song.
After selecting a song to play, you can use your left hand to add beats. You can add 3 different types of beats by either moving your hand forward, backward, or to the left. Lifting your hand up and down will change the volume/gain of the beat.
Your right hand controls the main track. Again, lifting it up and down will control the volume/gain of the song. With a pinch of your fingers, you can decrease the cut-off frequency of a low pass filter. I also implemented a phase multiplier when you move your right hand towards and away from the screen (on the z-axis). Finally, moving your right hand sideways will increase an incorporated delay time.
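Mappings like hand height to gain come down to rescaling a sensor reading into a parameter range. Below is a minimal Python sketch of a clamped linear mapping; the function name and the input range used in the example are hypothetical, not the values used in the patch.

```python
def scale_clamped(value, in_low, in_high, out_low, out_high):
    """Linearly map value from [in_low, in_high] to [out_low, out_high].

    Inputs outside the input range are clamped to the nearest end.
    """
    if in_high == in_low:
        raise ValueError("input range must not be empty")
    t = (value - in_low) / (in_high - in_low)
    t = max(0.0, min(1.0, t))  # keep the result inside the output range
    return out_low + t * (out_high - out_low)


# Hypothetical example: palm height between 100 and 400 mapped to gain 0..1.
gain = scale_clamped(250.0, 100.0, 400.0, 0.0, 1.0)
```

Clamping matters in performance: a hand that drifts above or below the tracked range holds the parameter at its limit instead of overshooting.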
Here are a few screenshots of the patch:
And here is the video of the whole thing!
All the important files are below:
Google Drive link containing all files: https://drive.google.com/open?id=1FmMiDLyB4gIbOK6bx0KgIbESSKyNBcA1
Github Gist: https://gist.github.com/anonymous/4570d6ae97e13fe29337a57a97fb81e5
This past semester, I have been conducting a research project under Prof. Susan Finger to install projection systems around the IDeATe Hunt Basement to create a platform for students in the animation, game design, and intelligent environment minors to publicly display their work. Therefore, my projects for Twisted Signals revolved around creating demos specifically for the interactive projection system using Max. My first project, a virtual ball pit, was a good exercise in learning how to use the Kinect but was not really a conceptually heavy demo. Therefore, for my second project, I wanted to make a system that would actually teach the users something.
The concept that I settled on was to make a system that allowed users to interact with the Hunt Swiss Poster collection, an extensive set of extraordinary Swiss design posters that are housed in the Hunt Library and that very few students know exist. Originally, I had planned on using the Kinect to allow users to “draw something” using a colored depth map that would then get processed to display the closest Swiss design poster. However, in my early prototyping, it became apparent that the interaction was not as obvious as it could be, which was leading to a weaker installation. Moreover, as I have had to borrow all of my equipment from IDeATe for every project, I ran into the issue that every Kinect and my specific computer were checked out for the time span that I needed to work on this project. Therefore, I had to pivot.
While planning the projection installation, we were hit with the news that the Kinect was no longer going to be produced. As I was forced to work without a Kinect anyway, I decided to work on creating an interesting interaction with just an RGB camera, which thankfully will probably always be produced. Additionally, I realized that, although it was a far more difficult path, the best possible way for users to interact with these Swiss posters was to be a literal part of them, which would mean every single poster would have to be designed uniquely. However, this direction would also open an avenue for other students to participate in this project if they are short on ideas for their own projects.
Therefore, for my Project 2, I created two different Swiss poster exhibits as well as a very simple UI that an IDeATe staff member would use when turning on the projection system each morning. Each exhibit has an interaction display that mimics a Swiss poster design that is placed next to the original Swiss poster, some information about the poster, and some information about the project.
Gist of Code: