ejhicks – Project 2: Color Chords
https://courses.ideate.cmu.edu/18-090/f2019/2019/12/09/ejhicks-project-2-color-chords/
Mon, 09 Dec 2019

For my second project, I was inspired by the idea of using color as a means of producing music (or at least the illusion of music). Videos like Colored Virtual Piano (https://www.youtube.com/watch?v=8FcexZUTITc) encouraged me to explore the possibility of a program that could analyze the colors in front of it and create a soundscape based on what it interprets.

Originally, my plan was to create a functioning keyboard, where each note was associated with a single point on a grid and the placement of a color would trigger that note. However, while working on the project I discovered that Max tends to read all colors in front of it rather than isolating individual points in a live feed, which inspired me to alter the program to allow for a shifting soundscape, where moving colors produce a sequence of notes and chords based on horizontal position and variety of color.

Fortunately, I discovered a patcher designed by Matt Westerwick that can associate a table's output with a distinct chord or piano note. With this asset, I could design the program to detect one of three colors (red, green, or blue), assign its position in the video to a set of coordinates, map those coordinates to a unique position in a table, and then translate that tabular position into a shifting series of notes based on horizontal position.

The project relies heavily on the jit.findbounds object to detect (and then pack into a table) the coordinates of each color relative to the left, right, top, and bottom edges of the video input. When experimentation showed that stagnant colors quickly created unpleasant loops of sound, the change object was employed to filter out the repeated input those stagnant colors produced.
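
As a rough illustration of the pipeline (and not the Max patch itself), a minimal Python sketch of the same idea follows: find the bounding box of a dominant color channel, much as jit.findbounds does; map the box's horizontal center to a note; and suppress repeats, much as the change object does. The threshold, the C-major scale, and the frame format are all assumptions made for the sketch.

import numpy as np

def find_color_bounds(frame, channel, threshold=200):
    """Return (left, right, top, bottom) of pixels where `channel` is
    strong, or None if the color is absent. `frame` is an HxWx3 array."""
    mask = frame[:, :, channel] > threshold
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    return cols.min(), cols.max(), rows.min(), rows.max()

def position_to_note(bounds, frame_width, scale=(60, 62, 64, 65, 67, 69, 71, 72)):
    """Map the blob's horizontal center to one of eight MIDI notes (C major)."""
    left, right, _, _ = bounds
    center = (left + right) / 2 / frame_width            # 0.0 .. 1.0
    return scale[min(int(center * len(scale)), len(scale) - 1)]

last_note = None   # plays the role of Max's `change` object
def on_frame(frame):
    """Per video frame: emit a note only when the tracked color moves."""
    global last_note
    bounds = find_color_bounds(frame, channel=0)          # 0 = red plane
    if bounds is None:
        return None
    note = position_to_note(bounds, frame.shape[1])
    if note == last_note:                                 # stagnant color: no retrigger
        return None
    last_note = note
    return note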

Upon final review of the project, I wish I had found a way to more clearly differentiate the sound output between red and blue; I wanted both to be piano-based, but this sometimes left the two colors' outputs without distinct qualities when compared. Even so, the soundscape is an interesting canvas to explore with this project, and I am quite satisfied with the final result.

A screenshot of the project's primary code is below, and the link to the Google Drive is as follows: https://drive.google.com/drive/u/1/folders/1L8qZdDVdyAyvA-vile3aCzIXV5FNDoZ5.

ejhicks – Project 1: Distortion Tracker
https://courses.ideate.cmu.edu/18-090/f2019/2019/11/06/ejhicks-project-1-distortion-tracker/
Wed, 06 Nov 2019

For this project, my initial idea was based around the concept of audio recognition. Many audio tracks are considered iconic and can be identified even when the source is heavily distorted or edited. For example, "We Will Rock You" boasts its unmistakable stomp-stomp-clap sequence throughout the entire song; if an individual were to simply stomp the sidewalk, even with the rest of the track removed and the sequence off-beat, the tune would still be recognizable.

As such, my project focused on finding the breaking point for iconic tunes through distortion, creating a tool where a user can input a song to be distorted and have that "breaking point" numerically identified. The distortion method needed to be consistent, so the tool first convolves the inserted audio track with a vinyl recording (a subtle change, but enough to begin the distortion process) and then lets the user adjust a degradation slider that alters the track's sampling-rate ratio. The level of degradation is actively updated and recorded, so that when the listener no longer recognizes the inserted track, the level of sampling-rate distortion at that moment can be noted and tested against other tracks.
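
To make the signal chain concrete, here is a minimal Python sketch of the two-stage distortion, under stated assumptions: scipy for convolution and resampling, audio as mono float arrays, and a hypothetical vinyl_ir array standing in for the vinyl recording. The actual tool is a Max patch; this only mirrors its processing order.

import numpy as np
from scipy.signal import fftconvolve, resample

def degrade(track, vinyl_ir, ratio):
    """Convolve `track` with a vinyl impulse, then crush its effective
    sampling rate by `ratio` (e.g. 0.005 keeps 0.5% of the samples)."""
    # Stage 1: subtle vinyl convolution to begin the distortion process.
    wet = fftconvolve(track, vinyl_ir, mode="full")
    wet /= np.abs(wet).max() + 1e-12          # normalize to avoid clipping
    # Stage 2: resample down and back up, discarding detail along the way.
    n = max(int(len(wet) * ratio), 1)
    return resample(resample(wet, n), len(wet))

Sweeping ratio downward from 1.0 until a listener stops recognizing the result reproduces the "breaking point" measurement described above.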

To provide an example, three iconic tunes were tested in the tool: the aforementioned "We Will Rock You", the Pink Panther theme, and "Country Roads". Both the original and distorted versions can be found below. Of the three, "We Will Rock You" required the heaviest distortion, with a sampling-rate ratio of only 0.005 before the track was thoroughly disguised. The Pink Panther theme required a ratio of 0.011, and "Country Roads" a comparatively generous 0.016 before it was thoroughly distorted.

Although this measurement is intrinsically subjective (testing showed recognition differed between listeners), the tool helps quantify how recognizable a track's underlying audio cues are and how much distortion is required to thoroughly disguise them.

https://drive.google.com/drive/u/1/folders/1sc-1M4PufMW2tQM1ZzN6QSgnHDCLiqw7

ejhicks – Assignment 4: Cacophony of Gold
https://courses.ideate.cmu.edu/18-090/f2019/2019/10/16/ejhicks-assignment-4-cacophony-of-gold/
Wed, 16 Oct 2019

Ennio Morricone's The Ecstasy of Gold is an extremely recognizable piece; it is perhaps the most recognizable spaghetti western theme (with the admittedly notable exception of The Good, The Bad and The Ugly theme). Anyone who has listened to the track will notice that its sound and complexity are fairly sparse and simple at the beginning, but ramp up to a great soaring orchestra within minutes. Using pfft~ processing, I attempted to create a visualization of that acceleration: a row of floating cubes moving back and forth in a straight line, pulsating and adjusting in size to match the music. As expected, at the beginning of the tune the cubes pulse noticeably but gently; as the track continues, they grow in size and pulse so rapidly that it becomes impossible to identify them as cubes, much less count them or determine their direction. Visualizing the dynamic audio provides a greater understanding of, and appreciation for, the controlled swell of the music, since such an uptick in complexity could very easily have become total cacophony.
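
As a sketch of the underlying analysis (the actual patch uses pfft~ in Max), the Python snippet below shows one plausible way to turn an audio frame into per-cube scale factors: split the spectrum into bands and let each band's energy drive a cube's size. The band count, window, and gain are assumptions.

import numpy as np

def frame_to_cube_scales(samples, n_cubes=8, base=1.0, gain=4.0):
    """Split one audio frame's spectrum into `n_cubes` bands and return
    a scale factor per cube: a louder band means a bigger cube."""
    windowed = samples * np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(windowed))
    energy = np.array([band.mean() for band in np.array_split(spectrum, n_cubes)])
    energy /= energy.max() + 1e-12            # normalize to 0..1
    return base + gain * energy               # one scale per cube

# e.g. pull a 1024-sample frame from the track on every video frame:
# scales = frame_to_cube_scales(track[i : i + 1024])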

https://drive.google.com/drive/folders/1l9DxgPIlTWCzOZr4SKD9mZ0BFgsmcb1X

ejhicks – Assignment 3: Convolving the Moon
https://courses.ideate.cmu.edu/18-090/f2019/2019/10/02/ejhicks-assignment-3-convolving-the-moon/
Wed, 02 Oct 2019

For Assignment 3, I chose Frank Sinatra's "Fly Me to the Moon" as the original audio to be convolved; the song is easily recognizable to most listeners and has some nice audio variety, which made it a tempting choice to distort beyond all recognition.

The four recordings chosen to pair with Sinatra were relatively diverse. The first was of a balloon popping in an empty bathroom, producing a soft echo that would hopefully cause some light distortion. This recording and its effect on Sinatra, like all recordings referenced here, can be accessed at the link below. The second personally made recording was of a plastic bottle falling down a stairway; this was chosen for variety, since the two balloon recordings I initially made did not produce particularly distinct outcomes, while the bottle clip allowed for much more variety in the sound output.

The final two sound effects were chosen both for variety and for personal amusement. The first is Ben Kenobi from A New Hope exclaiming, "That's no moon. It's a space station." Using speech allowed for a heavily distorted output, and it was amusingly fitting for the subject matter. To stay on theme, the final recording was a sound clip from the same movie: the explosion heard when the Death Star destroys the planet Alderaan. The clips for both of these sequences and their effects on the original recording can be found below.
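
For reference, the general convolution technique (though not the patch itself) can be sketched in a few lines of Python; the filenames are hypothetical, and scipy and soundfile are assumed available.

import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

def mono(x):
    """Collapse stereo to mono so the convolution stays one-dimensional."""
    return x.mean(axis=1) if x.ndim > 1 else x

# Hypothetical filenames standing in for the recordings described above.
dry, rate = sf.read("fly_me_to_the_moon.wav")
impulse, _ = sf.read("balloon_pop_bathroom.wav")

# Every sample of the song is smeared by the impulse's decay tail.
wet = fftconvolve(mono(dry), mono(impulse), mode="full")
wet /= np.abs(wet).max()                      # normalize to avoid clipping
sf.write("convolved.wav", wet, rate)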

https://drive.google.com/drive/u/1/folders/1bEygnqG8CQvQY2v1ZjbSXH5oyw0zpu6H

ejhicks – Assignment 2: Time-Shifting Film Processing
https://courses.ideate.cmu.edu/18-090/f2019/2019/09/18/ejhicks-assignment-2-time-shifting-film-processing/
Wed, 18 Sep 2019

The goal of this assignment was to create a signal processing system that employs time shifting in some way. While tooling around with ideas for the project, I stumbled across a fairly interesting dynamic between various means of video processing: time shifting can produce major differences depending on what in the video you are shifting, and this assignment compares two such approaches. Both videos simultaneously run Revenge of the Sith for the sake of comparison.

The first video updates only one frame per second. This gives the film a slide-show effect, and it is usually considered a frustrating and improper way to watch any film. The second video employs the jit.slide object; while this film does run at roughly 30 FPS, the slide output updates at only a fraction of the speed at which inputs arrive, and as such only completely refreshes a frame approximately once per second. The result is a blurry trail of motion that persists for a second before completely clearing (or, in this case, being overwritten by the next trail of motion). While a much smoother viewing experience, the individual frames are much harder to decipher than the clear individual shots of the previous example. The former method of film processing is significantly time-delayed but clear; the latter is smooth, but at the cost of significant time-delayed motion trailing. For sequences with scrolling text and quick movement (as seen in the provided clip), this raises the question: which is preferable in video processing, clarity or smoothness?
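
To make the two methods concrete, here is a small Python sketch of both, assuming frames arrive as numpy uint8 arrays at 30 FPS. The Slide class implements the same recurrence jit.slide uses, y(n) = y(n-1) + (x(n) - y(n-1)) / slide; FrameHold is the one-frame-per-second method. The patcher text for the actual Max version is included below.

import numpy as np

class FrameHold:
    """Method 1: show a new frame only once per second (slide-show effect)."""
    def __init__(self, fps=30):
        self.fps, self.count, self.held = fps, 0, None
    def process(self, frame):
        if self.count % self.fps == 0:
            self.held = frame
        self.count += 1
        return self.held

class Slide:
    """Method 2: jit.slide-style smoothing. Each output creeps toward the
    input by 1/slide per frame, leaving a ~1 s trail when slide ~= fps."""
    def __init__(self, slide=30):
        self.slide, self.state = slide, None
    def process(self, frame):
        frame = frame.astype(np.float32)
        if self.state is None:
            self.state = frame
        else:
            self.state += (frame - self.state) / self.slide
        return self.state.astype(np.uint8)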


----------begin_max5_patcher----------
537.3oc2VssiaBCD8YhT9Gr7yzU3KzP6S4+npJxAbScDXSMlDVsp+60Wfznt
IrfJoqTegQd7XelyLdlgWVuJBtW0waffOC9BHJ5EqlHuNmlnAEQvJVWdIqwa
HTxOq1eDF2umg2Y75OJLOcVHKTmAMk1Oa+VohYDxC.zEiksUpVSI23uJ7f5f
Nyy07fq.2yjGfw.HD70AaDEdTrP+gMWtuZlI+6VH1o44lvYIT5SIw.JB4DnT
rSfse+8UY8BgbvIPdk+b8JmzJh+qCD+nhazJ.JIIArkkaDm32MDfFODbS5iF
i9TZfvIId5SHuA8wKN8cuCpTmD7Ym0smbWEynEc2O2SlPtmP7hTx+7Tui6Mk
hBNXqWrqsFPRFVXKMj1kOj3R1XwED1WLP1DdafnuW0DDxCohfNF4wjr.c8OG
PnO8eVEQ5nYdRHkmQdvUDU7lF1A9qIulyJlcp91LEOAlhngm5uO885m+surU
qe9pIf3EdBHZzY.HTpeDHlFFAlMyrdvLXoP9p+Mvilai+HB0nZ04C9de+HvU
3UvarABavPIuxHGMbVc+bwjwKcJ3ksXvQmBboKFbjo.2lECN7LX2kWv5Bt9M
5grrN.4NNPxB3.nY3.2t5gUWehqa5OQ.IaijiJuOlEGVKjg0Ig0Z9IwvQ5Mg
osk2Fascq16fvtO1OtCVorLV1JFh5Nt5f12qRxrsjqYAV46osdk0feg4w9Z1
-----------end_max5_patcher-----------
ejhicks – Assignment 1: Translation Loop
https://courses.ideate.cmu.edu/18-090/f2019/2019/09/04/ejhicks-assignment-1-translation-loop/
Wed, 04 Sep 2019

The "found system" chosen for this assignment was the online translation tool Google Translate. A particular English passage (specifically, the transcript of "I Am Sitting in a Room") was entered into the system and translated through a series of five languages: Hebrew, Russian, Japanese, Turkish, and Latin, in that order. The passage was then translated back to English, and that output was fed back into the tool and translated again through the same series of languages in the same order. In total, the phrase was processed through ten loops and re-translated five times in each loop: a grand total of fifty translations. The final English output is entirely distinct from the original transcript of "I Am Sitting in a Room." Curiously, in the final few translations the corruption process slowed significantly; the individual sentences had decreased in grammatical complexity and had thus become much more straightforward to translate between multiple languages, albeit totally separated from their original meaning. If this same loop were repeated another ten times, it is unlikely the same level of distortion would appear in the grammatically simple output.
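
The procedure itself is simple enough to sketch in Python; here translate is a hypothetical placeholder for whatever translation interface is available (the actual passes were made through the Google Translate web tool).

LANGS = ["he", "ru", "ja", "tr", "la"]   # Hebrew, Russian, Japanese, Turkish, Latin

def translate(text: str, source: str, target: str) -> str:
    """Hypothetical stand-in: send `text` to a translation service."""
    raise NotImplementedError("wire this up to a translation API of your choice")

def corruption_loop(text: str, loops: int = 10) -> str:
    for _ in range(loops):
        current = "en"
        for lang in LANGS:                        # through the five-language chain...
            text = translate(text, current, lang)
            current = lang
        text = translate(text, current, "en")     # ...and back to English
    return text
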
https://drive.google.com/drive/u/1/folders/12kLM9MQpoxf7BZq89a3ht23TO-2o7oEG
