Bodily Voices-Final Project

Bodily Voices

In the previous investigations via Eulerian Video Processing and ultrasound sensors, I saw the minute changes of my body–I saw how my pulse move and my heart beats with me completely unaware. It really felt as though the body is another creature separate from me.

So I wanted to further investigate the pulsing of the body and hear it, receiving information from it as though conversing with another entity.

Further Eulerian Works

Work using low ram computer, scale factor (similar to resolution to get better images) = 0.5 out of 1.

My Process:

Stabilize –> Magnify

Holding still is really hard! I used tracking stabilization to see if it makes a difference–for most videos, it did.

I then changed up the levels of all the parameters to see the differences they made. In the higher frequency ones you can see it moving really fast, which is still slightly confusing to me as my pulse normally doesn’t move that fast.

PARTS:

These are just all the parts I did, scroll through, though the ear is particularly interesting.

Parts that didn’t work well:

In these I felt that the image did not look great. I wanted to use a higher scale factor but:

My computer’s ram is far too low to support that. So I moved to the school’s computer.

New videos

These used the same videos but just better scale factor. It is interesting how as the scale factor moves up the exaggeratedness of the magnified movements goes down.

Myoware + hearing muscles

I tried so many wirings to make it work. It took me so long because I read that you should use an isolator for these sensors for them to work safely. I had the adafruit isolator and kept trying different setups with a pi so that the power goes through the isolator, but nothing worked.

So I ended up just not using an isolator. Fortunately (but also annoyingly), it worked fine :-(. I read through the sensor and max/msp reads a txt file to make these sounds according to the output.

Final Progress

I very wrongly used my final draft to write my final project text so this might look different from how I presented it.]=

I wanted to further investigate the pulsing of the body and hear it, receiving information from it as though conversing with another entity.

What didn’t work:

myoware

lytro

I felt that the image did not look great, wanted to use a higher scale factor but:

My computer’s ram is far too low to support that.

Further Eulerian Works

Work using low ram computer, scale factor (similar to resolution to get better images) = 0.5 out of 1.

My Process:

Stabilize –> Magnify

Holding still is really hard! I used tracking stabilization to see if it makes a difference–for most videos, it did.

These are just all the parts I did, scroll through, though the ear is particularly interesting.

Parts that didn’t work well:

Final Proposal

1. Just using the eulerian magnification to explore different small movements, coupled with tracking/stabilization. People also said that sound would be interesting (movements of electronics + sound captured by I forgot what that device is called)

2. Body watch? Somehow real time image but still small enough to wear.

3. I wonder if I can set up a larger scene, i.e., a silent gathering around the table with participants unmoving, and see how their blood pulses align by processing the video so that only a small frame of skin is magnified on each person. Still life-ish setting and lighting.

Person in Time — bodily units of time

Bodily units of time

This project went a little out of my control and is all over the place. It ended up consisting of 3 parts:

(1) the measure of blinks over 2 hours, (2) the measure of heartbeats via ultrasound, (3) eulerian magnification to show wrist pulses.

Though they use different tools and focus on different subjects, the main idea behind them is relatively uniform: discarding arbitrary / artificial units of time, to use the body as a natural clock, and explore our bodily portrayal and experience of time.

Scripts and video process files can be found in a google drive link at the end of this documentation.

Blinks!

Credit: Thank you Golan for the idea of creating the grid based on the duration between the blinks! So sorry I forgot to mention during the presentation.

This grid used ffmpeg. The blinks were identified and cropped using the python libraries scipy.spatial (https://docs.scipy.org/doc/scipy/tutorial/spatial.html) and imutils (https://pypi.org/project/imutils/).

Process:

I was able to get this very cool phone holder

Unfortunately, I ended up changing the setup because I had trouble sitting still for 2 hours just staring at my face in the phone. The phone’s position blocked me from seeing anything in front of me, and if I lower my eyes the blinks have a high rate of going undetected. So I switched to using the magic arm to set the phone right behind my computer’s camera, and watched videos for 2 hours on my computer. There still was a few lowered eyes and more shifting face positions.

This is far from finished. The cropping takes a long time, as of yesterday midnight, it has only gotten to “2518.6283333333336 to 2522.2250000000004″(around the 1200th blink). We have the timestamps of around 3500 blinks in total. The grid contains only 362 blinks because the cropping tool used failed to identify many of the eyes (choosing to crop my eyebrows, a pimple on my cheek for some reason and sometimes my nose). Requires further improvement.

Additionally, it would be great if I can dynamically crop the eyes (tracking the eye’s position so it always stays at the center of the cropped frame). However, the processing would take too much time.

Alternative capture method:

https://openprocessing.org/sketch/2440791

This is a p5.js script I made based on Golan’s Face Metrics code, it downloads the timestamps and the webcam recording once you exit the page.

Ultrasound heart

I thought this was one of the more interesting ones, you can see the valves moving .

Here is four views in a grid:

Single movement:

Video Player

Media error: Format(s) not supported or source(s) not found

Download File: https://courses.ideate.cmu.edu/60-461/f2024/wp-content/uploads/2024/11/clip_3.mp4?_=1

00:00

Use Up/Down Arrow keys to increase or decrease volume.

I was then able to extract 220 videos of heart movements from a 5 minute video by calculating the average intensity of each frame (how bright each frame is, 0=black, 255=white, 100-150=grey) and finding peaks in the intensity, therefore mimicking heartbeats. The above us a single movement of the heart.

Personally I think it looks less intriguing than the eye grid. Mainly due to them mostly having the same length and movement.

Pulse

I captured the pulse by processing videos using eulerian color magnification. The video below explains it very well

Rather than tracking individual pixels, it tracks the variations in fixed grid cells. Basic processes: YIQ color space –> Gaussian Pyramid Level of each frame –> Magnify the filtered Pyramid Levels –> Add magnified levels to original frames –> bring back to RGB.

Honestly: I only know that this is how the code is structured accordingly. I have no idea how the theory behind it works.

Setup

Initially I got videos like this:

Process recordings with script. Make grids as needed with ffmpeg.

I tested different alpha levels

Tried making a 20 minutes recording to get the heart rate. But I didn’t anticipate how hard it is to keep your arm still for 20 minutes. The intensity processing based on this was disappointing.

Pulse graph via intensity:

Still, I am so excited to have got this to work.

Google drive documentation: TBA

End notes:

Would love to play with the ultrasound more, its so unsettling seeing your (processed) insides in real time. Its amazing how tangible our insides actually are but we go about unaware of them most of the time. Suddenly they’re extremely prominent and heavy.
The eulerian magnification is amazing. I really like the 2×3 grid with all the pulses moving together. I wonder if I can set up a larger scene, i.e., a silent gathering around the table with participants unmoving, and see how their blood pulses align by processing the video so that only a small frame of skin is magnified on each person. Possible final project?

Person in Time proposal

Seconds, minutes, hours. Familiar units of time are arbitrarily defined. The experience of time is different.

In the project:

1. capture different movements of the body: breath, blinking. pacing, heartbeats as the units of time.

blinking: mounted small camera in front of head

walking: similar but maybe on legs

breath: rise and fall of a capture device while lying down

heartbeat: ultrasound

(Revealing Invisible Changes in the World)
(Pulse Room)

2. software component: finding timestamps of the actions and slicing + assembling accordingly. (would start blinking at the same time, a page of eyes blinking at the same time, but with different intervals)

3. Open to other ways of assembling

the time of different body parts synced.

Looking Outwards4

Quantified Self Portrait (One Year Performance). Michael Mandiberg. 2017.

Quantified Self Portrait (One Year Performance) is a frenetic stop motion animation composed of webcam photos and screenshots that software captured from the artist’s computer and smartphone every 15 minutes for an entire year; this is a technique for surveilling remote computer labor.

Quantified Self Portrait (Rhythms). Michael Mandiberg. 2017.

Quantified Self Portrait (Rhythms) sonifies a year of the artist’s heart rate data alongside the sound of email alerts. Mandiberg uses himself as a proxy to hold a mirror to a pathologically overworked and increasingly quantified society, revealing a personal political economy of data. The piece plays for one full year, from January 1, 2017 to January 1, 2018, with each moment representing the data of the exact date and time from the previous year.

Excellences and Perfections. Amalia Ulman. 2014.

https://webarchives.rhizome.org/excellences-and-perfections/

4 month performance on instagram, fooled her followers into believing in her character and following her journey from ‘cute girl’ to ‘life goddess’. Bringing fiction to a platform that has been designed for supposedly “authentic” behaviour, interactions and content’

EEG AR: Things We Have Lost. John Craig Freeman. 2015.

https://johncraigfreeman.wordpress.com/lacma-art-technology/

Freeman and his team of student research assistants from Emerson College interviewed people on the streets of Los Angeles about things, tangible or intangible, that they have lost. A database of virtual lost objects were created based on what people said they had lost, as well as avataric representations of the people themselves.

I thought these might not be related by had them on:

Clocks. Christian Marclay. 2010.

24-hours long, the installation is a montage of thousands of film and television images of clocks, edited together so they show the actual time.

Cleaning the Mirror. Marina Abramovic. 1995.

Different parts of a skeleton – the head, the chest, the hands, the pelvis, and the feet – on five monitors stacked on top of each other forming a slightly larger than life (human) body. On the parts of the skeleton one sees the hands of the artist scrubbing the bones with a floorbrush.

Inside the Chute

I recorded the sound of trash bags falling inside the garbage chute of my apartment.

Here are some questions and objectives I had before starting the project:

How can I find a recorder that can be (relatively) safely attached to the walls of the chute and have a storage and battery of its own?
Can I get a camera inside the chute to capture the bags falling?
What type of microphone should I use to get as close as I can to the experience of being inside the chute?
How can I extract individual clips of the bags falling
Where do the bags go?

00:00 00:22 17:29 20:44

This final version is made using a small recorder from Amazon, ffmpeg, and Sennheiser’s AMBEO Orbit.

Around 240 clips are extracted by limiting the shorted lengths and db level, so that ffmpeg extracts every clip above -24db and longer than 4seconds. From there I manually sorted through them to find 160 that are actually garbage bags falling. Some clips are loud but not garbage bags falling, such as the rustling of the ziploc bag as I tape the recorder to the chute.

Then the clips are concatenated from lowest db level to highest. I used google text to speech AI plugin with ffmpeg to insert a robot voice saying the numbers before each clip.

AMBEO Orbit is a VST Plugin that can be used inside audition to imitate the effect of a binaural recorder. I manually edited the concatenated clip to achieve a similar effect.

All scripts were written with the help of ChatGPT and YouTube tutorials.

I also tried versions of surround sound, binaural effect using pydub, concat from short to long, and concat without the robot voice counter. You can find them and the python scripts I used here:

(Link to google drive)

What I tried but didn’t work out:

Night Vision camera

The bags falling did not trigger the camera to capture, while the recorder captured sound of bags falling during the same time period.

2. Training a model

tutorial for reference

I followed tutorials on YouTube to train a Tensorflow model with the 160 recordings of bags falling, and around 140 clips of silence and other loud but not bags falling recordings. It is meant to learn from the spectrograms like this:

example of my result from the preprocessing function of the training model

I got a model, but it failed to recognize the sound of garbage bags falling in a 2h long recording (which I had left out of the training data).

Due to the time limit, I don’t understand the vocabulary tensor flow uses (like these), which could be the issue:

It may also be the training clips I used that are not well chosen. This will take more time to learn and change.

3. Graphing

I was able to use Matplotlib with ffmpeg to dynamically graph the timestamps of the video at which the garbage bags were falling. This would work better if the learning model could successfully distinguish garbage bags from other sounds.

Video Player

Media error: Format(s) not supported or source(s) not found

Download File: https://courses.ideate.cmu.edu/60-461/f2024/wp-content/uploads/2024/10/graph.mp4?_=2

00:00

Use Up/Down Arrow keys to increase or decrease volume.

4. Anaglyph binaural plugin

This is also a VST3 plugin that has a few more features than AMBEO Orbit, but it did not show up in audition or garbage band after installing.

http://anaglyph.dalembert.upmc.fr/index.html

Typology Machine: Work in progress

Garbage ASMR?

Recording the sound of garbage bags dropping down the trash tube in my apartment, ideally using multiple microphones to create a sound field, or securely attaching a binaural microphone to the wall of the tube on the top or bottom floor.

2. A infrared or other type of specialized camera (wildlife camera) that records a short video of the trash bags falling down the tube from the top of the tube. (Unclear whether filming the contents of the trash is legal).

3. Ideally, matching the audio with the image.

Ideas from lecture:

Subtitles of the contents of the trash paired with the image of trash falling.
Garbage remix.
Presentation.
Android phones taped to wall.
making a spy bug with circuit playground/some other small physical computing stuff?

Reading 1: Photography and Observation

Artistic opportunity:

Inherent and observational biases (although inherent bias would be more interesting for me). The creation of a single photography over a length of time, potentially capturing the transformation of an environment of object (there is a piece in the Carnegie Museum of Art, in which the artist showed several cloths pieces that were stained by a slowly corroding ecosystem (Lucy Raven I think)). Patterns, finding the line inside a bunch of random dots.

Comments:

I’ve always found the idea of complete passivity very strange because it only gets as far as to being an imitation of the human vision and understanding of the world. Such passive observation is only objective to the extent that it is universal to the human senses—in a perfect world. In our world, what is “universal” to human senses is simply defined by a dominating group of people that has say in the matter.

In a similar sense, the camera cannot really see. It collects data and visualizes such data in a way optimized to human vision (or hearing). Nowadays, with common-day photography being almost completely electronic, the physicality of photogrammetry in the late 1800s is a reminder of how arbitrary it can be. The discussion on astronomy reminded me of earlier times: with Brahe and Kepler (16th and 17th century) there was also a huge reliance on their measuring apparatuses. In their case it was the accuracy of huge system of whatever giant rulers they used to measure the position of stars. Here, rather than directly measuring the doom, the sky is transitioned, either pictorially or unconventionally, onto another medium and then measured, bringing new risks.

It doesn’t seem to me that we really moved from ideal/reasoned representations to specific/particular ones. The chase to formulate and perfect overarching truths led to the transformation of apparatus and method, which in turn allowed the discovery of more particulars and “random” cases (and then comes the problem of particulars to universal, the problem of induction in science and so on). Seeing the collections of photos from page 44 onwards felt like reading statistical graphs. In my very shallow impression of scientific methods, the goal of collecting specific cases seemed to always be to create a better version of the ideal.

Pocket Postulating

EM3D: Ethan Makes 3D Scanner

This was the app I wrote about in looking outward 3 and had used it to make some incomplete scans of myself. The scans below of stuffed animals are still not great conventionally, but I found that rotating the object itself works better than moving the phone around it if you’re scanning alone. The app works for human figures if there’s someone else there to scan you, and small objects in a clean environment. The app does a decent job of capturing the fur. In the second scan of the fox, I was holding it by the head, and the scanner captured some of my hand and clothing. It makes the object look like a planet with satellites floating around it. For projects we may want cleaner scans.

Video Player

Media error: Format(s) not supported or source(s) not found

Download File: https://courses.ideate.cmu.edu/60-461/f2024/wp-content/uploads/2024/09/Scene_02-1.mp4?_=3

00:00

Use Up/Down Arrow keys to increase or decrease volume.

Video Player

Media error: Format(s) not supported or source(s) not found

Download File: https://courses.ideate.cmu.edu/60-461/f2024/wp-content/uploads/2024/09/Scene_03-1.mp4?_=4

00:00

Use Up/Down Arrow keys to increase or decrease volume.

Slow Shutter

I used the two different modes of the camera: motion blur and light trail. Good subjects would be fast moving (or just moving, since the shutter speed and blur strength can be changed) objects, and for the light trail mode glowing objects would be an optimal choice.

The method really depends on whether the subject is fixed / in motion. If the object is moving then a fixed camera can produce the effect of only the subject being in motion while everything else stands still (the case of me shaking my hair and the bear moving around). In the light trail mode where I photographed my light, I moved the camera since my light is unmoving.

Ghost Vision

I think the SLS camera estimates how far away an object is (LIDAR) and detects people shaped stuff using an AI(SLS?), and marks them with stick figures. It is slightly creepy, but the ultimate effect is that it kept on marking my shelf and other things in my room as a “ghost”. It would be interesting if it the stick figures can be manipulated, but I did not find such function on the app. Perhaps a better LIDAR camera that could record more details would be good for creating nuanced shaded recordings / photos of messy environments.

Video Player