video➦xy oscilloscope (audio)

I really like the xy oscilloscope.
It’s a visualization of stereo audio where the left ear’s audio is plotted on the x-axis, and the right ear’s audio is plotted on the y-axis.

I enjoy this visualization quite a lot.

These are a couple of experiments that exemplify why I like the visualization. The translation is raw enough to see how individual effects applied to the audio literally change its shape. An audio signal is just a series of values oscillating between -1 and 1, so if you have two audio signals, you have enough information to represent them as a 2D graph and see the interesting shapes/forms the two signals trace out together. If you’re deliberate with your audio, you can even draw pictures:
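For a quick sense of what the plot actually is, here is a minimal sketch (not part of the project code) that graphs a stereo WAV file as an XY trace with numpy and matplotlib; the filename is a placeholder:

```python
# Minimal sketch: plot a stereo WAV as an XY oscilloscope trace,
# left channel on the x-axis, right channel on the y-axis.
import numpy as np
from scipy.io import wavfile
import matplotlib.pyplot as plt

rate, samples = wavfile.read("stereo.wav")       # placeholder filename
samples = samples.astype(np.float32)
samples /= np.abs(samples).max()                 # normalize to [-1, 1]

left, right = samples[:, 0], samples[:, 1]
plt.figure(figsize=(5, 5))
plt.plot(left, right, linewidth=0.2, alpha=0.6)
plt.xlim(-1, 1); plt.ylim(-1, 1)
plt.xlabel("left channel (x)"); plt.ylabel("right channel (y)")
plt.show()
```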

But I couldn’t be bothered to be deliberate with my audio, so I figured I’d just run an edge detection algorithm on a video feed, turn the detected edges into a series of points, remap those points to a [-1,1] range, and output them as audio. This would let me take video, transform it to 2 channel audio, then graph that 2 channel audio to (somewhat) reconstruct the original video feed.
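A simplified sketch of that pipeline, using OpenCV’s Canny edge detector and numpy.nonzero (the actual scripts linked at the end are messier, and the capture source here is a placeholder):

```python
# Rough sketch of the video -> audio idea: edge-detect a frame, turn the
# edge pixels into points, remap them to [-1, 1], and treat the result as
# a stereo sample buffer.
import cv2
import numpy as np

def frame_to_samples(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)            # detected edges as a binary image
    ys, xs = np.nonzero(edges)                   # pixel coordinates of the edges
    h, w = edges.shape
    left = (xs / (w - 1)) * 2.0 - 1.0            # x -> left channel in [-1, 1]
    right = 1.0 - (ys / (h - 1)) * 2.0           # y -> right channel, flipped so up is +1
    return np.column_stack([left, right]).astype(np.float32)

cap = cv2.VideoCapture(0)                        # placeholder: webcam or screen-capture feed
ok, frame = cap.read()
if ok:
    stereo = frame_to_samples(frame)             # feed this buffer to an audio output
cap.release()
```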

Initially, I used webcam input for the camera feed, but found that the edge detection required specific lighting to be consistent. Eventually I found a way to instead use video captured from my screen: I have OBS capture the top-right 600×600 pixels of my display, where I keep a window for a screen magnification tool called xzoom, so that I can zoom in on any region of the screen and have it sent as video data to be processed. Here was my final setup

and here’s an xy plot of some audio I recorded with it:
(VOLUME WARNING, THE SOUND IS QUITE HARSH)

Here, I move the zoom around my desktop background and some text in my terminal. You can see a crudely drawn cherry and dollar sign at 1:00, some stars at 1:25, and “excap3” at around 2:15. I’m very happy with how the visuals turned out, and only partially content with the audio. My process for capturing and transforming the video feed was by no means optimal, but I found most of the visual artifacts to be more interesting than obstructive. Why I’m only so-so on the audio requires getting a little technical.

Actually transforming a series of edge-detected points into audio means deciding the order in which you put those points into the audio buffers. Suppose a single frame of edge-detected video gives you, like, 5000 contour points. Those points represent the video data at one point in time. But in a raw signal, you can’t have 5000 values simultaneously; you can only have one value per channel of audio. You can represent 5000 values over the span of 5000 audio samples, but you have to decide which values to represent first. That decision defines what type of sound you get.

Case in point: the left-channel audio contains much more information than the right-channel audio, because the points are put into the buffer in order of their y-values. This is a consequence of using ‘numpy.nonzero’ to get points from the edge-detected frame of video. The function returns the indices in the frame that have non-zero values, ordered top-left to bottom-right by row. The more points detected, the longer it takes to shove all of them through the audio buffer, the longer it takes to reach the points at the bottom of the image, and hence the longer it takes for the values in the right channel to change. It’s a fairly interesting problem that, if addressed in some future iteration of the project, I think would make the audio much more interesting. My issue is mostly with how unevenly the sound is distributed between the channels; I like the left channel’s sound enough that I’m still fairly happy with my results.
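A tiny example of the ordering ‘numpy.nonzero’ returns:

```python
# np.nonzero walks the array row by row, so the row (y) coordinate changes
# slowly while the column (x) coordinate changes quickly within each row.
import numpy as np

frame = np.zeros((4, 4), dtype=np.uint8)
frame[1, 2] = frame[1, 3] = frame[3, 0] = 255

ys, xs = np.nonzero(frame)
print(list(zip(xs, ys)))   # [(2, 1), (3, 1), (0, 3)] -- sorted by row, then column
```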

Here’s a short video I did exploring the tool where you can see my entire screen. I’m running the video-generated signal through my Tidalcycles/SuperDirt setup so I can apply effects (no sound warning, the audio is much quieter here).

The code is in an ungodly state, but here are the scripts I ended up using (godspeed if you try to get it working). I hope to optimize/generalize this system somehow in the future, because I’ve found this video->audio pipeline fairly interesting to explore. What do video processing effects sound like? What do audio processing effects look like? Can you turn a pure shape into a pure tone? It ties image and sound together so tightly that it creates its own little subject of study, which is the type of system I love = ]

jade-final

Visualization of Distracted Mind

This project aims to visualize the human mind’s susceptibility to distraction through the use of eye-tracking technology.

The GIF above illustrates the subject’s repeated attempts to follow the Walking to the Sky sculpture on CMU’s campus with their gaze. The lines represent the trajectory of the eye gaze in each attempt. The eye gaze was tracked by a Unity application running on a Meta Quest Pro. Despite the straight shape of the sculpture, the visualizations show the difficulty of following a straight line in the presence of distracting objects in the surroundings.

People often say that the eyes are the window to the soul. When I saw the new eye-tracking feature released in one of the newest AR headsets, the Meta Quest Pro (shown in the picture below), I became interested in using it to investigate human minds, with eye gaze as a proxy for the mind. For this investigation, I developed a Unity application that tracks the right eye of the user and draws the trajectory of the eye gaze. The app also features record & replay, which generated the output above. The video shows exactly what the wearer sees in the headset. The project and workflow were largely inspired by A Sequence of Lines Traced by Five Hundred Individuals.

Evaluating the current state of the work, it would be better to showcase the idea with a better recording setup. My original plan was to build a 3D eye gaze heatmap like this work by Shopify, but only after spending a lot of time experimenting with the eye-tracking and passthrough features did I realize that this requires a separate 3D model of the space. In the future, I’d like to investigate 3D eye gaze tracking further, along with different experimental setups, for example collaborative gaze work and more scientific investigation into how shared AR artifacts in a space affect people’s perception and experience of that space.

bumble_b-FinalProject

I asked people to write me a secret— any secret— and then erase it…

Some Secrets

My idea for this website was to have a wall of secrets, almost like the wall of a bathroom stall, where people say things they perhaps wouldn’t say without the guarantee of anonymity. And yet, their handwriting keeps a piece of them there forever…

When you hover over a secret on the website, it shows how the person wrote it. It’s an interesting study to see where people may pause to think, or hesitate before continuing.

I also asked the participants to erase their secret after writing it. I feel it’s cathartic to let a thought out into the world and then erase it from existence.

I’ll be honest that though I really like this idea and concept, the execution is far from complete. I’ve been having a really hard time keeping up with my life and school right now, and I started this much later than I’d like to admit. My original proposal is completely different as well, and I never got the chance to talk to Nica and Golan about my change of plans (which I so totally apologize for… my lack of communication the past couple of weeks has been completely on me). I want the chance to flesh out this project more, get more secrets, fix up the website, and make this project more than just the little demo it is now.

The past couple of weeks, I have been working on a friend’s passion project, which was to make a mini karaoke box. My portion of the project was to make a receipt printer print a little receipt of the user’s session once it was complete, with the date and time they visited, the song they chose, a random lyric from the song, and a photo of them singing!

The installation will be up and running by the night of 12/06, so stop by Purnell if you’d like!

Morning View – Final

My project is a video work that meshes together audio and visual techniques. Using footage from a high-speed camera, I can bring a slow-paced element to the fast-paced movement. The visuals are accompanied by audio sourced from my archive. The video acts as an intimate self-portrait.

I was interested in furthering my process from a previous research path in my vogue performance. However, I have discovered a new version of performance within this vignette. The idea of breaking down movement is excellent for studying. I believe that I have created a sort of intimacy through a form of dance that is channeled through intensity.

This work will live in my archive and be further built into more expansive projects. I don’t believe this is something I want to exhibit. It feels very personal, and I need to protect that safe space.

“assembly” Rashaad Newsome

I succeeded in experimenting and teasing out a single subject and idea. I believe I have unlocked a new potential in my media exploration. However, I think as I learn more techniques, more choreographed elements will nurture the dynamic narrative. Refinement is going to be an exciting challenge moving forward.

Inspiration + Connection

KELLY STRAYHORN FAIL-SAFE (11.11.22)


Skrrr- Who?

 

My exploration with Near-Infrared photography: I will go through the portraits I captured, my attempts to color these black and white images, and some of my video footage towards a narrative.

 

Near-Infrared Photographs


Coloring the Near-Infrared images:

  • Photoshop (multicolor), with special thanks to Leo!:

 

 

  • Traditional Printing Method: Cyanotype

 

digital negative

 

8×10 test print

 

15×11 print

 

Narratives:

cuts from footage:



kitetale – Final Project (Snapshot)

Snapshot is a Kinect-based project that displays a composite image of what was captured over time, layered by depth. Each layer that builds up the image contains the outline of objects captured at a different depth range at a different time. The layers are organized by time, meaning the oldest capture is in the background and the newest capture is in the foreground.

Extending from my first project, I wanted to further capture a location over time for my final project. Depending on the time of day, the same location may be empty or extremely crowded. Sometimes an unexpected passerby crosses the location, but they would only be seen at that specific time. Wondering what the output would look like if each depth layer were a capture of the same location from a different point in time, I developed a program that separately saves the Kinect input image by depth range and compiles randomly selected images per layer from the past in chronological order. Since I wanted the output to be viewed as a snapshot of a longer interval of time (similar to how a camera works, but with a longer interval of time being captured), I framed the compiled image like a polaroid, with a timestamp and location written below the image. Once combined with the thermal printer, I see this project sitting at the corner of a high-traffic location, giving passersby the option to take a snapshot of the location over time and keep the printed photo as a souvenir.

Overall, I like how the snapshots turned out. Using grayscale to indicate depth, along with the noise in the pixels, made the output image look like a painting that focuses on shape/silhouette rather than color. In terms of future opportunities, I would like to explore this system further and capture more spaces over a longer period of time. I’ll also be looking into migrating the program to a Raspberry Pi 3 so that it can function in a much smaller space without having to supervise the machine while it’s running.

Example Snapshots:

Individual snapshots

Process

I first used Kinect to get the depth data of the scene:

Then I created a point cloud with the depth information as z:

I then divided the points into 6 buckets based on their depth range, each with an HSB color value (also based on depth).
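Roughly, the bucketing step looks like this (a simplified Python sketch for illustration, not the project’s actual code; the depth range values are placeholders):

```python
# Split point-cloud points into 6 depth buckets and give each bucket an
# HSB/HSV-derived color based on its depth range.
import colorsys
import numpy as np

NUM_LAYERS = 6

def bucket_points(points, z_min=500.0, z_max=4500.0):
    """points: (N, 3) array of x, y, z, where z is Kinect depth (range assumed)."""
    edges = np.linspace(z_min, z_max, NUM_LAYERS + 1)
    buckets = []
    for i in range(NUM_LAYERS):
        mask = (points[:, 2] >= edges[i]) & (points[:, 2] < edges[i + 1])
        hue = i / NUM_LAYERS                        # hue varies with the depth bucket
        color = colorsys.hsv_to_rgb(hue, 1.0, 1.0)  # HSB/HSV -> RGB for drawing
        buckets.append((points[mask], color))
    return buckets
```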

I also created a triangular mesh out of the points in each bucket.

I wrote a function that automatically saves the data per layer at a set time interval (e.g. every minute, every 10 seconds, etc.)

These are different views from the Kinect:

Upon pressing a button to capture, the program generates 6 random numbers and sorts them in order to pull the captured layers in chronological order. It then combines the layers by taking the largest pixel value across the 6 layer images. Once the image is constructed, it frames the result in a polaroid template, with the location and timeframe (also saved and retrieved along with the layers) below the image.
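A simplified sketch of that compositing step (not the actual code; the file layout and timestamped filenames are assumptions):

```python
# Pick 6 saved layer images, sort them into chronological order, and merge
# them by taking the per-pixel maximum across the layers.
import glob
import random
import numpy as np
import cv2

layer_files = glob.glob("layers/*.png")            # assumed save location
chosen = sorted(random.sample(layer_files, 6))     # chronological if filenames are timestamped
layers = [cv2.imread(f, cv2.IMREAD_GRAYSCALE) for f in chosen]
composite = np.maximum.reduce(layers)              # brightest pixel across the 6 layers wins
cv2.imwrite("snapshot.png", composite)             # then place into the polaroid template
```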

 

MarthasCatMug – Final: Volumetric Video of Hair

For my final, I revisited my goal of capturing hair. I aimed to use photogrammetry in video form (also called “volumetric” or “4D” video) to try and capture moving hair. There were a lot of unknown factors going into the project that attracted me. I didn’t know how I was going to obtain many cameras, how I could set up a rig, how I could do the capturing process, or how I could process the many images taken by the many cameras into 3D models that could become frames. I wasn’t even sure if I’d get a good, bad, or an unintelligible result. I wanted the chance to do a project that was actually experimental and about hair though.

In preparation for proposing this project, I looked into the idea/concept of hair movement, and on that subject what I found were mostly technical art papers on hair simulation (e.g., this paper talks about obtaining hair motion data through clip-in hair extensions). Artistically, though, I found the pursuit of perfectly matching “real” hair through simulation a bit boring. I want the whimsy of photography and the “accuracy” of 3D models at the same time.

21 photos, 18 aligned, very big model, about 425k vertices

My process started with an exploration of the photogrammetry software Agisoft Metashape, which comes with a very useful 30-day free trial in the standard version. I experimented with taking pictures and videos to get the hang of the software. My goal here was to find the fewest number of photos (and therefore cameras) needed to create a cohesive model. It turns out that number is somewhere just below 20 for a little less than 360-degree coverage.

I was able to borrow 18 Google Pixel phones (which all had 1/8th-speed, 240 fps slow motion), 18 camera mounts, a very large LED light, several phone holders, a few clamps, and a bit of hardware from the Studio. I was then able to construct a hack-y photogrammetry setup.

Since the photogrammetry rig seemed pretty sound, the next step was to try using video. After filming a sample of hand movements, manually aligning the footage, and exporting each video as a folder of JPEGs, I followed the “4D processing” Agisoft write-up. This (no joke) took over 15 hours (and I didn’t even get to making the textures).
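A simplified sketch of exporting a clip to a folder of JPEGs with OpenCV (not the exact steps I used; paths are placeholders):

```python
# Dump every frame of a synchronized video clip as a numbered JPEG so the
# photogrammetry software can ingest one folder per camera.
import os
import cv2

def video_to_jpegs(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"frame_{i:05d}.jpg"), frame)
        i += 1
    cap.release()

video_to_jpegs("camera_01.mp4", "frames/camera_01")   # placeholder paths
```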

manually synchronizing video

hand test, 720 frames (this was overkill)

Aligning the photos took a few minutes (I was very lucky with this); generating a sparse point cloud took a bit over an hour; generating the dense point cloud took four; and generating the mesh took over 10. I didn’t dare try to generate the texture at that point because I was running out of time. I discovered here that I’d made a few mistakes:

  1. I forgot that the setup I made is geared towards an upright, centered object and not hands, so this test was not the best to start with
  2. Auto focus :c
  3. Auto exposure adjustment :c
  4. Overlap should really be at about 70%+
  5. and “exclude stationary tie points” is an option that should only be checked if using a turntable

So, what next? Cry? Yes :C, but I did also try to wrangle the hair footage I have into at least a sliver of volumetric capture within the time I had.

I think that in a more complete, working, long-form version, I’d like my project to live in Virtual Reality. Viewing 3D models on a screen is nice, but I think there is a fun quality of experience in navigating around virtual 3D objects. Also, I guess my project is all about digitization: taking information from the physical world and not really returning it.

Final Project

Video

A few weeks ago, I was roaming around Hunt Library (I think I was angry at something and tried to escape the MFA hallway). I then came across this worn-out massage chair in the quiet work zone on the third floor. One of my cohort members had once enthusiastically told me about one of these, located at Mellon Institute, so I was eager to try it. I put on the “full body” program. It was very thorough, almost aggressive. I almost couldn’t believe this machine was massaging my butt while I was sitting in a public space. Nobody around seemed to care. I was amused and fascinated; I felt like I had a secret. I wasn’t angry anymore.

There are some dualities to this chair that pulled me in and made me want to engage with it: it’s performing an intimate act, but casually installed in a public workspace. It’s a choreographed machine, directed for physical sensation, designed to release physical tension. There is something sensual about it, almost perverse, and deeply comical and paradoxical.

I went out to film a person wearing a green screen bodysuit while getting a full-body massage from the chair. I was afraid that my friend, who agreed to help me and wear the suit, would feel suffocated, so I told him he could stop at any time and that I only needed a few seconds of footage. But he actually enjoyed the massage, said that the suit was surprisingly breathable, and asked for another full-body round.

In a classic idiotic moment, which perhaps had something to do with the fact that I didn’t ask for permission to film in Hunt and kept waiting for a staff member to walk by and see what we were doing, I forgot the most basic rule of shooting with a green screen suit – I moved my cameras between shots. After my friend left, I did my best to reposition them as closely as possible to the shots with him, but the footage doesn’t exactly align. Choosing to start filming at 1 pm was another mistake, because by the time I filmed the last shots, the outside light, clearly visible due to the chair’s position against big windows, had already changed.

I think of this project as a primal experiment that surprisingly (to me) relates to themes I have worked around in my practice before (like a blind participant in an impressive experience as a performer). It is not precise or whole at the moment, but I did find some significant points of interest in the process and in the resulting footage.

A bit from behind the scenes:

Behind the scenes

Qixin‘s Final Project

Centri-Petal

time-based & mixed-media art

Qixin Zhang, Dec 2022

I am interested in the “in-between” of natural/organic/physical movement (in this case, water movement / turning movement / glass reflection) and technological/synthetic/computational movement (a slit-scan program, a changing light pattern): how water/waves/light/reflections move, how pixels move.

An experiment with and exploration of a workflow: integrating the Japanese ink painting technique suminagashi (including both the process and the result) with other live image making, and visuals for live performance.

set-up:

mounted camera – cake stand – laptop – TouchDesigner – projector

night version:

Slit scan has an element of time: glass with a smoothly changing light pattern

strobe light, with sound

 

change light direction

 

day version

Suminagashi: as the ink and water spread out, they create a figure-ground relationship and a temporal relationship. Slit scan: the direction of the movement and time.
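For reference, the core slit-scan idea can be sketched in a few lines of Python/OpenCV (the piece itself runs in TouchDesigner; the clip name here is a placeholder):

```python
# Basic slit scan: take one column of pixels from each frame, so time
# becomes the horizontal axis of the output image.
import cv2
import numpy as np

cap = cv2.VideoCapture("suminagashi.mov")     # placeholder clip
columns = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mid = frame.shape[1] // 2
    columns.append(frame[:, mid])             # center column of this frame
cap.release()

slit_scan = np.stack(columns, axis=1)         # (height, num_frames, 3) image
cv2.imwrite("slit_scan.png", slit_scan)
```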

 

result on paper:

 

Special thanks: STUDIO for Creative Inquiry, Prof. Golan Levin, Prof. Nica Ross, Kyle McDonald, Matthias Neckermann


miniverse – bad deepfake tennis

inspo:

https://courses.ideate.cmu.edu/60-461/f2022/hunan/11/03/hunan-personintime/

https://courses.ideate.cmu.edu/60-461/f2022/skrrr/11/08/lost-in-time/

what if I offset a segmentation map by a constant in time?

Find a location to best insert this method.

I was thinking tennis because the contact point with the ball is important.

Plan: use YOLOv5 to segment the ball out. It contains a “sports ball” category that is basically a high-level circle detector.

Replace some of the code in YOLO to spit out a segmentation mask I can then use:

^ use the mask to remove the ball and inpaint the video using cv2.inpaint
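A sketch of that step, assuming a binary ball mask from the modified YOLOv5 output (not the exact code):

```python
# Remove the detected ball from a frame: dilate the mask slightly to cover
# the ball's edges, then let cv2.inpaint fill in the hole.
import cv2
import numpy as np

def remove_ball(frame, ball_mask):
    """frame: BGR image; ball_mask: uint8 mask, 255 where the ball is (assumed)."""
    mask = cv2.dilate(ball_mask, np.ones((7, 7), np.uint8))
    return cv2.inpaint(frame, mask, 3, cv2.INPAINT_TELEA)
```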

People missing their tennis forehands(?) backhands(?) 

final result! frame offset is 20.