FinalProjectDraft – Experimental Capture

Project Proposal (Final) Synth Play

Pipeline/s

Overall (Edited) Pipeline:

Mic – computer – little bits (arduino + cv) – audio amplifier – arp

Inside the Computer: Teachable machine to p5.js to max msp out to the analog pipeline

Themes:

Dealing with Loss – Audio Degeneration

My concept is focusing on the idea of losing someone close to you for whatever reason and I want to make a performance out of this using the arp 2600. The digital (capture ) component of this is focusing on removing various phonemes from my voice in real time to either single out or remove completely the specified sounds.

First step – get the system working to use cv to send out different qualities of the signal as a cv to control the arp. This is working

https://drive.google.com/file/d/1XrmtC7oAI06D0D4_hy1Dk09lZ60820F4/view?usp=drive_link

Training using teachable machine – finding it’s quirks – consistency in volume I think vocal dynamics isn’t such a great way to train this model – prediction is that it becomes confused

Current headache – Max has odd syntax quirks that aren’t currently compatible with the arduino syntax <numbers>, however they definitely want to talk to each other.There is some conversion edit I have to make. When I send information i get an error which ends the process of sending the numbers, but I get a blink confirming that my setup is almost correct. – just found out / solved it !!!

Next steps – combining the teachable machine that is in p5.js(thanks golan), into max, then getting an output – transforming that output and hence sending it out to the arp. Performance (yay)

Motion Capture Final Project

Objective

I aimed to explore capturing dance movements through a Motion Capture (Mocap) system, focusing on understanding its setup and workflow while creating an animation from the captured data.

Process

System Testing:

I used the Mocap Lab in the Hunt Library Basement. There are 10 motion capture cameras mounted to capture movement in the space.

Challenges and Adjustments:
Calibration was essential for accurate capture, involving a wand with sensors to determine the cameras’ locations.
Initial calibration was poor due to system neglect.
- Solution: Adjusted camera positions to improve calibration.
- Result: Calibration accuracy improved but hardware issues persisted, making complex motion capture difficult.
https://drive.google.com/file/d/1kRd9X2ERyBjxDxj7PbBXGGeyQ92Uary9/view?usp=sharing

Recording:

I invited a friend, a costume design student from the School of Drama, to perform a short ballet routine for capture.

Challenges:
- Hardware instability led to unreliable data capture.
- Export of the ballet data was unsuccessful due to system restrictions.
- Recorded video of the session was preserved for reference.

Rigging:

First Attempt:

https://drive.google.com/file/d/117iZ76MnFCeKrLIg_D61js7rksitm501/view?usp=drive_link

Second Attempt:

- - Model: Downloaded a pre-rigged character from Mixamo.
  - Data: Used test data due to the ballet motion file’s export failure.
  - Outcome: Successfully animated the pre-rigged model using test data.

Next Steps

Locate the original ballet motion data and reattempt the export process.
Redo rigging and animation with the captured dance motion.
Explore finding a model that better aligns with the my conceptual design and hopefully build a small scene.

Special Thanks to Sukie Wang – Ballet Performer

Final Project – Lingering Memories

WHAT

Previously…

Person In Time_ Cooking in Concert

I want to capture these memorable last moments with my college friends in a form where I can visit later on in the years when I want to reminisce.

I decided to document a representative “group activity” which is how me and my roommates move around and work together in our small kitchen by setting up a GoPro camera and observing our moving patterns.

I then recreated a 3d version of scene so that we can view the movements more immersively. I wanted to work with depth data in this next iteration of the project, so I used a Kinect this time.

Azure Kinect

Design and validation of depth camera-based static posture assessment system: iScience

Kinect is a device designed for computer vision that has depth sensing, RGB camera, and spatial audio capabilities.

My Demo

WHO

After my first iteration capture of a moment when we cook together, I wanted to choose another moment of interaction. With the Kinect I wanted to spatially capture my interaction with my friends when we regularly meet every Saturday 7pm at our friend’s apartment to watch k-dramas together. The setting is very particular because we watch in the living room on the tv we hooked up the computer to and all four of us sit in a row on the futon.

I was especially drawn to this 3D view of the capture and want to bring it into Unity so I could add additional properties like words from our conversation and who is addressing who and so on.

HOW

Now comes my struggle…:’)

I had recorded the capture in a file called mkv, which is a format that includes both depth and color data. In order to bring it into Unity to visualize this I would need to transform each frame of data as a ply, or point clouds.

I used this Point Cloud Player tool by keijiro to display the ply files in Unity. And I managed to get the example scene working with the given files.

However, I faced a lot of trouble converting the mkv recording into a folder of ply files. Initially, it just looked like this splash of points when I opened it in Blender.

After bringing it into MeshLab and playing with the colors and angles, I do see some form of face. However, the points weirdly collapse in the middle like we are being sucked out of space.

Nevertheless I still brought it into Unity, but the points are very faint and I could not quite tell if the points above are correctly displayed below.

Next Steps…

Find alternative methods to convert to ply files
1. Try to fix my current python code
2. Or, try this transformation_example on the Github (I am stuck trying to build the project using Visual Studio, so that I can actually run it)
Bring it into Unity
Add text objects for the conversations

Final Project Draft: Bridge Typology Revision

For my final I decided to build upon my typology from the beginning of the semester. If you’re curious, here’s the link to my original project.

In short, I captured the undersides of Pittsburgh’s worst bridges through photogrammetry. Here’s a few of the results.

For the final project, I wanted to revise and refine the presentation of my scans and also expand the collection with more bridges. Unfortunately, I spent too much time worrying about taking new captures/better captures that I didn’t focus as much as I should have on the final output, which I wanted to ideally be either augmented reality experience, or a 3d walkthrough in Unity. That being said, I do have a very rudimentary (emphasis on rudimentary) draft of the augmented reality experience.

https://www.youtube.com/watch?v=FGTCfdp8x4Q

(Youtube is insisting that this video be uploaded as a short so it won’t embed properly)

As I work towards next Friday, there’s a few things I’d like to still implement. For starters, I want to add some sort of text that pops up on each bridge that has some key facts about what you’re looking at. I also currently don’t have an easy “delete” button to remove a bridge in the UI, but I haven’t figured out how to do that yet. Lower on the priority list would be to get a few more bridge captures, but that’s less important than the app itself at this point. Finally, I cannot figure out why all the bridges are somewhat floating off the ground, so if anyone has any recommendations I’d love to hear them.

I’m also curious for feedback if this is the most interesting way to present the bridges. I really like how weird it is to see an entire bridge just like, floating in your world where the bridge doesn’t belong, but I’m open to trying something else if there’s a better way to do it. The other option I’m tossing around would be to try some sort of first person walkthrough in unity instead of augmented reality.

I just downloaded Unity on Monday and I think I’ve put in close to 20 hours trying to get this to work over 2.5 days… But after restarting 17 times I think I’ve started to get the hang of it. This is totally out of my scope of knowledge, so what would have been a fairly simple task became one of the most frustrating experiences of my semester. So further help with Unity would be very much appreciated, if anyone has the time!! If I see one more “build failed” error message, I might just throw my computer into the Monongahela. Either way, I’m proud of myself that I have a semi functioning app at all, because that’s not something I ever thought I’d be able to do.

Thanks for listening! Happy end of the semester!!!!

Final Project | Walk on the earth

Concept

I seek to pixelate a flat, two-dimensional image in TouchDesigner and imbue it with three-dimensional depth. My inquiry begins with a simple question: how can I breathe spatial life into a static photograph?

The answer lies in crafting a depth map—a blueprint of the image’s spatial structure. By assigning each pixel a Z-axis offset proportional to its distance from the viewer, I can orchestrate a visual symphony where pixels farther from the camera drift deeper into the frame, creating a dynamic and evocative illusion of dimensionality.

Capture System

To align with my concept, I decided to capture a bird’s-eye view. This top-down perspective aligns with my vision, as it allows pixel movement to be restricted downward based on their distance from the camera. To achieve this, I used a 360° camera mounted on a selfie stick. On a sunny afternoon, I walked around my campus, holding the camera aloft. While the process drew some attention, it yielded the ideal footage for my project.

Challenges

Generating depth maps from 360° panoramic images proved to be a significant challenge. My initial plan was to use a stereo camera to capture left and right channel images, then apply OpenCV’s matrix algorithms to extract depth information from the stereo pair. However, when I fed the 360° panoramic images into OpenCV, the heavy distortion at the edges caused the computation to break down.

Moreover, using OpenCV to extract depth maps posed another inherent issue: the generated depth maps did not align perfectly with either the left or right channel color images, potentially causing inaccuracies in subsequent color-depth mapping in TouchDesigner.

Fortunately, I discovered a pre-trained AI model online Image Depth Map that could directly convert photos into depth maps and provided a JavaScript API. Since my source material was a video file, I developed the following workflow:

Extract frames from the video at 24 frames per second (fps).
Batch processes 3000 images through the Depth AI model to generate corresponding depth maps.
Reassemble the depth map sequence into a depth video at 24 fps.

This workflow enabled me to produce a depth video precisely aligned with the original color video.

Design

The next step was to integrate the depth video with the color video in TouchDesigner and enhance the sense of spatial motion along the Z-axis. I scaled both the original video and depth video to a resolution of 300×300. Using the depth map, I extracted the color channel values of each pixel, which represented the distance of each point from the camera. These values were mapped to the corresponding pixels in the color video, enabling them to move along the Z-axis. Pixels closer to the camera moved less, while those farther away moved more.

To enhance the visual experience, I incorporated dynamic effects synchronized with music rhythms. This created a striking spatial illusion. Observing how the 360° camera captured the Earth’s curvature, I had an idea: what if this could become an interactive medium? Could I make it so viewers could “touch” the Earth depicted in the video? To realize this, I integrated MediaPipe’s hand-tracking feature. In the final TouchDesigner setup, the inputs—audio stream, video stream, depth map stream, and real-time hand capture—are layered from top to bottom.

Outcome

The final result is an interactive “Earth” that moves to the rhythm of music. Users can interact with the virtual Earth through hand gestures, creating a dynamic and engaging experience.

Critical Thinking

Depth map generation was a key step in the entire project, thanks to the trained AI model that overcame the limitations of traditional computer vision methods.
I feel like the videos shot with the 360° camera are interesting in themselves, especially the selfie stick that formed a support that was always close to the lens in the frame, which was very realistic and accurately reflected in the depth map.
Although I considered using a drone to shoot a bird’s-eye view, the 360° camera allowed me to realize the interactive ideas in my design. Overall, the combination of tools and creativity provided inspiration for further artistic exploration.