kitetale – Final Project (Snapshot)

Snapshot is a Kinect-based project that displays a composite image of what was captured over time through depth. Each layer that builds up the image contains the outlines of objects captured at a different depth range at a different time. The layers are organized by time, meaning the oldest capture is in the background and the newest capture is in the foreground.

Extending from my first project, I wanted to further capture a location over time for my final project. Depending on the time of day, the same location may be empty or extremely crowded. Sometimes something unexpected passes by the location, but it would only be seen at that specific time. Wondering what the output would look like if each depth layer were a capture of the same location from a different point in time, I developed a program that separately saves the Kinect input image by depth range and compiles randomly selected images per layer from the past in chronological order. Since I wanted the output to be viewed as a snapshot of a longer interval of time (similar to how a camera works, but with a longer interval being captured), I framed the compiled image like a polaroid with a timestamp and location written below the image. Once combined with the thermal printer, I see this project sitting in a corner of a high-traffic location, giving passersby the option to take a snapshot of the location over time and keep the printed photo as a souvenir.

Overall, I like how the snapshots turned out. Using grayscale to indicate depth, along with the noise in the pixels, made the output image look like a painting that focuses on shape/silhouette rather than color. In terms of future opportunities, I would like to explore further with the system I developed and capture more spaces over a longer period of time. I’ll also be looking into migrating this program onto a Raspberry Pi 3 so that it can function in a much smaller space without my having to supervise the machine while it’s running.

Example Snapshots:

Individual snapshots

Process

I first used Kinect to get the depth data of the scene:

Then I created a point cloud using the depth information as z:

I then divided the points into 6 buckets based on their depth range, with an HSB color value (also based on depth).
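Conceptually, the bucketing looked something like the following (a simplified Python/NumPy sketch rather than the actual project code; the depth range, bucket count, and names are illustrative):

```python
import colorsys

import numpy as np

NUM_BUCKETS = 6
MIN_DEPTH, MAX_DEPTH = 500, 3500  # illustrative depth range in mm

def bucket_points(depth_mm):
    """Split a depth image into 6 depth-range buckets of (x, y, z) points,
    each tagged with an HSB-derived color based on its depth bucket."""
    h, w = depth_mm.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    valid = (depth_mm > MIN_DEPTH) & (depth_mm < MAX_DEPTH)
    # Map each depth value to a bucket index 0..5
    idx = (depth_mm - MIN_DEPTH) / (MAX_DEPTH - MIN_DEPTH) * NUM_BUCKETS
    idx = np.clip(idx.astype(int), 0, NUM_BUCKETS - 1)
    buckets = []
    for b in range(NUM_BUCKETS):
        mask = valid & (idx == b)
        points = np.stack([xs[mask], ys[mask], depth_mm[mask]], axis=1)
        color = colorsys.hsv_to_rgb(b / NUM_BUCKETS, 1.0, 1.0)  # hue from depth bucket
        buckets.append((points, color))
    return buckets
```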

I also created a triangular mesh out of the points in each bucket.
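Per bucket, the mesh can be built by triangulating the points in the image plane, for example with SciPy’s Delaunay triangulation (a sketch of the idea, not necessarily the exact approach I used):

```python
from scipy.spatial import Delaunay

def mesh_from_bucket(points):
    """Triangulate one bucket's points in the x/y image plane.
    Returns triangle vertex indices into the points array."""
    if len(points) < 3:
        return None
    return Delaunay(points[:, :2]).simplices  # ignore z for the 2D triangulation
```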

I wrote a function that automatically saves the data per layer at a set time interval (e.g., every minute, every 10 seconds, etc.).
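In a sketch (a hypothetical Python version; the interval, paths, and helper names are placeholders), the timed save boils down to checking elapsed time in the update loop:

```python
import time

import cv2

SAVE_INTERVAL_SEC = 60  # e.g. every minute; could just as well be 10 seconds
_last_save = time.time()

def maybe_save(layers, location):
    """Save each depth layer (plus location and timestamp) once per interval."""
    global _last_save
    now = time.time()
    if now - _last_save < SAVE_INTERVAL_SEC:
        return
    timestamp = time.strftime("%Y%m%d_%H%M%S")
    for i, layer in enumerate(layers):
        cv2.imwrite(f"captures/{timestamp}_layer{i}.png", layer)
    with open(f"captures/{timestamp}_meta.txt", "w") as f:
        f.write(f"{location}\n{timestamp}\n")
    _last_save = now
```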

These are different views from the Kinect:

Upon pressing a button to capture, the program generates 6 random numbers and sorts them in order to pull captured layers in chronological order. It then combines the layers by taking the largest pixel value across the 6 layer images. Once the image is constructed, it is framed in a polaroid template with the location and timeframe (also saved and retrieved along with the layers) written below the image.
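The compositing step is essentially a per-pixel maximum across six randomly chosen, chronologically sorted layer captures. A minimal sketch of the idea (the file layout and names are illustrative, not the actual code):

```python
import random

import cv2
import numpy as np

def compose_snapshot(capture_times, num_layers=6):
    """Pick one past capture per layer at random, sort them chronologically,
    and merge them by keeping the brightest pixel across the layer images."""
    chosen = sorted(random.sample(capture_times, num_layers))
    composite = None
    for layer_idx, timestamp in enumerate(chosen):
        img = cv2.imread(f"captures/{timestamp}_layer{layer_idx}.png",
                         cv2.IMREAD_GRAYSCALE)
        composite = img if composite is None else np.maximum(composite, img)
    return composite, chosen  # the timestamps end up under the polaroid frame
```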

 

kitetale – Final Project Proposal

For the final project, I’ll be creating a machine that prints out a space over time. Using a Kinect, each image captured over time will be saved in a database, divided up by the depth of the objects. Images will be processed to only keep the object contours for simplicity. The user can then press a button to print out what the Kinect has been seeing, from the past up until the moment the button was pressed, in one image with a timestamp. The prints this machine provides will take the form of a generative collage, in the sense that the objects selected for each depth are chosen at random from the collection.
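For the contour-only processing, I expect something along the lines of OpenCV’s edge and contour tools to be enough (a rough sketch; the thresholds are placeholders):

```python
import cv2
import numpy as np

def contour_only(gray_img):
    """Reduce a captured layer to object outlines for simplicity."""
    edges = cv2.Canny(gray_img, 50, 150)  # placeholder thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    outline = np.zeros_like(gray_img)
    cv2.drawContours(outline, contours, -1, 255, 1)
    return outline
```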

In terms of where this project would live, I imagine this machine being placed at a high-traffic public location like a subway station. In these public spaces, many people with a diverse range of belongings travel across the same space at different times. Capturing what used to be there over different periods of time in one frame, and delivering it on physical paper upon anyone’s request, would give the audience a greater perspective on how diverse the crowds and objects we share common space with are.

kitetale – Person In Time

Cre·ate (verb) is a participatory video installation that invites the audience to use their hands to create their own butterfly that follows the path of their hands. Regardless of background, anyone can create a butterfly of their own that flies along a distinctive path they defined.

Inspiration:

As AI-based art generation tools like DALL·E become more accessible to the public, I started to wonder what it means to create. What makes someone the author of a creation? If an audience member prompts software like DALL·E, which was developed by another person, and the software generates the artwork, is this work created by the audience, the developer, or the software? Through this project, I wanted to explore what makes a creation belong to an individual and how much authorship viewers take into consideration when seeing a collection of generated creations.

What I decide to capture:

Since creations are often made with hands, I decided to capture the movement of the creator’s hand during the act of creation. I also decided to make this an interactive, audience-participatory project instead of a project focusing on an artisan’s hands, to challenge the general public’s misbelief that only artists can make art.

Inspiration work:

Design-IO’s Connected Worlds project from 2015 is a real-time interactive piece that captures people’s poses and motion along with object placement, then responds with changes in the visuals on surrounding screens. Although my project doesn’t have a real-time interaction component that allows the audience to change the positions of the butterflies at any given time, both projects embody the idea of inviting the audience to make changes in the piece.

Capture System Workflow:

To make the audience the creator, I developed a workflow where the creation happens when the hand motion of opening a fist is detected (as if you’re showing something that you caught or made with your hand).

I used MediaPipe and OpenCV to gather hand gesture and position information. The path a new butterfly follows is determined by the trace of the audience’s thumb root joint. When the audience is ready to create a new butterfly that follows their hand trace, they close their fingers to make a fist and then open the hand to release the butterfly. Doing so writes the path list to a file, which is then read by Blender’s Python at each frame to generate a new butterfly with animation keyframes at the hand path locations. I limited the path list length so that Blender wouldn’t freeze while importing and creating a large number of keyframes.
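The capture side of the pipeline looks roughly like this (a simplified sketch of the MediaPipe/OpenCV part; the fist heuristic, path length cap, and file format are stand-ins for what I actually used):

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def is_fist(lm):
    """Crude fist check: the four fingertips are curled below their middle joints."""
    tips, pips = [8, 12, 16, 20], [6, 10, 14, 18]
    return all(lm[t].y > lm[p].y for t, p in zip(tips, pips))

path, was_fist = [], False
cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            root = lm[mp_hands.HandLandmark.THUMB_CMC]  # thumb root joint
            path.append((root.x, root.y))
            path = path[-300:]  # cap the path length so Blender doesn't choke
            fist = is_fist(lm)
            if was_fist and not fist:  # fist just opened: release the butterfly
                with open("butterfly_path.txt", "w") as f:
                    f.writelines(f"{x} {y}\n" for x, y in path)
                path = []
            was_fist = fist
        if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
            break
cap.release()
```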

                                    (Recorded Trace Shown)

                         (Pipeline view of interaction above)

I made the butterfly model and used a color ramp node in the shader graph to randomize the color of the generated butterfly. I randomized the color with the intention of giving a bit of diversity to the generated butterflies, but looking back, perhaps I should have had the butterflies take the creator’s visual information as part of the generation so that each butterfly could be more customized to its creator. One idea would be having the butterfly wing texture use the creator’s hand image as a base color.
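On the Blender side, randomizing the color ramp per butterfly only takes a few lines of bpy (a sketch; the material name is hypothetical):

```python
import random

import bpy

def randomize_wing_color(material_name="ButterflyWing"):  # hypothetical material name
    """Randomize the Color Ramp stops that drive the butterfly wing color."""
    mat = bpy.data.materials[material_name]
    ramp = next(n for n in mat.node_tree.nodes if n.type == 'VALTORGB')  # Color Ramp node
    for element in ramp.color_ramp.elements:
        element.color = (random.random(), random.random(), random.random(), 1.0)
```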

   

                                           (Installation Views)

As also mentioned in my project proposal, I chose a butterfly as the object to be created because creation starts with one’s idea. Butterflies often symbolize metamorphosis and the soul/mind, which fits well with this project’s exploration of the act of creation (bringing an idea into physical space) and authorship (idea/concept). The collective colorful glow of the generated butterflies lets viewers both appreciate the group of butterflies from afar and identify the specific butterfly they created based on the path it follows.

Reflection:

Despite my initial prediction that viewers would be able to easily identify the butterfly they created based on the trace they drew by moving their hand, I realized this wasn’t the case, especially when there are many butterflies flying around on screen. As I mentioned earlier, I wish I had made the generation function more customizable so that anyone could easily find their creation while also appreciating the overall harmony of their work being part of those of many others. Even though this project was a great opportunity for me to learn and explore MediaPipe and hand tracking using OpenCV, I also realized I could have utilized the hand tracking data more than just gathering the past locations of a specific point on the hand. If I work on this project again, I would also take data on how people move their hands in addition to where they move their hands. I think exploring the how would bring more interesting insights into people in time, as different people move their bodies in different manners even when prompted to do the same movement.

kitetale – two cut

My understanding of the two cuts from the reading is that there’s one cut (the opening cut) that sets up the situation and another cut (the closing cut) that records the process.

In the upcoming person in time project, my opening cut would be the setup of the projectors and Leap Motion that invites people to present their hands in front of the projector to generate their own butterfly, and the closing cut would be the experience of seeing the animated butterfly fly toward the group and join the collection.

kitetale – person in time project proposal

Inspiration:

As AI-generated art becomes more accessible (e.g., DALL·E), I started to wonder what it means to create. If we use an algorithm to generate an image, who is the ‘creator’ of the work? Would it be the AI, the developer, or the user? Or could it be counted as a collaboration across all three? Does the who matter more than the what? I wanted to shift the focus to the output/work instead of the author by making a space where anyone can easily create something beautiful and let their work be part of the many.

sketch:

Since creations are often made with hands, I want to capture the movements of hands and the act of creating and presenting the creation. The motion of opening one’s hands gives the sense of carefully presenting something small and important, so I’m thinking of using Leap Motion to detect the motion of slowly opening palms to reveal what’s inside.

Upon opening one’s palm, a butterfly of randomly selected shape and color would come alive and start flapping its wings on the user’s palm, then fly forward to join the crowd of butterflies.

“The ancient Greek word for “butterfly” is akin to psyche: soul or mind.” Creation starts with one’s idea, and I thought the motion of creating and releasing an idea would be best portrayed by animating a butterfly that later joins the crowd of butterflies presented in front of the user. The collection of glowing, colorful butterflies would represent how we can easily appreciate works by others regardless of the author, as well as contribute to them through the simple motion of opening our palms.

kitetale – Typology Machine

Travel Over Time is a series that captures a collection of passersby’s steps observed over time at a location, from a low angle, in one frame. Each sneaker in the frame was once in that location at some point in time.

How it started:

Curious how observation of the world differs from a viewpoint closer to the floor, I wanted to explore and capture the ‘common’ of the low perspective. I first recorded a small rover’s point of view as it rolled across the road. Not long after recording, I realized I had assumed that all the visuals captured would have automatic stabilization like human eyes: all of the recordings I made with the phone mounted on the rover had extreme motion blur, as the rover rolled very close to the floor and doesn’t have any shock absorption that would stabilize the camera.

Learnings from the first attempt: 

I was surprised by the number of shoes and feet in the initial recordings. Sometimes the view captured was clear (like the snapshots above), and in these moments, there were shoes or animal feet in focus. The iPhone’s autofocus algorithm probably worked better when there was a clear outline of a target object, but these few snapshots made me wonder how many different sneakers the camera would be able to capture from a low angle at a given location over a defined time.

Choices I made:

I chose sneakers because I personally like wearing sneakers the most and am more interested in seeing the diverse designs of sneakers than those of other types of shoes. Sneakers are also a great reference mark to indicate where people once were, since most of the time everyone’s shoes are located on or near the floor regardless of other personal features like height. I also decided to remove all parts of the body and belongings other than the sneakers, to bring attention to the passing of time and the diversity in where people choose to walk on a given pathway.

Process:

Below are the steps I initially planned to take to create Travel Over Time.

Typology Machine Workflow:

    1. Set up a camera near the floor looking at the path where people walk
    2. Record for however long you’d like
    3. Extract each frame from the recordings and get the average visual of the frames using OpenCV (background extraction; see the sketch after this list)
    4. Train an ML model to detect and categorize different sneakers
      1. Scrape the web to collect images of different sneakers using Beautiful Soup
      2. Categorize sneakers into 6 different groups: athletic, design, high-top, low-top, slip-on, street
      3. Annotate each image to indicate where the sneakers are in the image
      4. Use the Cascade Classifier in OpenCV and TensorFlow to create/train a model
    5. Extract sneakers from each frame using the model
    6. Additively overlay the extracted sneakers, at their detected positions in each frame, onto the background image created earlier
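For step 3, the background extraction amounts to averaging every frame so that moving passersby wash out (a minimal OpenCV sketch; the path is a placeholder):

```python
import cv2
import numpy as np

def average_background(video_path):
    """Average all frames of a recording to approximate the empty background."""
    cap = cv2.VideoCapture(video_path)
    total, count = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = frame.astype(np.float64)
        total = frame if total is None else total + frame
        count += 1
    cap.release()
    return (total / count).astype(np.uint8)
```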

Since this was my first experience using OpenCV and ML models, I couldn’t annotate all of the data in time to train my own sneaker detection model. I got general object detection to work using a pre-existing model found online, but it was too general and not accurate enough to use for the project. I also noticed that additively overlaying the detected shoes would not give clean imagery of multiple shoes walking across the same location all at once, since the sneakers are detected as boxes and the overlay could include pieces of background within each detected box.

Instead, I followed the steps below to create Travel Over Time:

    1. Set up a camera near the floor looking at the path where people walk
    2. Record for however long you’d like
    3. Extract each frame from the recordings and get the average visual of the frames using OpenCV (background extraction)
    4. Import frames that have sneakers into Photoshop and delete the pixels that aren’t sneakers
    5. Repeat until the frame is filled with as many sneakers as you want

Evaluation & Reflection:

Last year, I set myself a goal of learning new skills with each project I work on. I took this project as an opportunity to learn more about OpenCV and machine learning, and I’m happy with the knowledge I acquired on them while investigating for this project. Although I didn’t get to fully complete the model training section as planned, I know I can use an image labeling service in the future to get all the images annotated to feed in as training data.

I’m also happy with how the series turned out, since each image captures the different places passersby have chosen to step as they walk through the space. It was also interesting to see the different sneaker designs people wear on campus. In a future series, I could expand further to show the path traces of each shoe by compositing shoe movement trajectories instead of individual frames of shoes.

kitetale – Typology Machine Proposal

I want to focus on the experience of viewing the world from a low angle, namely a small dog’s perspective. What objects are more visible or invisible from an angle closer to the floor? How do dogs find small wild living things like squirrels and bunnies more easily than we do?

I’m planning to mount a camera on a rover, record the scenery as it rolls over different streets, and run computer vision on the recorded video to identify small animals and other living things that would excite a dog. Using post-processing software, I would then highlight the living things identified by CV as the recorded video plays (e.g., blur out the rest and slightly scale up the bounding box area of the identified object in the video). Another way I’m considering presenting the data is to laser engrave/print different snapshots of the CV-identified visuals on wood, potentially in a grid or as a collage on a selected scene background.
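As a sketch of that highlighting idea (blur everything except a detected bounding box; the box itself would come from whatever CV model I end up using):

```python
import cv2

def highlight_box(frame, box, blur_ksize=(31, 31)):
    """Blur the whole frame, then paste back the sharp pixels inside the
    detected bounding box so the identified animal stands out."""
    x, y, w, h = box  # bounding box (x, y, width, height) from the CV model
    blurred = cv2.GaussianBlur(frame, blur_ksize, 0)
    blurred[y:y + h, x:x + w] = frame[y:y + h, x:x + w]
    return blurred
```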

Other note: I haven’t used CV before, so this project would be a great opportunity for me to explore CV as a tool.

kitetale – SEM

I brought a piece of an earplug that I used at a woodshop once. Since it was much bigger than what the machine could take, I ripped off a portion of it. The result was pretty surprising: I was expecting a visual of foam, perhaps something like a sponge, since that is how earplugs contract and expand. Upon a closer look, it actually looked very flaky, as if it were made out of multiple thin layers of something.

I also had a chance to look at others’ objects, which was quite a learning experience. This is pollen found on one of the bugs:

kitetale – Reading – 1

I’ve heard before about how photography was used in the past to capture split-second moments with various mechanisms that are like ancestors of the modern-day camera, but much of my understanding comes from the perspective of film, projection, and illusion. It was interesting to see how photography, as an individual snapshot, can offer so much more than I thought as a starting point for embodying information. Specifically, I found the reading’s discussion of how photogrammetric images have been used in crime scene analysis very interesting, since photogrammetry to me has always been a method that stitches a ton of 2D images into a 3D model.

Even though it is obvious that photogrammetry uses grids and mathematical equations to determine the dimensions of the subject in a snapshot and gain spatial information, I hadn’t really thought about how much information one photo, plus the photographer’s knowledge of where it was taken and at what angle, can offer. One lesson I definitely learned through this reading is that I don’t need millions of data points to construct or learn about a subject; even a single image or two can potentially be enough to give me the insights I want. I see so many artistic opportunities in analyzing a single image with computer vision and/or photogrammetry, which I hope to explore more this semester.

kitetale – Looking Outwards

The artist I want to share is Shane Fu, a motion designer/video creator who adds a new perspective and a sense of fun to mundane city footage. Many of his works add imaginary space to existing physical space by manipulating the footage and adding a small twist to it. Recalling our discussion of the assumptions we make about cameras/photos, his work definitely makes use of the illusion of 3D space on a 2D screen and people’s general belief that photos/footage capture the ‘facts’ of the physical world, presenting a sense of awe to audiences who haven’t challenged these assumptions about the camera as much.

I find his work interesting because he not only utilizes visual effects/3D modeling tools to create his own imaginary space, but also uses object tracking to further persuade viewers that this new space he’s created exists as part of the physical world we inhabit. In a world where anything and everything can be easily recorded and shared online, not that many people seem to care about the truth of what’s being presented to them on screen so much as the entertainment or uniqueness of what they’re viewing. Shane’s works satisfy these needs in the rapidly growing tech generation.

More of his projects on Instagram