Training GAN with runwayML


    • Process:
      1. Instagram image scrape using selenium and bs4.
        • This part took me the most time. Instagram makes scraping really difficult, and I had to make several dummy test accounts to avoid having my own account banned.
        • Useful Tutorial:
      2. Image cleanup, crop (square) and resize (640×640px)
        • After I scraped this ig account’s entire page, I manually cleaned up the dataset to get rid of any images with faces, more than two hands, or no hands at all. The remaining 1263 pictures were cropped and resized so that runwayML could process them more easily.
      3. Feed dataset to runwayML and let it do its magic.
    • Result: Link to google folder Link to runwayML Model
      • Zoomed out, the generated images definitely look visually close to the ig:subwayhands images.



However, looking at the enlarged generated images, they are clearly not quite at the level of actual hands. You can only vaguely make out where the hand will be and what posture the hand or hands are in.
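The crop-and-resize step in the process above could be scripted roughly like this. This is only a minimal sketch, assuming the Pillow library and a centered crop; the function names and folder arguments are hypothetical, and it is not necessarily how I prepared the actual dataset.

```python
from pathlib import Path


def center_square_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def crop_and_resize(src_dir, dst_dir, size=640):
    """Center-crop every JPEG/PNG in src_dir to a square and save it at size x size."""
    # Assumption: Pillow is installed. It is imported here so that the
    # box helper above can still be used without it.
    from PIL import Image

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # skip anything that is not an image file
        with Image.open(path) as img:
            square = img.convert("RGB").crop(center_square_box(*img.size))
            resized = square.resize((size, size), Image.LANCZOS)
            resized.save(dst / f"{path.stem}.jpg", quality=95)
```

Calling `crop_and_resize("raw", "clean")` would write 640×640 JPEGs into the `clean` folder (both folder names are placeholders). A centered crop can cut off hands near the edge of a photo, so some pictures would still need manual cropping.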

My Hair

    • process:
      1. scan my hairs:
        • I had this notebook full of my hairs that I collected throughout my freshman year. I scanned them into uniform (1180px*1180px) squares, and in the end I had an image dataset with 123 images of my own lost hairs.
      2. Feed dataset to runwayML and let it do its magic.
    • Result: Link to google folder Link to runwayML model
      • Despite the small size of this dataset, the generated images exceeded my expectations and seem to more successfully resemble an actual picture of a clump of shower drain hair. This is probably because the hair dataset is much more regular than the subway hands dataset. The imagery is also much simpler.


  • Technical Implementation:
    • Instagram Scraping Tools/Tutorials
      • Apify – Instagram Scraper: I played around with this tool for a bit. Overall, it’s really easy to use. Pros: fast and easy; I don’t need to do much other than input the Instagram page link. Cons: it scrapes a lot of information, some of which is irrelevant to me, so I have to filter and clean up the output file. It also doesn’t directly output the pictures; instead, it provides the display URL for each image, and the URL expires after a certain amount of time.
      • GitHub – Instagram scraper :
        • There is a youtube tutorial showing how to use this.
        • I simply could not get this one to work on my laptop, though it seems to work for others. I need to play around with it a bit more.
      • Python library  – BeautifulSoup:
      • Alternative method: Manually screenshot pictures
        • Although this is boring, repetitive work, it gives me more control over which pictures I use to train my GAN.
    • Pix2Pix: If time allows, I want to look into using a webcam to record hand posture and generate images based on realtime camera input.
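Because the display URLs from the Apify output expire, it makes sense to filter the output file and download the pictures right away. Below is a rough sketch using only the Python standard library. It assumes the output was exported as a JSON list of records and that the image link sits in a field called `displayUrl`; that field name, the function names, and the file names are assumptions, so check them against the actual export.

```python
import json
import urllib.request
from pathlib import Path


def extract_image_urls(records, url_field="displayUrl"):
    """Keep only the image URL from each scraped record, skipping records without one."""
    # Assumption: each record is a dict and the image link is stored under url_field.
    return [record[url_field] for record in records if record.get(url_field)]


def download_images(json_path, out_dir, url_field="displayUrl"):
    """Download every image listed in the scraper's JSON export into out_dir."""
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, url in enumerate(extract_image_urls(records, url_field)):
        # Number the files so nothing depends on the expiring URL later.
        urllib.request.urlretrieve(url, out / f"{i:04d}.jpg")
```

Running `download_images("output.json", "images")` soon after the scrape would save local copies before the links go stale (both names are placeholders).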


I want to make a tool-centric final project, experimenting with training a GAN with images scraped from this ig account:subwayhands.  This was originally an idea for the body tracking project, but I was unable to execute it for that project, so I want to continue with it for the final project.

I think the main reason I want to do this project is that after COVID, I became more conscious of and paranoid about NYC subways. While I was in NYC this Christmas I tried to avoid taking the subway, mainly because of health and other safety concerns. But I still think that the NYC subway has its own unique vibe and holds lots of stories. I have followed the Instagram account subwayhands for a while now, and each picture holds a lot of imaginative potential.

The main challenges for my proposed project are scraping Instagram for enough photos to train the GAN, and whether it’s possible to make it similar to pix2pix, using webcam-captured pictures of my hands as the guiding input to generate an NYC subway hand picture.

project sketch: 


Share puzzle image album: 

Ally and I worked together on this project. We first started with the idea of a collaborative image collage space where users can upload, edit and place their image pieces on a shared canvas. But we struggled with the freehand cropping feature, which is kinda crucial to collaging.

We then pivoted to a collaborative puzzle space, where different users can upload images via URL and each image is processed to look like a puzzle piece. We chose a puzzle shape because it has a well-perceived affordance of being picked up and moved around. The goal wasn’t really to complete a cohesive big picture, but more about forming arbitrary and somewhat random subgroups of connections from different users’ pieces. However, this depends a lot on the user being able to move each piece, which is not supported in the current version. This really detracted from our goal and made the project less interactive and interesting, and more like an online album. It definitely needs a lot more work moving forward.

challenges/next step:

    • image information is stored in a way that makes it hard to retrieve and update
    • each piece needs to be movable
    • not much freedom on the user’s side; no further edits to the picture
    • UI could use more refinement
    • some URLs are not recognized (blank puzzle piece), depending on where the URL comes from


Domestic Tension by Wafaa Bilal, 2007

This is a networked durational performance. The artist confined himself to a gallery space and broadcast himself over the internet. The audience could view him, chat with him, and was also given the power to control a paintball gun to shoot him. Through this work, Bilal intended to highlight the violence and racism of US culture after 9/11.

This work is different from many other net art pieces in that it uses the ‘network’ to portray violence, disconnect, contradiction and the BAD. This work reminds me a lot of Marina Abramovic’s Rhythm 0, 1974. In both works, the artists themselves serve as the medium/net to provoke audience interaction and reveal the underlying BAD.


I thought this one is hilarious. People are intentionally using the flaws of the algorithm to their advantage.

2. The privacy concerns around facial recognition are always a heated debate. Last summer, I was flying from Beijing and there was this new kiosk at the airport where you can simply scan your face and it tells you your flight information right away. At first I was amazed at its convenience because I didn’t need to pull out my boarding pass. But then it struck me that I had never scanned my face when I was checking in, nor when I was purchasing the ticket online. I’m not exactly sure when the system logged my face, and I started to be more concerned about facial recognition and where it’s harvesting its data from.


Prompt: Rick and Mort messing in the white house.

I actually like the image produced around the 50th iteration the most. It best resembles the color and feeling of a Rick and Morty scene. I was mostly interested in seeing whether it would successfully recreate the style of Rick and Morty. It failed in the sense that the images became more and more realistic, losing the sense of color and flatness.



I Feel Like My Life Is Just A Series Of Unrelated Wacky Adventures.

My family and I were watching Once Upon a Time the other night (I haven’t seen it in years.
I wasn’t surprised when it made it to ABC).
It was about time to head to bed.
My sister Kori had been trying to use a lipstick without a mirror, but in the dark, she hadn’t seen it.
All she had seen was her hand.
And all she could remember was seeing her hand.
That’s right.
Narrative Device:
Both results read naturally. However, I definitely think the Inferkit paragraph is better suited to the prompt, whereas the Narrative Device paragraph is completely different from what I was expecting.


It’s actually really hard to imagine what the outcome will be just based on the sliders. The resulting images aren’t really what I was expecting, but they do have somewhat distinguishable features from the genes.


Above are two examples I made with edges2shoes. I noticed that it works best if you draw realistically with perspective (image 1), whereas a highly stylized 2D drawing (image 2) doesn’t render as well.