The application of machine learning to image capture has opened a new field of self-driven cameras able to shoot, edit, and compose photographs without our assistance. Although it can seem new, we, as ‘users’ either in photography or any other skillful task, have been bestowing autonomy on our tools and devices since the industrial revolution. In photography, that first concession is the ‘auto’ mode of our cameras, which analyzes the exposure of the image to define specific camera settings. Cameras have become more intelligent over the years, producing more realistic, contrasted, and saturated images able to mimic our vision. Still, in that scenario, the will to shoot and to decide what is kept inside or outside the frame is ours, and that is why photography as a career still exists. In all photos of this kind, the authorship of the developed product lies with the user. But as cameras evolve into autonomous devices, the responsibility for the style and content of the photo, and eventually even the authorship, could be transferred to the creator of the camera’s software and hardware. Perhaps future photographers will be curators of contextual frames and style books, used eventually to generate countless exhibitions with pictures taken by communities of users equipped with intelligent cameras.
“Cameras that use ML have the potential to both automate existing functions of the camera as a tool for human use and extend its creative possibilities far beyond image capture.”
The relationship between a user and cameras that make choices on their own changes the task of the user from one who captures to someone who creates the situation for the camera to capture. Machine learning cameras turn photographers into curators or installation artists. Authorship would still lie with the user, since without their input there would be no system. Although these cameras can “do” more, the expectations of these cameras would be no different from those of developing film. A captured situation is still present, yet the type of outcome cannot be absolutely guaranteed. Cameras, like Google Clips, or systems, like Pinterest’s Lens, are sifters. They go through information to choose what may be wanted. Taking a hundred photos and choosing a couple manually is the same process. The camera is still a tool unless it can form or embed itself in the situations that it captures. The tools presented just take a familiar process and put it in an unfamiliar vessel.
The Clips camera system makes me pessimistic about humanity’s developing relationship with artificial intelligence. If photography is an expressive outlet, or a way to communicate perspective, what purpose does this system serve in terms of capturing humanity? We have all of these tools that let us take photos, which have been developed to be “easier” to use, but this only serves to eliminate the control that a person has over the tool (even though there is still control over where the camera is pointed, although they would probably eliminate that too if it were possible). I think a system like Clips also strengthens cognitive barriers hindering our ability to consider our senses from beyond themselves. It’s designed to deliver a product that most closely resembles how something would look if you saw it in person, so the user never has to consider the media as a result of real, physical processes, only as a reflection of what they expect. Not only that, but it decides what is significant content, preventing the user from having to consider what it is that makes content significant. Thinking of the future, if everything we do is aided by artificial guides and standard parameters, what will we all be doing that’s so important to photograph beyond the state’s interest in surveillance?
The Pinterest Lens works in a similar way, but I don’t think it’s as offensive because it’s not a photography tool. It seems to serve more as a search engine that uses image data to associate concepts and things. In other words, this system would work similarly if seeing the photos were removed from the process altogether. You are pointing a device at an object or space, and finding posts about similar objects or spaces.
Bruce Sterling’s concept of future imaging as only computation is intriguing, and similarly unsettling. The implication is that this system of cloud computing is aware of the physical arrangement of matter, meaning whoever controls this system can observe any event taking place, no matter how private. It also means that they can observe people’s dreams, ideas, and feelings via the arrangement and activation of neurons in the brain. Similar things are already being implemented using machine learning to reconstruct a subject’s radiated brainwaves into the image that is being seen by the subject.
The text The Camera, Transformed by Machine Learning introduces us to the new technologies transforming the way we conceive of the photographic camera as a tool or device for image-making. As the article states, the camera has become more than a point-and-shoot object through the emergence and use of machine learning algorithms and software, and the role of the operator has shifted away from the photographic experience.
In this new situation, image-capturing devices (the camera as an object is not necessarily needed anymore) are capable of recognizing and organizing visual data; that is, they can make decisions and take a picture by themselves. The control of time, light, composition, and other variables rests entirely with the device. This implies an existential shift in the relationship between device and operator: the boundaries of authorship become unclear as the photographer is no longer required for the image capture. The camera becomes an independent proxy for the desire to grasp reality through images. In this particular scenario, what does it mean to give agency to the camera? Can we expect it to evolve from an inert tool or extension of the eye into an active collaborator?
While it is known that the idea of a camera without an operator is not new, what is mesmerizing is the idea of an intelligent device capable not only of recognizing visual data and taking pictures but of being a learning entity, able to make choices and to compose and select visual information to recreate the way we understand the subjective experience of photographic and visual capture. As these systems gain experience through continuous learning and access to a vast field of visual information, I wonder: what kind of images will they create as they acquire independence and intelligence over time? Though all this learning originates from explicit human programming, to what extent will these devices and systems influence our visual field with their own subjectivity?
In some of the speculative cameras described in the article, the relationship between user/operator and camera/sensor moves far away from the traditional relationship between a point-and-shoot camera and its user. One way in which this seems to be happening is along the lines of agency. In more traditional cameras, the capture happens at the behest of a user’s specific action and choice to engage with the camera; simply having access to the device does not allow for a capture to occur. In some of the situations discussed by Ervin, agency moves to the beginning of the situation, where the terms under which photographic situations may potentially occur are created. If one has a camera that can take pictures on its own, learn from its actions, and is always operating, then the last opportunity for agency over the capture (in the 20th-century sense of the photographic capture) occurs at the onset instead of at the moment of the performative ‘click’ of the camera. So here we have a situation where the device is always performing at the user’s initial request, and performing on its own in a way that used to require a union of performative relations between person and machine.
This question of the performative agency happening at the beginning of the situation opens up larger questions about the relationship to agency amidst the user/operator and cameras/sensors relationship as cameras out in public begin to take on the computational qualities described by Ervin. What happens to the relationship when the consent of situational agency is removed by increased proliferation of these devices in an unregulated manner that arises out of the conditions of America’s late capitalism? When will this capture performance start? What is the Amazon Ring gonna do to us (performatively)?
The Camera, Transformed by Machine Learning
The act of taking a photo is a very personal, intimate thing. It is a mechanical representation of your perspective, adjusted and dialed in to best record what you are seeing. It is an act of labor that extends past the body in order to capture that moment in time. The camera is both tool and partner in the creation of the image, and often allows the user to extend their vision in ways that biology cannot (zoom, exposure, depth of field, etc.). With cameras becoming more autonomous, I believe that the relationship between user and tool remains intact. Perhaps it has moved into more of a platonic partnership rather than an intimate romance, but the authorship remains the same. Artworks are credited to the user and their materials. The more autonomous the imaging system, the more dependency and trust the user has to place in the system. As machine learning advances, we may have to credit these systems as full-fledged co-authors.
This article makes clear that traditional notions of the camera break down when the machine is given “its own basic intelligence, agency, and access to information.” The labor divide between photographer and camera, human and machine, becomes blurred and inconsistent. Examples such as Google Clips and automatic image manipulation technologies suggest that taking photos or using a camera becomes a more scaffolded activity. It may seem that the increasing ‘agency’ of the camera results in a reduced sense of agency for the user of the camera (the machine is doing more, and the human is doing less), but is it also possible that the singular authorship associated with our notion of the camera is a myth? The agency of the operator, perhaps the individual who presses the button, might be more limited, but there are other people behind each camera (users, designers, engineers, scientists, business interests) who decide which realities are chosen and how they are captured and rendered. Any system of machine intelligence follows policies, and most of the time the policies are defined by human beings. Someone had to define what a “well-composed candid picture” means when designing Google Clips. Portrait filters are based on transient standards of beauty.
Maybe the labor model of the camera today should be one that recognizes the multiplicity of authorship involved in creating an image. Instead of emphasizing giving agency to an individual user of the camera, what would happen if we began to emphasize a more transparent and collaborative relationship between the multiple decision-makers behind the capture and creation of images?
The question of authorship, especially in the realm of new technology, programming, and art, has become a fairly complicated one, and one on which I am still not entirely sure of my stance. In the context of programming computer-generated visuals and images, my base stance/understanding has revolved around the idea that if one writes a program, hacks a technology, or re-contextualizes a preexisting object/tool/etc. to create something, then authorship goes to the creator/hacker/re-contextualizer due to the choices (or unintentional findings) made, which form the “art”.
This topic of authorship is, however, complicated much further when looked at in the context of the new cameras discussed in The Camera, Transformed by Machine Learning. In this new world of the removed capture button, and of cameras and programs with their own agency, I think the place in which “art” occurs is transferred to the subject’s interaction with the lens/capture, rather than the moment of capture itself.
When the camera holds the agency, the subject becomes something between an actor and someone unintentionally walking through a photo being taken. In the context of the actor, there is space for agency in the new interaction that they present in front of the lens; however, in the case of the passerby, the lack of agency removes them from a context that I might consider to be art: they may appear in the photo, but have no hold over authorship.
As any task becomes automated by machines, the relationship between operator and machine changes. For example, sewing clothes once had to be done entirely manually, then more effectively with mechanical sewing machines, and now one person can oversee a massive factory machine mass producing items of clothing. I don’t think these new advancements in cameras are much different than that, though they may feel that way since photography is generally considered a creative act. Many of the boring or difficult parts of photography (like snapping the picture at just the right moment or editing a face to look more beautiful) can now be done with machine learning, which frees up people to do more of the interesting stuff, like choosing what to point the camera at. Even if this itself becomes automated (as in some ways it has), then the “art” just shifts to be something different, like choosing an image from a set, or deciding how to print and display a photo. The person responsible is still the author of the work, but what that means can vary depending on what exactly they did. This is nothing new, either. Even without ML, some photos are carefully arranged and lit in a studio by the photographer, while others are taken candidly “in the field.” Both are artwork, but the art-making act is different. So if a person creates a photograph by setting a smart camera in a certain location and waiting for it to snap a picture, then that decision itself is their art, and that’s how audiences will think about it. It doesn’t make them less of an author, and I don’t think it radically changes the notion of a “camera” either.