Computer Vision

Extraction of selected data and application of metrics

Toby Walsh interview, interpreted by machine vision software. Video by Rune Saugmann.

Even after the ‘capture’ performed by digital cameras, visual data is not given. Instead it is re-made: selected, processed and formatted to be readable and usable by computer vision software. The majority of the ‘data’ captured is discarded in this process, while other parts are synthesised and augmented, even before the (necessarily political) metrics that try to capture something about the social world are applied to it.
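
To make the scale of this selection concrete, here is a minimal Python sketch of the kind of preprocessing a recognition model typically applies before ‘seeing’ an image at all. It is not any specific system’s pipeline; the file name is hypothetical and the sizes and normalisation constants follow common ImageNet-style conventions.

```python
from PIL import Image
import numpy as np

# Hypothetical input: a multi-megapixel photograph straight from a camera.
# EXIF metadata, colour profile and any non-RGB channels are discarded here.
image = Image.open("capture.jpg").convert("RGB")

# Typical recognition pipeline: shrink and crop to the model's fixed input
# size. A 4000x3000 frame becomes 224x224 -- over 99% of pixels are gone.
image = image.resize((256, 256), Image.BILINEAR)
left = (256 - 224) // 2
image = image.crop((left, left, left + 224, left + 224))

# Re-scale intensities to floats and normalise with dataset statistics
# (ImageNet means/stds shown). The values the model 'sees' no longer map
# directly onto luminance in the photographed scene.
pixels = np.asarray(image, dtype=np.float32) / 255.0
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
pixels = (pixels - mean) / std
```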

How are digital images turned into security knowledge, and what does this ‘turning into’ do to the subjects of (in)security? The short answer is that, most often, the image is regarded as evidence of a world external to the digital imaging apparatus, as evidence of what happened somewhere. But technologies relying on computer vision exceed this simple equivalence, even if they also build on it. For this reason I am interested in the processes that take place between the recruitment of a digital image into such a system and the report of an output metric, whether that is a score evaluating a likely match against a database, a probabilistic identification of commonalities in image composition (scene detection) or of the presence of some category of objects in the image (object detection), or some other output metric.
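
The ‘output metric’ endpoint can be illustrated schematically. The sketch below is an assumption-laden toy, not any vendor’s system: it stands in for a database-match scenario in which a query embedding (which, in a real system, a face or scene recognition model would produce) is compared against stored embeddings, and the whole image is reduced to a single best-match score.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Score in [-1, 1]; higher means the embeddings are more alike."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-ins: real embeddings would come from a recognition
# model; random vectors are used here only to make the sketch runnable.
rng = np.random.default_rng(0)
database = {name: rng.normal(size=128) for name in ["id_001", "id_002", "id_003"]}
query = rng.normal(size=128)

# The image, and whatever social situation it depicts, is reduced to one
# metric: the identity and score of the highest-scoring database entry.
best_id, best_score = max(
    ((name, cosine_similarity(query, emb)) for name, emb in database.items()),
    key=lambda pair: pair[1],
)
print(f"match: {best_id}, score: {best_score:.3f}")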

Being interested in these processes is difficult for a social scientist, but I think that these behind-the-scenes digital transformations are where a lot of the social changes coming with the digital society are located. We’ve all seen how social media has transformed debate, in ways unforeseen even by its creators, and I think we’ll see changes on the same scale when our understanding of the social world is mediated by recognition technologies. I don’t know exactly how, but the short video I made tries to make the gap between human and machine ways of seeing apparent. The video shows recognition technologies applied to my interview with activist academics involved in debating the role that AI and recognition technologies should play in war and weapons systems.

It’s clear in the video that the recognition system registers radically different properties of the social situation than the average viewer would. Its readings are not wrong, just ill-suited to some understandings of the social situation, while suited to others.

These texts think about technological forms of seeing and how they relate to security and violence.


Saugmann, R. (2018). The Art of Questioning Lethal Vision: Mosse’s Infra and Militarized Machine Vision. [Unpublished manuscript]


Saugmann, R., Möller, F., & Bellmer, R. (2020). Seeing like a surveillance agency? Sensor realism as aesthetic critique of visual data governance. Information, Communication & Society, 23(14), 1996–2013. https://doi.org/10.1080/1369118X.2020.1770315

The implementation of computer vision-based security governance is often shrouded in secrecy and subject to sloppy oversight. Sara Pynttäri, who was my intern in 2023, wrote a short report on such technologies in Finland, Recognition technologies in Finnish Public Order.
