Multimodal generative AI: remodelling the sensorium.
Journal article   Peer reviewed


Erika Kerruish
The Senses and Society, Vol.First online
15/01/2026


Abstract

Keywords: generative AI; multimodal; image; Stiegler; text-to-image; language; the senses
Multimodal generative artificial intelligence (GenAI) intervenes in the technologically, historically, and culturally produced sensorium. With large-scale models centering on vast non-sensual datasets of image-text pairs and, to a lesser extent, audio-text pairs, the sensory hierarchies of digitized media are intensified and reconfigured. As a sense image-making practice, multimodal GenAI coalesces and separates the sense modes in new ways through non-sensuous neural-networked processing, which is embedded in culturally and socially entrenched and layered practices of digitization. In its current form, multimodal GenAI sidelines the role of creative metaphorical meaning-making in communicating sense perception and creating significant worlds. The predominant practice of identifying sense-images with text in datasets introduces what can be understood as a "dumb synaesthesia" that cements relationships between text and sense images, limiting our ability to trace or decide how these systems generate pictures and sound, regardless of our technical skill. Curbing the role of self-reflective thought in the production of sensory images alters the relationship of individuals and collectives to images, with cultural and political implications.
