George Poonkhin Khut's sensory artwork, Distillery: Waveforming 2012, was the winner of the 2012 National New Media Art Award. This immersive installation artwork is a biofeedback-controlled interactive that utilises the prototype iPad application 'BrightHearts'. Khut has an interest in the continued development of the 'BrightHearts' app to the point of making it available as a download from the iTunes App Store, to be used in conjunction with specialised pulse-sensing hardware. The configuration of Distillery: Waveforming presented in 2012 at the Gallery of Modern Art, Brisbane, incorporated Apple iPad 3rd generation devices running the 'BrightHearts' app, supported by Mac mini computers that processed data and mapped sound and visuals that were fed back to users as animations on the iPads. At the conclusion of the exhibition the artwork was acquired into the Queensland Art Gallery collection. The Curator of Contemporary Australian Art requested that the acquisition ensure that the artwork was captured in perpetuity in its prototype state. The iPad devices underwent jailbreaks to safeguard their independent operation and management, and to allow for the permanent installation of non-expiring copies of the 'BrightHearts' app. Source code for the 'BrightHearts' app was also archived into the collection. This paper describes the development of the artwork and the issues that were addressed in the acquisition and archiving of an iPad artwork.


Figure 1. George Poonkhin Khut, Australia b.1969, Distillery: Waveforming 2012, Custom software and custom heart rate monitor on iPad and Mac mini; signal analysis software: Angelo Fraietta and Tuan M Vu; visual effects software: Jason McDermott, Greg Turner; electronics and design: Frank Maguire; video portraits: Julia Charles, Installed dimensions variable, The National New Media Art Award 2012. Purchased 2012 with funds from the Queensland Government. Image: Mark Sherwood

George Poonkhin Khut’s digital artwork Distillery: Waveforming is a body-focused, controlled, interactive experience. The artwork was acquired by the Queensland Art Gallery / Gallery of Modern Art (QAGOMA) in 2012 and has been the subject of an ongoing dialogue between the artist and the Gallery, through the Head of Conservation and Registration, regarding its long-term preservation. At the heart of the artwork is an individual, human experience, with certain intrinsic elements combining to create this experience. In their endeavour to provide a sound future plan for Distillery: Waveforming, the artist and the Head of Conservation and Registration have questioned ‘the experience’ from their individual perspectives – that of the artist and that of the collecting institution.

Distillery: Waveforming is both an independent artwork and an affiliated outcome of Khut’s long-running work with heart rate biofeedback. This unusual duality plays a significant role in the ways in which the artist and the institution perceive the artwork, its preservation and future installations. Since the artwork’s acquisition into the QAGOMA collection, the artist has remained involved and interested in the Gallery’s management of Distillery: Waveforming. Khut’s progress in his work on the biofeedback project has seen him make significant advances in software development, allowing him to release the iTunes application BrightHearts, which was in development at the time that Distillery: Waveforming was created. These advances in the biofeedback project provide current context to the dialogue and continue to shape the opinions of both artist and institution. Through this collaborative process QAGOMA has been able to build an extensive resource for the long-term preservation of Distillery: Waveforming.


George Poonkhin Khut’s biofeedback artwork Distillery: Waveforming was the winning entry in the 2012 National New Media Art Award (NNMA) held at the Gallery of Modern Art (QAGOMA 2012). The artwork entered the Queensland Art Gallery / Gallery of Modern Art collection at the conclusion of the exhibition. The curator of Contemporary Australian Art requested that the artwork be acquired to accurately reflect its display in the NNMA exhibition – that is, as a prototype.

In 2011 when Khut was invited to enter the NNMA he was working as the Artist in Residence at the Children’s Hospital Westmead. In this residency Khut and his research colleagues commenced the BrightHearts Project that aimed ‘to assess the potential of small, portable biofeedback-based interactive artworks to mediate the perception and performance of the body in paediatric care: as experienced by children undergoing painful recurrent procedures’ (Khut 2011).

Apple iPads loaded with games were already in use for diversion and distraction purposes during painful procedures at the Children’s Hospital Westmead. Khut chose to adapt his work for iPad technology for the BrightHearts Project based on this ‘diversional’ precedent and the excellent optical qualities of the iPad display (Khut 2014). In realising Distillery: Waveforming Khut channelled years of artistic practice in biofeedback and body-focused interactivity in the development of a cross-disciplinary artwork at the core of which was the prototype BrightHearts application (app) for Apple iPad.

When Distillery: Waveforming was displayed in the NNMA exhibition, from August to November 2012, the BrightHearts app was still in development under a short-term Apple Developer licence. At this point in the provisioning, the prototype app generated the visuals on the iPad in response to a multilayered array of messages transmitted from a laptop or desktop computer over a network connection. This approach enabled Khut to quickly prototype a variety of visualisation ideas by adjusting parameters on the desktop computer, without needing to compile and install the app on the iPad each time. More importantly, this networked approach also enabled him to incorporate live heart rate sensor data in a way that was not supported by the Apple operating system (iOS) at the time, before the introduction of the Bluetooth 4.0 wireless standard, and to continue his work with the complex signal analysis, mapping and sonification algorithms that have been central to his work with body-focused interactions since 2003. Essentially, Distillery: Waveforming and the trial therapeutic devices at the Children’s Hospital Westmead were operating as ensembles that included iPads loaded with the prototype BrightHearts app, data collection devices, desktop or laptop computers, and network routing systems.


Acquiring Distillery: Waveforming to reflect its status as a prototype was a curatorial imperative. Khut describes his approach to the long-running biofeedback project as ‘iterative’ and in this regard the artwork is an incremental representation of Khut’s artistic practice and a model demonstration of the developmental BrightHearts app for touch screen devices (Khut and Muller 2005). In the future Distillery: Waveforming will become a legacy artwork intrinsically linked to past and future iterations in the biofeedback project.

Distillery: Waveforming derives from Khut’s earlier work on BrightHearts that commenced in 2011 and his Cardiomorphologies series from 2004-2007. The mandala-like visuals were initially developed for Cardiomorphologies v.1 by John Tonkin using Java programming, which Khut controlled via Cycling’74’s Max (version 4.5) application, a popular visual programming language for Apple and Windows computers. In 2005 the original visualisation software was expanded upon by Greg Turner for Cardiomorphologies v.2, using visuals generated from within the Max application. Turner used the C++ programming language to develop ‘Fireball’, a specialised graphic module (known in the Max programming environment as an ‘object’) that enabled Khut to control the visuals with messages to each layer: for example, drawing a red coloured ring, the width of the screen, with a thickness of 20 pixels, and a green circle with a gradient, with a diameter of 120 pixels (Khut 2014; Pagliarino 2015, pp. 68-69).
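The idea of controlling visuals through per-layer messages can be illustrated with a minimal Python sketch. It is not the Fireball message format, which is not documented here: the `Layer` fields, the `'<layer> <parameter> <value>'` text format and the assumed screen width are all hypothetical and chosen only to reproduce the example of a red ring and a green gradient circle.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """State of one graphic layer (field names are illustrative only)."""
    shape: str = "circle"        # "circle" (solid) or "ring"
    colour: str = "white"
    diameter_px: float = 0.0
    thickness_px: float = 0.0    # only meaningful for rings
    gradient: bool = False


def apply_message(layers, message):
    """Apply one hypothetical '<layer> <parameter> <value>' message."""
    index_text, parameter, value_text = message.split()
    layer = layers.setdefault(int(index_text), Layer())
    current = getattr(layer, parameter)  # AttributeError if the parameter is unknown
    # Convert the text value to the type of the field it replaces.
    if isinstance(current, bool):
        value = value_text == "1"
    elif isinstance(current, float):
        value = float(value_text)
    else:
        value = value_text
    setattr(layer, parameter, value)


SCREEN_WIDTH_PX = 1024  # assumed value, for illustration only

layers = {}
for msg in [
    "1 shape ring", "1 colour red",
    f"1 diameter_px {SCREEN_WIDTH_PX}", "1 thickness_px 20",
    "2 shape circle", "2 colour green",
    "2 diameter_px 120", "2 gradient 1",
]:
    apply_message(layers, msg)
```

Because the renderer only holds layer state and reacts to messages, the mapping logic can change on the sending side without the renderer being rebuilt.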

Then in 2011 Jason McDermott, a multi-disciplinary designer working in the area of information visualisation and architecture, was engaged to re-write Greg Turner’s ‘Fireball’ visualisation software to enable it to run on hand-held technologies with touch sensitive controls. Using the open source C++ library openFrameworks with Apple’s Xcode (version 4) McDermott redesigned and expanded the potential of the software, developing BrightHearts as an iOS 5 mobile operating system application (McDermott 2013).

Development of the app continued when Khut received the NNMA prize worth AUS$75,000 and in April 2014 the BrightHearts app was released into the iTunes store. Heart rate data acquisition and processing is now integrated into the application software and the only external device that is required in conjunction with the app is a Bluetooth 4.0 heart rate monitor that captures the real-time heart rate data. The app is categorised as a Health and Fitness product that can be used to assist with relaxation and body awareness (iTunes 2014).

This history of development, change, modification and repurposing creates a landscape in which Distillery: Waveforming is an important new media artwork. As a legacy artwork the Gallery aims to maintain the component parts and software in their original form and function for as long as possible. Technologies change at such a rapid rate that the artwork will date in the years to come to reflect, quite evidently, an artwork of 2012.  Perhaps future users will consider what are at present beautifully rich and transcendent animations as rudimentary and the touch screen navigation amusing and unsophisticated. Perhaps future users will recapture the sense of appeal that early touch screen devices inspired in consumers. However, it is not the intention of the Gallery to create a sense of nostalgia but to offer insights into the balance between art, technology and science at this fixed point in time.  As a legacy the artwork will be an authentic installation and will offer an unambiguous window into Khut’s interdisciplinary artistic practice.

In its presentation in the NNMA exhibition, Distillery: Waveforming comprised five iPad devices running the prototype BrightHearts app, built into a long, shallow, tilted table at which participants sat on low stools to interact with the artwork. Specifications set by the artist allow the Gallery to modify the configuration for smaller displays of no fewer than three stations in future installations. However, it is necessary that the ambiance of the installation space evoke a sense of calm and contemplation through low light levels, soft dark colours and discreet use of technology. In the original installation the spatial arrangement situated participants in front of three video portraits of the artwork in use (Fig 1). Distillery: Waveforming is a composite artwork incorporating the iPad devices loaded with the prototype BrightHearts app, external data collection and processing equipment, and video portraits displayed on monitors. The combined hardware and software systems include:

  • Five Apple iPads (3rd generation) running the iOS 5.1.1 operating system, with high-resolution Retina display (2,048 x 1,536 pixels at 264 ppi) and dual-core Apple A5X chip, loaded with:
      • BrightHearts app (in development)
      • Cydia – a software application that enables the user to search for and install applications on jailbroken iOS devices
      • Activator app – a jailbreak application launcher for mobile devices
      • IncarcerApp – an application that disables the home button and effectively locks on the BrightHearts app when in use, preventing the user from inadvertently exiting the app
  • Five heart rate sensors incorporating Nonin PureSat pulse oximeter sensors (ear clip type), Nonin OEM III pulse oximetry circuits and Arduino Pro Mini 328 microcontrollers, running specially written code (OemPulseFrank.pde) to receive the pulse data from the pulse oximeter sensors and relay it to the Mac minis via a USB-serial connection
  • Five Mac minis 5.2, 2.5 GHz dual-core Intel Core i5 processor, 4GB RAM, OSX Lion (10.7.5) operating system, loaded with:
      • Max6 application (Cycling74, 2012)
      • Custom written scripts running from the OSX ‘Terminal’ utility that receive pulse data from the sensors via USB port and pass it along to Max6
  • One 5 GHz network router that transmits control data from the Max6 software on the Mac minis to the corresponding iPads
  • Three digital portraits displayed on 40” LCD / LED monitors hung in portrait orientation
      • Video portrait files include MPEG-4, QuickTime ProRes and AVC file formats
  • Five headband-style stereo headsets

The prototype BrightHearts app for Distillery: Waveforming was written for Apple iPad (3rd gen) models running iOS 5.1.1. It was written under a short-term Apple Developer licence that allowed for provisioning and testing of the app on multiple devices. The licensing arrangement for the app expired in July 2013, nine months after the artwork was acquired into the collection. A key aspect of archiving this artwork was the need to gain control of the app. In advance of the expiration, the Gallery implemented an archiving strategy that was developed through consultation between the conservator, the curator and the artist, who was in contact with the software developer.

The most challenging aspect of the acquisition for the collecting institution was the long-term management of the proprietary technology and software. At the time of acquisition the prototype BrightHearts app was capable of performing its function with external support but did not have status as an independent Apple-approved application. In fact, its completion and approval were still one and a half years away. It was also important for the iPad operating system to be locked down to iOS 5.1.1, as the prototype BrightHearts app for Distillery: Waveforming will only launch in this version. Through a consultative process it was agreed that, to administer the artwork as an authentic prototype, it was necessary to increase the end user control of the technology and software. This was achieved through jailbreaking the iPad devices and loading a non-expiring copy of the prototype BrightHearts app on the iPads (Pagliarino 2015).


Distillery: Waveforming has been acquired with the intention of maintaining authenticity and as such the Gallery has archived a full complement of digital files for the artwork. Included in this is source code for compiling the BrightHearts application with Xcode and source code for compiling the pulse-sensing Arduino microcontrollers for which Khut owns both copyrights (Khut 2012).

In conventional object-oriented programming, source code (a programming sequence in readable text) outlines the steps that are necessary to compile software and make it function as intended, for example as an app for an iPad. The source code has to be interpreted or compiled by a programmer in order to create the necessary machine code, for example using Xcode if the work is developed for Apple OSX. Acquiring source code is thought to be a means of future-proofing digital artworks (Collin and Perrin 2013, p.52). This is undeniable, as without the source code there is very little that can be used as a structural guide. However, Laforet et al. (2010, p.27) question whether source code can really act as a safety net for software artworks in an age where there is a strong commercial imperative driving the development of digital technologies at the expense of the conservation of data. The success of source code in future-proofing artworks relies on accurate interpretation and, in the context of an authentic experience, a complete lack of bias towards alternate or more efficient ways of programming the software to run an artwork as it was intended.

In cases where an artwork was developed using a suite of applications and programming languages, documenting source code becomes a complicated task in comparison to artworks where the source code is contained entirely within a single object-oriented programming environment. The programming for Distillery: Waveforming is distributed across three operating systems and four programming languages: Arduino for the sensor hardware; Objective-C and iOS 5 (via Xcode 4) for the BrightHearts app; OSX for the desktop computer that operates as a terminal emulator, running sensor data routing and analysis processes; and, most significantly, Max, the visual programming application that is used to perform the core analysis, mapping and sonification processes between the incoming heart rate data and the outgoing messages controlling the appearance of the various layers of the iPad visuals and sounds. Laforet et al. confirm that the difficulties faced with software artworks created by individual programmers are that:

These projects are relatively small efforts, putting the work created with it in a very fragile position. Unlike more popular software and languages, they are not backed up by an industry or a community demanding stability. The software only works under very specific conditions at a very specific time. Migrating such a work is a tremendous task, likely to involve the porting of a jungle of obscure libraries and frameworks. (Laforet et al. 2010, p. 29)

The complexity of combining multiple source codes from various programming platforms to work within one artwork significantly increases the risk of error in interpretation. In the case of Distillery: Waveforming it seems highly unlikely that source code alone would be sufficient to recreate the artwork in future. Khut has recognised this and has considered alternate bespoke and existing documentation systems for both Distillery: Waveforming and BrightHearts for the purpose of preservation and representation.

Visually, the prototype BrightHearts app consists of 22 individually controlled graphic layers. Each layer comprises a single polygon that can be drawn as a solid shape or a ring; the edges of the shape can be blurred, and the colour can be varied according to hue, saturation, alpha and value (brightness). The layers are then blended using an ‘additive’ compositing process, so that the layers interact with one another: for example, a combination of overlapping red, green and blue shapes would produce white. This additive blending is a crucial aspect of the work’s visual aesthetic.
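Additive compositing can be sketched in a few lines of Python. This is a generic illustration of the technique, not the BrightHearts rendering code: the `(r, g, b, alpha)` tuple layout, the 0.0 to 1.0 channel range and the clamping step are assumptions made for the example.

```python
def blend_additive(layers):
    """Additively blend (r, g, b, alpha) layers with channels in 0.0-1.0.

    Each layer contributes its colour scaled by its alpha; the per-channel
    sums are clamped to 1.0, the brightest value a display can show.
    """
    totals = [0.0, 0.0, 0.0]
    for r, g, b, alpha in layers:
        for channel, value in enumerate((r, g, b)):
            totals[channel] += value * alpha
    return tuple(min(1.0, total) for total in totals)


RED = (1.0, 0.0, 0.0, 1.0)
GREEN = (0.0, 1.0, 0.0, 1.0)
BLUE = (0.0, 0.0, 1.0, 1.0)

# Overlapping red, green and blue shapes sum to white.
white = blend_additive([RED, GREEN, BLUE])

# Two half-transparent red layers add up to full red.
HALF_RED = (1.0, 0.0, 0.0, 0.5)
full_red = blend_additive([HALF_RED, HALF_RED])
```

A consequence of this approach is that stacking layers can only brighten the result, which is why overlapping regions tend towards white.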

While the visuals are rendered on the iPad by the app developed by Jason McDermott, using Xcode and the openFrameworks libraries, the actual moment-by-moment instructions regarding what shapes are to be drawn, and their colour, size, brightness etc., are all sent from the Max document.

The Max document, the top level ‘patch’ as it is referred to in the Max programming environment, is the heart of the work: the primary mediating layer between the sensor and display hardware that determines how changes in heart rate will control the appearance and sound of the work. It consists of an input section that receives sensor data, an analysis section that generates statistics from the heart rate measurements, and mapping layers that map these statistics to the various audio and graphic variables of colour, shape, volume etc.

The modular design methods used in the Max programming environment allow for the creation of modular units of code, referred to as ‘abstractions’ and ‘bpatchers’, that can be re-used in multiple instances to process many variables using a very simple set of instructions. The programming for Distillery: Waveforming makes extensive use of these modules, which are stored as discrete ‘.maxpat’ files within the Max folder on the Mac mini computer. These modules are used for many of the repetitive statistical processes used to analyse the participant’s heart rate, as well as the mappings used to create the highly layered visuals and sounds that are central to the aesthetic of Distillery: Waveforming.

In the analysis section of the programming, changes in average heart rate are calculated over different time frames: the average rate of the last four heart beats, the average rate of the last sixteen heart beats, then thirty-two heart beats and so on, as well as information about the direction of these changes, enabling the work to track when the participant’s heart rate is starting to increase or decrease.
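The multi-window averaging described above can be sketched as follows. This is an illustrative Python version, not a translation of the Max patch: the class name, the `analyse` helper and the convention of reporting direction as +1, 0 or -1 are assumptions, and only the 4, 16 and 32 beat windows named in the text are included.

```python
from collections import deque


class HeartRateWindow:
    """Running average of the last `size` heart rate values, with trend."""

    def __init__(self, size):
        self.rates = deque(maxlen=size)  # oldest value drops out automatically
        self.previous_average = None

    def update(self, rate_bpm):
        """Add one beat's rate; return (average, direction).

        direction is +1 if the average rose since the previous beat,
        -1 if it fell, and 0 if it is unchanged or this is the first beat.
        """
        self.rates.append(rate_bpm)
        average = sum(self.rates) / len(self.rates)
        if self.previous_average is None or average == self.previous_average:
            direction = 0
        else:
            direction = 1 if average > self.previous_average else -1
        self.previous_average = average
        return average, direction


WINDOW_SIZES = (4, 16, 32)
windows = {size: HeartRateWindow(size) for size in WINDOW_SIZES}


def analyse(rate_bpm):
    """Feed one beat to every window and collect the results by window size."""
    return {size: window.update(rate_bpm) for size, window in windows.items()}
```

Calling `analyse` once per detected beat yields a short-term and longer-term view of the same signal, so a brief rise in the four-beat average can be distinguished from a sustained change in the thirty-two-beat average.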

Within Max, the twenty-two graphic layers of visuals used in the prototype BrightHearts app are each controlled by a corresponding ‘bpatcher’ layer-control module. Each of these ‘bpatcher’ modules contains 107 variables that determine how the parameters of the layer are controlled: that is, which aspect of the participant’s heart rate patterning the layer responds to, and how these changes are mapped to variables such as the diameter and colour of the layer in question.

Each layer-control module is comprised of sixteen sub-modules responsible for specific aspects of each layer’s appearance such as diameter, hue, saturation, and shape-type. In the programming of the layer-control modules the boxes of numbers visible in each module describe how incoming data relating to heart rate is mapped to the behaviour of the layer, in this case its diameter, and what statistical information it will respond to such as a running average of the last thirty-two heart beats, a normalised and interpolated waveform representing breath-related variations in heart rate, or the pulse of each heartbeat (Fig 2).

Figure 2: Four of the twenty-two layer-control mapping modules in Max – used to control the shapes drawn on the iPad by the BrightHearts (prototype) app.

All of these variables, controlling the appearance of each layer, are stored and recalled using a table of preset values describing which statistics each layer and variable responds to and how it interprets this input. These numbers are adjusted by the artist to produce the desired mapping and dynamic range, and are then stored as a ‘preset file’ in JSON format (.json) that is read when the Max document is launched. These preset files document the precise mapping and scaling settings that determine the appearance and behaviour of each layer of the artwork. Together these layered behaviours and the preset values that describe them produce the final interactive visual aesthetic of the artwork.

Figure 3: Example of one section of the .json ‘preset’ file containing preset data that is read by each of the graphics mapping modules – in this example showing all the parameters used to control the behaviour of the diameter  for Layer 15.
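To show how a preset table of this kind can drive a mapping, the following Python sketch reads a small JSON document and applies a linear scaling with clamping. The JSON structure, key names and numeric values are hypothetical and do not reproduce the actual preset file; only the layer number, the diameter parameter and the modulation source name are taken from the text.

```python
import json

# Hypothetical preset fragment; the real .json preset layout is not shown here.
PRESET_TEXT = """
{
  "layer15": {
    "diameter": {
      "source": "/IBI/how-slow/32",
      "in_min": 0.0, "in_max": 1.0,
      "out_min": 40.0, "out_max": 400.0
    }
  }
}
"""


def map_value(mapping, source_value):
    """Scale source_value from the input range to the output range, clamped."""
    span = mapping["in_max"] - mapping["in_min"]
    position = (source_value - mapping["in_min"]) / span
    position = max(0.0, min(1.0, position))
    return mapping["out_min"] + position * (mapping["out_max"] - mapping["out_min"])


presets = json.loads(PRESET_TEXT)
diameter_mapping = presets["layer15"]["diameter"]

# A modulation value halfway through its range gives a mid-range diameter.
diameter_px = map_value(diameter_mapping, 0.5)
```

Keeping the mapping numbers in a data file rather than in code is what lets them be adjusted, stored and recalled as presets without changing the program itself.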


For the artist, these preset tables are of central importance for documenting the appearance and interactive behaviour of the artwork for future interpretations, since it is these values that determine how the work responds to changes in the participant’s heart rate.

Strategies for hardware independent migration and reinterpretation

Khut has begun the process of documenting and describing the interactive principles and behaviour of the artwork independent from current technologies to enable the work to be recreated in the future, based on the Variable Media Network approaches set out by Ippolito (2003a).

For creators working in ephemeral formats who want posterity to experience their work more directly than through second-hand documentation or anecdote, the variable media paradigm encourages creators to define their work independently from medium so that the work can be translated once its current medium is obsolete.
This requires creators to envision acceptable forms their work might take in new mediums, and to pass on guidelines for recasting work in a new form once the original has expired.

Variable Media Network, Definition – Ippolito, 2003b

For Khut, the essence of the artwork that would need to be preserved and recreated, independent of the specific technologies currently used, is the experience of having one’s breathing, nervous system, pulse and heart rate patterning represented in real time in an interactive audio visual experience, and the various optical and kinaesthetic sensations and correlations that are experienced during this interaction.

Taking an experience-centred approach, it is not the source code so much as the experience of the visuals and sounds changing in response to the live heart rate data that is most essential to recreating the artwork. The aesthetic experience of interacting with the artwork, and the manner in which it responds to changes in heart rate initiated through slow breathing and relaxation, are crucial to its authenticity.

The schematic approach: an open ‘score’ for reinterpretation

The simplest approach to documentation for future reinterpretation is the use of a very flexible set of instructions outlining the core interactive form and behaviour of the artwork. This approach leaves many aspects of the artwork’s appearance open to interpretation. Essentially what is preserved is the basic nature of the transformation – from breath, pulse and nervous system to colour, diameter, shape and sound. Such an approach would comprise the following instructions:

The visuals and sounds have been designed to respond to two forms of interaction:
1) gradual decreases in heart rate caused by a general increase in the participant’s ‘parasympathetic’ nervous system activity that can be initiated through conscious relaxation of muscles in the face, neck, shoulders and arms,

2) breath-related variations in heart rate known as ‘respiratory sinus arrhythmia’ whereby slow inhalation causes an increase in heart rate, and slow exhalation causes a decrease in heart rate.

The result being a wave-like (sine) oscillation in heart rate to which the work owes its name (wave forming).


| Features extracted from participant’s pulse and heart rate | Name of modulation source (controlling the sounds/visuals) | Visual representation on tablet surface | Sonic representation as heard through headphones |
| --- | --- | --- | --- |
| Pulsing heart beat | /beat/bang | Gently throbbing circular shapes that either contract subtly with each pulse, or darken slightly with each pulse, to create a visual effect of subtle pulsing. | A deep and soft throbbing noise that gets louder and brighter as heart rate increases, and softer as heart rate decreases. |
| Breath-related variations in heart rate, normalised and rescaled to emphasise slow, wave-like oscillations in heart rate that can be induced through recurrent slow breathing at around 6 breaths per minute. | /IBI/dev-mean/4/normalised | Ring-shaped layers that expand when heart rate is increasing, and contract when heart rate is decreasing. | Synthesized drone sound, modulated with a ‘phasor’ effect controlled by breath-related changes in heart rate. |
| Gradual changes in average heart rate (average of last 32 beats) mediated by changes in autonomic nervous system (stress/relaxation), neck, shoulder, arm muscle relaxation etc. | /IBI/how-slow/32 | Colour of background gradient: red for fastest heart rates recorded since start of session, green for medium, and blue for slowest average heart rate recorded since start of session. | Pitch of synthesized drone sound: crossfades through overlapping notes in C Melodic Minor scale, from B6 to A2. |
| Threshold points triggered by decreases in heart rate | /IBI/how-slow/32 (musical notes and burst of colour) | Circular, expanding bursts of colour from centre, fading out when they reach the edge of the frame. | Highly reverberant electric piano sounds triggered when threshold crossed, synchronised with burst of colour. Pitch descends in C Melodic Minor scale according to decrease in heart rate. |
| When participants sustain a slow relaxed breath pattern at around 6 breaths per minute, frequency-domain analysis of heart rate variability will report the appearance of a ‘resonant peak’ around 0.1Hz (6 breaths per minute). There are six thresholds: 25, 30, 35, 40, 45, 50. Each time one of these thresholds is crossed, a message is generated that is used to control an audio and visual event. | /spectrum/resonant-peak-resonance | A large, soft-edged blue ring expands slowly out beyond the edges of the frame and then slowly fades away. Threshold 25 = yellow; 30 = yellow-green; 35 = green-yellow; 40 = green; 45 = cyan; 50 = indigo. | Very soft, muted and heavily reverberated piano note, with slow decay. Threshold 25 = D#3; 30 = A#3; 35 = D#4; 40 = F4; 45 = G4; 50 = A#4. |

Table 1: Relationship of key mappings in the heart rate controlled artwork Distillery: Waveforming.

Table 1 lists the key heart rate variables and their mapping to the main visual and sonic representations. The most basic recreation of the work according to the scheme laid out in this table would still require instructions for obtaining and generating the modulation sources from the heart rate data: the algorithms that scale and interpolate the heart rate data and translate these beat-by-beat messages into smooth, continuous control signals.
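As a rough indication of what such instructions might cover, the Python sketch below shows three generic steps: scaling an average heart rate against the range recorded so far in the session, easing a control signal towards its target, and detecting threshold crossings. It is a sketch under stated assumptions rather than the artist's algorithms: the class and function names, the linear scaling, the easing fraction and the choice to fire events only on upward crossings are all hypothetical. The threshold colours and notes are those listed in Table 1.

```python
class HowSlow:
    """Scale an average heart rate to 0.0 (fastest so far) .. 1.0 (slowest so far)."""

    def __init__(self):
        self.fastest = None
        self.slowest = None

    def update(self, average_bpm):
        if self.fastest is None:
            self.fastest = self.slowest = average_bpm
        self.fastest = max(self.fastest, average_bpm)
        self.slowest = min(self.slowest, average_bpm)
        if self.fastest == self.slowest:
            return 0.0  # no range observed yet in this session
        return (self.fastest - average_bpm) / (self.fastest - self.slowest)


def ease_towards(current, target, fraction=0.1):
    """Move a control signal part of the way to its target on each frame."""
    return current + (target - current) * fraction


# Colours and notes as listed in Table 1 for the resonant-peak thresholds.
RESONANCE_EVENTS = {
    25: ("yellow", "D#3"),
    30: ("yellow-green", "A#3"),
    35: ("green-yellow", "D#4"),
    40: ("green", "F4"),
    45: ("cyan", "G4"),
    50: ("indigo", "A#4"),
}


def crossed_thresholds(previous, current):
    """Return (threshold, colour, note) for each threshold crossed upwards."""
    return [
        (level, *RESONANCE_EVENTS[level])
        for level in sorted(RESONANCE_EVENTS)
        if previous < level <= current
    ]
```

Applying `ease_towards` once per display frame turns the stepwise beat-by-beat values into a gradually changing signal, which is the kind of smoothing the paragraph above refers to.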

The translation approach: calibration tools and resources

A second, more precise approach for reinterpretation provides a set of documents to help future developers interpret and translate the original code and .json preset data to provide an aesthetic experience more closely aligned to the artwork at the time it was acquired by QAGOMA (Fig 3). This information is contained in a set of calibration images and accompanying tables that provide a crucial link for reinterpretations of the artwork, allowing future programmers to determine how values stored in the original preset files relate to the appearance of each of the work’s 22 graphic layers. Many aspects of the prototype BrightHearts app’s interpretation of these messages are not linear in their response, and it can be seen that the gradients for each shape blend differently according to hue (Fig 4). It is hoped that these calibration images will help future programmers to compare how their own code interprets the messages stored in the preset files against the behaviour and appearance of the original prototype BrightHearts iPad app.

Figure 4: Example of one of the calibration images and accompanying tables describing how the messages from the Max software are interpreted by the visualisation software of the BrightHearts (prototype) App on the iPad.
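One way a future programmer might use such calibration tables is to interpolate between recorded points, so that a new renderer reproduces the original app's non-linear response. The Python sketch below assumes a table of (message value, observed diameter) pairs; the numbers are invented for illustration and are not taken from the calibration documents, and piecewise-linear interpolation is only one possible choice.

```python
from bisect import bisect_right

# Hypothetical calibration points: (message value, observed diameter in pixels).
CALIBRATION = [(0.0, 0.0), (0.25, 40.0), (0.5, 120.0), (1.0, 400.0)]


def observed_appearance(message_value, table=CALIBRATION):
    """Estimate the original app's output for a message value.

    Interpolates linearly between the two nearest calibration points and
    holds the end values for inputs outside the calibrated range.
    """
    xs = [x for x, _ in table]
    if message_value <= xs[0]:
        return table[0][1]
    if message_value >= xs[-1]:
        return table[-1][1]
    upper = bisect_right(xs, message_value)
    x0, y0 = table[upper - 1]
    x1, y1 = table[upper]
    return y0 + (y1 - y0) * (message_value - x0) / (x1 - x0)
```

Comparing the output of a new renderer against values estimated this way would show where its interpretation of a preset diverges from the documented behaviour.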


Summary of documentation strategy for future translation

| Documentation element | Description |
| --- | --- |
| Broad schematic mapping of real time heart rate statistics to sounds and visuals | Describes the basic interaction concept and interaction experience: images and sounds controlled by slow changes in heart rate that can be influenced through slow breathing and relaxation/excitement. |
| Experiential aims and conditions for interaction | Describes the environmental conditions prescribed by the artist to ensure optimum conditions for interaction, i.e. minimise audio-visual distractions. |
| Documentation of Max patch | Annotations in each section of Max code (the subsections of the main file: ‘subpatches’, ‘abstractions’ and ‘bpatchers’) describe the flow of information through each section. Each section is documented as a numbered image file, accompanied by notes describing how information is being modified/transformed. Sections: Heart Rate Analysis; Visuals (misc. top-level controls, i.e. manage storage and retrieval of preset data, transition to ‘live’ visuals, control overall size, hue, position etc.). |
| Annotated table of preset values describing the mapping of heart rate information to the behaviour of the visuals, extracted from .json presets | Describes how each layer responds to the various heart rate statistics, and the quality of response over time (i.e. ‘easing’, non-linear scaling etc.). |
| iPad visuals: annotated calibration images and tables | Indicate how the visuals should look given specific layer-control messages, i.e. diameter, hue, alpha etc., describing the idiosyncrasies of the visualisation code. |

Table 2 – George Khut’s documentation strategy for “Distillery: Waveforming”
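
The kind of mapping that the annotated preset table documents can be illustrated with a short, hypothetical sketch in Python. It shows one heart rate value being scaled non-linearly into a layer-control value (for example a diameter) and then eased towards that target over successive updates. The field names, the value ranges and the use of a power curve are assumptions made purely for illustration; they are not taken from the BrightHearts presets, the Max patch or the calibration tables.

```python
from dataclasses import dataclass


@dataclass
class LayerMapping:
    """Hypothetical mapping for one graphic layer (illustrative names only)."""
    in_min: float    # heart rate (bpm) that maps to out_min
    in_max: float    # heart rate (bpm) that maps to out_max
    out_min: float   # layer-control value at in_min, e.g. a diameter
    out_max: float   # layer-control value at in_max
    exponent: float  # 1.0 is linear; other values bend the response curve
    easing: float    # 0..1, share of the remaining distance covered per update


def scale(mapping: LayerMapping, heart_rate: float) -> float:
    """Map a heart rate onto the layer's output range with a power curve."""
    t = (heart_rate - mapping.in_min) / (mapping.in_max - mapping.in_min)
    t = min(1.0, max(0.0, t))  # clamp readings outside the input range
    return mapping.out_min + (t ** mapping.exponent) * (mapping.out_max - mapping.out_min)


def ease(current: float, target: float, easing: float) -> float:
    """Move the displayed value part of the way towards its target."""
    return current + easing * (target - current)


# Assumed example values, not values from the artwork's preset files.
mapping = LayerMapping(in_min=50.0, in_max=100.0, out_min=0.0, out_max=1.0,
                       exponent=2.0, easing=0.5)
target = scale(mapping, 75.0)
displayed = ease(0.0, target, mapping.easing)
```

A future programmer could compare the output of a function such as scale against the values recorded in the calibration tables to check whether a reinterpretation responds in the same non-linear way as the prototype.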



For Khut, Distillery: Waveforming is foremost an experiential artwork and therefore his ideas about documentation focus on capturing its functionality and the aesthetics of the interaction. Khut sees the fundamental element of Distillery: Waveforming to be something other than the source code and the technical hardware: namely the mappings between breath and relaxation-mediated changes in heart rate and the appearance of the sounds and visuals, and how these mappings give form to the subject’s experience of interactions between their breath, heart rate and autonomic nervous system.

The modular Max patch programming and the presets in the .json file form, for the artist, the compositional heart of Distillery: Waveforming. This programming draws the visuals in response to the real time heart rate data: the key to the artwork. By further documenting the interactive principles independent from the current technology, drawing on approaches proposed in the Variable Media Questionnaire, Khut has developed reference documents that allow for the translation of the original preset data and calibration for future interpretations of the visualisation software. In this way Khut can describe the artwork with greater clarity in non-vernacular terms, opening up opportunities for the artwork to be recreated in alternate modes.

As an artwork in the QAGOMA collection, Distillery: Waveforming sets a precedent as the first prototype-artwork to be acquired. Technology-based digital artworks are prone to being superseded at a rapid pace and attempting to manage even the medium-term future for such artworks is perplexing. To gain the assistance of the artist at the time of acquisition is constructive and very beneficial, but to secure the commitment of the artist to engage in collaborative, long-term conservation strategies is extraordinary and this has resulted in the Gallery acquiring an unparalleled archival resource (Pagliarino 2015, p. 74). Although the Gallery maintains an interest and intention to preserve Distillery: Waveforming in its original developmental state, providing clear evidence of Khut’s ‘iterative’, evolving art practice, the archival resource provides scope to reinterpret the artwork at some point in the future when the original technology no longer functions as intended.

Through this process of defining the experience, the artist and the institution have collaboratively addressed their common and divergent interests in the future care of Distillery: Waveforming. These differing views have created an opportunity to better understand the artwork and its position as an asset within a state collection and a physical, historical link to an ongoing, evolving artistic practice. Khut’s continued interest in the preservation of Distillery: Waveforming and his participation in dialogues about this artwork and other iterations of the biofeedback project have provided the Gallery with an extraordinary reference and flexibility to manage and display the artwork long into the future.



Amanda PAGLIARINO is Head of Conservation at the Queensland Art Gallery / Gallery of Modern Art, Brisbane.  Since 2003 she has worked on the conservation of audiovisual and electronic artworks in the Gallery’s collection. Amanda received a Bachelor of Visual Arts from the Queensland University of Technology in 1991 and a Bachelor of Applied Science, Conservation of Cultural Material from the University of Canberra in 1995.

George Poonkhin Khut is an artist and interaction-designer working across the fields of electronic art, interaction design and arts-in-health. He lectures in art and interaction design at UNSW Art & Design (University of New South Wales, Faculty of Art  & Design) in Sydney, Australia. Khut’s body-focussed interactive and participatory artworks use bio-sensing technologies to re-frame experiences of embodiment, health and subjectivity. In addition to presenting his works in galleries and museums, George has been developing interactive and participatory art with exhibitions and research projects in hospitals, starting with “The Heart Library Project” at St. Vincent’s Public Hospital in 2009, and more recently with the “BrightHearts” research project – a collaboration with Dr Angie Morrow, Staff Specialist in Brain Injury at The Children’s Hospital at Westmead, Kids Rehab, that is evaluating the efficacy of his interactive artworks as tools for helping to reduce the pain and anxiety experienced by children during painful and anxiety-provoking procedures.


1970s Disaster Films: The Star In Jeopardy  –  Nathan Smith

Abstract: In this article, I marry star studies to haptic theory in order to explore the complex meanings of space and stardom in 1970s disaster films. I use The Poseidon Adventure [1972] as my case study, a film often cited as one best epitomising the genre. I examine the way The Poseidon Adventure uses the physical space of the ship at the centre of the film to heighten the chaos of the sinking ship, and mediate the way we experience the (old and new) stars on screen. I consider how in disaster films we see great Hollywood stars battered, bruised, and beaten on screen and argue this allegorically signals a generic and cultural transition in American cinema, with old Hollywood film practices shed in favour of the politics and energies of New Hollywood. This paper offers insight into the underlying star politics of 1970s disaster films, which are often mediated only through the spectacle they provide audiences.

Figure 1. An original advertisement for The Towering Inferno where the two male stars are separated by the fiery burning tower in the centre, while their co-stars line the bottom of the poster.

One of the defining film genres of the 1970s was the disaster film. Depicting scenes of mass carnage, the disaster film came to prominence in the 1970s in a wave of films staging shipwrecks and airplane crashes, joined by a band of new and old Hollywood stars facing these dangers on screen (Keane 2001 2). Critically and commercially, the most successful disaster films of the 1970s were Airport (1970), The Poseidon Adventure (1972), and The Towering Inferno (1974), all of which led to the popularity of other films or franchises including Earthquake (1974), Airport 1975 (1974), Airport ’77 (1977), and When Time Ran Out (1980) (ibid. 3). While critical scholarship on the genre has mostly centred on how disaster films were widely successful and critically lauded because of the way they allegorised America’s political, social, and economic plights on screen, many of these films actually instantiate other cinematic concerns. There is a dearth of scholarship that moves beyond the commercial and political currency of these disaster films.

Although the spectacles themselves in these disaster films are central, this essay assesses the way the disaster film utilises space on screen, in an attempt to legitimise their importance as a cinematic genre. Disaster films actively integrate discourses of cinema as travel experience and engage with the politics of the star system to offer complex commentaries on cinema’s haptic meaning and the Hollywood star culture. I argue for the importance of combining the cultural and semiotic language of the star system with haptic theory in order to demonstrate the potential of extending the meanings of disaster films. It is only by marrying the two seemingly unrelated discourses that we can garner the more complex tensions that motivate so many audiences to see and engage with disaster films.

Stars in the 1970s disaster film have an allegorical and cultural function in the literal spaces contained within the text. Films like The Poseidon Adventure and The Towering Inferno embody allegorical meanings that comment on the death of the Hollywood star system and the rise of “New Hollywood” cinema (Roddick 1980 248).[i] I argue the disaster film engages in discourses on travelling and architecture in an effort to not only heighten the spectacle of its disasters, but also to create its own reference points about cinema and physical space on screen, in particular spaces for travelling (the ship, the plane, and airports). By reconsidering the haptic and star meanings of the disaster genre – examined centrally through The Poseidon Adventure here – the affects and emotional meanings of cinema and architecture can be understood away from the ocular-centricism that has dominated critical discourse on cinema for decades (Jay 1988 310).

Cinema has long been cloaked in its own aesthetic, critical, and discursive aura that has privileged the visual experience over the haptic pleasure it provides viewers (see Mulvey 1975; Sontag 1977; Deleuze 1983; Jay 1988; Thomas 2001). In recent years there has been a growing scholarly and artistic (recuperative) interest in resurrecting the intersections cinema makes with architecture. While these intersections have a longer tradition than the twenty-first century, it is only in the last decade that we have seen more explicit interventions – and marriages – between these two discourses. In Atlas of Emotion (2002), American scholar Giuliana Bruno constructs an “atlas” of philosophical, affective, and psycho-geographic responses to art, cinema, and architecture, coalescing all in an attempt to demonstrate how “site” (physical space) and “sight” (visual experience) are inherently married to each other. Likewise, in The Architecture of the Image (2008), film scholar Juhani Pallasmaa examines how architecture – like cinema – works around ideas of time, space, and movement, and similar to Bruno, attempts to collapse the distinction between “sites” and “sights” in order to emphasise the potency of haptic embodiment.

These recent works highlight the emerging discourse of haptics and cinema, privileging the physical responses to architecture and cinema alike while de-emphasising the long-standing ocular approach to the world of film. Pallasmaa makes the point that every film contains an image of an architectural space (2008 4). From this, we can see an inherent relationship begin to operate between the two (ibid. 5). Whether it is a building, a room, or even specifically Central Park in New York City, film engages in the poetics and politics of architecture: they define space, grid and demarcate a physical area, while centring narrative meanings and affects around this site.[ii] Indeed disaster films – which are addressed in more depth later – are sound examples to demonstrate the potency of haptics – they are characterised by physical movement on screen. Whether it is a moving plane (the Airport series), a sinking ship (The Poseidon Adventure), or the chaotic physical destruction of Los Angeles (in Earthquake), they offer viewers a type of journey experience.

Many of these writers on haptics draw on Walter Benjamin’s seminal essay “The Work of Art in the Age of Mechanical Reproduction” (1936) in which Benjamin considers the notions of authenticity and originality in art. In his essay, Benjamin writes on the connections between cinema and architecture, arguing that both are “tactile arts” (1936 225). While cinema seemingly stresses its visuality and architecture emphasises its physicality, both cinema and architectural space generate and provide kinesthetic experiences to viewers and participants. Indeed vision is just as important to haptics as haptics is to film. As Kester Tauttenbury writes,

Architecture exists, like cinema, in the dimension of time and movement. One conceives and reads a building in terms of sequences. To erect a building is to predict and seek effects of contrast and linkage through which one passes . . . In the continuous shot/sequence that a building is, the architect works with cuts and edits, framings and openings (1994 35).

By re-evaluating architecture in cinematic terms – and vice-versa by re-evaluating cinema in architectural terms – the two artforms imply a kinesthetic way of experiencing space. This intersection functions importantly in The Poseidon Adventure by problematising the architecture of stairs. Poseidon is concerned with the idea of “Hell Upside Down”, or an overturned cruise ship that forces its passengers to climb “up” to the bottom of the ship in order to escape. Staircases are an excellent example of how an architectural site has been utilised repeatedly by cinema to mark meaning, divide spaces, and represent political and familial hierarchies. For Peter Wollen, “The staircase is the symbolic spine of the house” (1996 15). Here is an instance where architecture and cinema unite. The staircases are places of movement and transition that are directly experienced by the body. The stairs in cinema stage political undertones of the private/public and unseen/seen dichotomy: either one can exit to a private space or enter a social public place through a set of stairs.

The Poseidon Adventure is driven by an intersection of architectural space and cinema. The film is about the “S.S. Poseidon” cruise ship making its last journey from New York City to Athens. On New Year’s Eve, however, the ship is overturned by an enormous tsunami caused by an undersea earthquake (Keane 2006 72). The film is a metaphor for this cultural and cinematic conflation with architecture and stardom. Symbolically – because the ship is overturned – the notion of this staircase is inverted in Poseidon. Hence the characters do not “climb up” the ship but “climb downwards” to the ship’s hull.

A major turning point in the narrative is when one of the surviving co-captains tells all the passengers to stay where they are, in the central dining room, since they are closer to the top deck of the ship. Reverend Scott (Gene Hackman) pleads with them not to listen to the co-captain and instead tells them that they must go “up” (down) to reach the ship’s hull, since any attempt to escape needs to be made closer to the water’s surface. Although allowing a means of escape for a select few, this inversion figuratively reverses the hierarchies of title and power on the ship and places a small group of the passengers in charge of the ship, as the co-captain drowns (and, by extension, allows others to drown) in the sinking vessel. I would push this analogy further, arguing that this inversion demonstrates the allegory of a transition from Old Hollywood values to those of New Hollywood. Given that it is the younger Gene Hackman – the then star of the drug-themed crime film The French Connection – who leads the fight for survival (leaving the mostly-older ship-goers to drown), the allegorical meaning is rich. Indeed this moment demonstrates how Poseidon privileges its younger stars and begins to kill off its older stars, as seen frequently in other 1970s disaster films (see Dyer 1975, 1979).

The “guiding” cinematic experience of these disaster films – in particular Poseidon – is the way they exploit cinema as a travel experience. In Atlas of Emotion (2002), Bruno writes that “film is affected by a real travel bug” and that the “film ‘viewer’ is a practitioner of viewing space – a tourist” (76, 61). Bruno asserts cinema has always been preoccupied with the travel experience, signalling how early cinema itself was composed of narratives of travelling to the moon (Georges Méliès’s A Trip to the Moon 1902), to outer space (Georges Méliès’s The Impossible Voyage 1904), or train travel (Lumière Brothers’ The Arrival of the Mail Train 1896). This ongoing preoccupation with not only capturing travel movement (trains, travel) onto the moving image itself but also with providing viewers with the experience of “statically travelling” is what informs the terms of reference of the disaster genre (Bruno 2002 7). Given the socio-cultural climate these films were made in – with the rising popularity of the transgressive styles of New Hollywood and their interest in new filmmaking techniques and strategies – the disaster film drew on these early cinematic discourses on travel as a means to rework their cinematic and cultural meanings. Here they are reframed in light of the emerging 1970s counterculture and anti-establishment politics (Wood 2003). By this, the template of cinema as traveling/traveling as cinema began to be manipulated and exploited in texts like Poseidon and the Airport series through the presence of old Hollywood stars alongside new ones.

The poetics embedded within the moving space (ship, aeroplane, blimp, bus, rollercoasters, all of which were vehicles utilised in one disaster film or another) are what serve as the central cultural, political, and cinematic meanings that the disaster film seeks to problematise. As Bruno notes, “There is a mobile dynamics involved in the act of viewing films, even if the spectator is seemingly static. The (im)mobile spectator moves across an imaginary path, traversing multiple sites and times” (2002 55). What Bruno addresses is the way film negotiates physical spaces on screen that viewers can vicariously experience, while also emphasising the staticity and immobility of the viewer in their seat all at once. If we consider Poseidon – often cited as the cinematic epitome of the disaster film – the captain calls the ship in the film “a hotel with a stern and bow stuck on each end” (Roddick 1980 246). The emphasis is on the ship as a site/sight of luxury, relaxation, and pleasure. The irony in this comment, as we see later, is that it is the fact that the Poseidon is not a hotel that causes it to sink – it is a ship. This comment literalises the paradox Bruno addresses: the hotel does not move; a ship does. This series of meanings evokes the cultural and connotative meaning that representing travel on screen embodies in the disaster genre.

The moving space – whether it is encaged within the narrative of The Poseidon Adventure or in the moving airplane in the Airport series – represents an intervention in the spatial and haptic experience of cinema. In terms of narrative, the site/sight of the exploded airplane, the sinking ship, or the burning hotel makes the spatial environment contained within the film dangerous, claustrophobic, and unsafe. Bruno observes, “Film inherits the possibility of such a spectatorial voyage from the architectural field, for the person who wanders through a building or site also absorbs and connects visual spaces … In this sense, the consumer of architectural (viewing) space is the prototype of the film spectator” (2002 55). Bruno’s dialogue between an architectural space and the physical space controlled by the film instantiates many of the same haptic experiences, and her re-prioritisation of the affective (as opposed to the visual) qualities of cinema is what I utilise in my analysis here. The disaster film itself – in this case, Poseidon – although preoccupied with spectacle, equally engages with problematising the relationship between physical space on screen and the cultural status and capital of its old Hollywood stars.

What makes the disaster film so palpable as a cinematic experience is the way the old Hollywood stars of the film are battered, bruised, and ultimately killed off (see Dixon 1999, Feil 2005). Richard Dyer writes that while a star is a reflection of the dominant social and political ideologies, they also are symptomatic of the “fissures” in these hegemonic ideologies and have the potential to be read in profoundly different ways (1979 3). I argue two central points on stars within the disaster genre: first, the death of the stars in the disaster genre allegorically comments on the decline of the Hollywood studio system which had dominated American film-making since the 1930s; and second, the stars act as stand-ins for the audiences to vicariously experience the wreck and ruin wrought by the destruction of the travel experiences. Bruno argues this when she writes that we must move from “optical to haptic” (2002 6). This approach figures importantly in my reconsideration of the disaster film. My contention is that Poseidon kills off its former Hollywood greats as a figurative attempt to signal the death of the Hollywood studio and star system, and legitimise the emerging tides in Hollywood cinema.

Therefore, the cultural, cinematic, and semiotic meaning of the star functions importantly in the 1970s disaster film. The Poseidon Adventure and The Towering Inferno are major films that demonstrate the intersection of spatial destruction and the harm this carnage wreaks on its Hollywood stars. The original advertisements for both films highlight how the stars are the central identifying features for mediating the experience of the film. The posters also imply that the internal disasters within the narrative threaten the stability and indeed safety of these stars within the confines of the story. On a poster for The Towering Inferno (Figure 1), the star image of the actors is central, with Steve McQueen and Paul Newman as the male leads battling not only each other (and their masculine bravado) but also attempting to control the blazing skyscraper. The poster demonstrates the intersection of destroying the stars through floods, fire, chaos, and death. This advertisement also has a row of other Hollywood stars that appear in the film – such as William Holden alongside Faye Dunaway – and although these are ostensibly the supporting cast of the film, they too are presumably encaged within the wrath of the fiery tower. The original movie poster for Poseidon (Figure 2) evokes a sense of claustrophobia with its stars bordering a drawing of waves of water overflowing a ballroom. Called “Hell, Upside Down”, the graphic is engulfed with seawaters. The poster encourages viewers to believe that not all the stars of the film will be saved by the narrative’s end. These advertising meta-texts embody the important formation these films bridge between the collapsing of space and the destruction of the star.

To demonstrate this transition, I examine Shelley Winters’ role in Poseidon. The most striking physical feature of Winters’ performance as Mrs Rosen is her overweight and bulging body. Winters purposely gained weight for the role and insisted that she do all her own stunts (Keane 2006 76). A successful supporting actress throughout the 1940s and 1950s, Winters’ re-emergence in this film allegorically aligns itself with the transition of Old Hollywood to New Hollywood.[iii] Drawn to more countercultural subjects and built around less bureaucratic institutions than the film studios, New Hollywood sought to dislodge the institutionalisation of film within the rigid studio structure (Berliner 2010 62). Winters, who embodies these cultural, star, and cinematic ties to the studio system, is forced in Poseidon to undergo a gruelling and exhausting escape that she is truly not capable of. Symbolically, her bulging and overweight body is suggestive of the hedonism and profits of an Old Hollywood that is no longer able to meet the challenges of the cinema of the 1970s. Winters’ Mrs Rosen is a figure of comic relief, as the carnivalesque quality of her squatting, falling, and struggling indexes her weight and the fact that this obese figure is the great Shelley Winters – or “that fat old cow” as another character calls her. Therefore, Winters – like Jennifer Jones’s character in The Towering Inferno – metonymically acts as a figure of the old studio system.[iv] Both Jones and Winters not only signal the star system but also are utilised by both films to heighten the danger, destruction, and claustrophobic nature of these restrictive and deathly spaces on screen (Yacowar 1977).

The disaster film “must be consider[ed] as a single group” and as part of “a long tradition of screen catastrophe”, fitting in with traditions of cinema representing spectacles on screen to heighten their haptic value (Roddick 1980 244). I argue disaster films declined in the late 1970s and early 1980s because they became an overused template that could not meet the demands of the changing cultural landscape. However, the template resurfaced in other generic arenas such as sci-fi and action (The Terminator [1984], Die Hard [1988]), while being altered in the 1990s with fewer stars (Twister [1996]), or simply remade in the 2000s (Poseidon [2006]). Unlike the original 1970s disaster films, however, these later generic templates are less concerned with an ensemble of stars and are instead driven more by special effects (Keane 2006 101). As cultural demands changed, the later films with disaster as a main narrative and thematic pull did not need to alter the star status or star value of their actors, unlike the 1970s films, in which the stars were the reference points to be exploited and parodied, battered and bruised on screen.

In this way, the disaster films of the 1970s instantiate many more complex and important cultural and cinematic meanings than existing scholarship suggests. Indeed, while the literature on the disaster film has mostly considered the financial, aesthetic, or commercial value of these films, this essay has privileged the star iconography and haptic meanings of these films to revise their cultural and cinematic value. Although this essay has concentrated only on The Poseidon Adventure, this film nevertheless embodies many of the tropes of the disaster film, having been cited as the “epitome of the genre” (Roddick 1980 246). The disaster film marries the concerns of architecture with the haptic meanings of cinema, utilising space as a means to comment on the status of stars in the 1970s cultural milieu. Given the rise of scholarship examining the haptic experience of cinema, reconsidering bodies of cinema that have only been considered for their visual value has immense importance in understanding and legitimising some of the more meaningful concerns these films embody. This essay has explored the growing scholarship considering the physical responses cinema can provide spectators while also revising the dominant interpretations of the disaster film, arguing that The Poseidon Adventure in particular draws on histories of film as a travel experience. Ultimately, in reprioritising the spatial and star elements in films like The Poseidon Adventure, the haptic and affective experiences can be more palpably felt.




When a Good Girl Goes to War: Claire Adams Mackinnon and Her Service During World War I – Heather L. Robinson


Claire Adams Mackinnon and her contributions to the war effort 100 years ago are largely forgotten. The product of two Canadian military families, she put aside her burgeoning film career when war broke out to train and work as a nurse before returning to the silent screen. This article examines available evidence to reconstruct this period of her career (1914-1919), encompassing her nursing experience, her role in a fundraising drive for the US Red Cross, and her starring role in a government sexual hygiene campaign, which ignited one of the first censorship storms of the early film industry. I will argue that the choices Adams made reflect not only a determination to aid the war effort, but also place her at the vanguard of the women’s movement during the volatile years of the mid-1910s. This successful actress, who spent the second half of her life in Melbourne and regional Victoria, has largely been forgotten by film history. However, by placing this early period of her work within a firm historical context, one dominated by the fight for women’s suffrage, sex education and the First World War, we gain an appreciation of her significance within the early film industry and the origins of her ongoing community service.



For anyone with a passing interest in silent film history, Claire Adams Mackinnon is an intriguing transnational figure. A British subject for much of her life, she was born in Canada to an English father in 1896. A singer and an actress, on both stage and screen, she was a child performer using her real name “Beryl” Adams before gaining success as a motion picture actress in New York. As a teenager in early Thomas Edison productions she was known first as “Clara” then “Peggy” Adams before publicly assuming her family pet name “Claire” in 1918. In 1920 Adams moved to Los Angeles and enjoyed an eight-year Hollywood career. In 1931 she became an American citizen prior to spending the last forty years of her life as an Australian.[1] Today, however, she remains largely unknown, despite appearing in some of the silent era’s most influential, popular and profitable films alongside some of the best-known artists and producers of the era.

If Adams is remembered at all it is within select circles in Victoria, Australia. Following the death of her first husband, Adams enjoyed a whirlwind courtship and marriage to Donald Scobie Mackinnon, the scion of a Melbourne legal and horseracing family. Together they reinvented the Western District homestead Mooramong, creating a jazz age delight redolent of the Hollywood Hills lifestyle she had left behind. Now in the hands of the National Trust of Australia (Victoria), Mooramong remains defiantly anachronistic in a district better known for its bluestone piles and weatherboard restraint, a monument to a much-loved couple reflecting their sophisticated tastes and cosmopolitan life experiences. Mrs Mackinnon’s career is reduced to a prelude to her second marriage, like a folly of youth indulged before her adult life began.

Figure 1: Mooramong homestead, Skipton, Victoria. Image used courtesy of National Trust of Australia (Victoria).

An intensely private person throughout her life, Adams was enigmatic from her earliest interviews and has left little evidence to enable a simple or personal interpretation of her career.[2] Unlike many of her more famous contemporaries, she was not a movie star created by a studio publicity department: she was an experienced working actress, appearing across genres for a number of companies and eschewing the celebrity lifestyle. This independent approach resulted in a comparatively low public profile.[3] Nor was Adams in Hollywood when the studios began to construct their historical legacies. As such, she appears infrequently as a footnote in popular and scholarly accounts of the silent era, including the one written by her first husband, Benjamin B Hampton.[4] Eminent film scholars welcomed Hampton’s text as an authoritative account of the history of the early film industry.[5] If even her husband did not think she warranted more than an image caption, one could easily conclude that Adams was not terribly significant. However, as Lesley Speed suggests, Adams’s career is significant not only for the continuity of her workload, but for the time in which she was active, “spanning a period of major changes in the American film industry, including the transition from short films to feature-length productions, the industry’s relocation from the East coast to Los Angeles, the establishment of the star system and the formation of vertically-integrated major studios” (Speed 2015, 4).

Like the majority of women who dominated the ranks of the early motion picture industry, Adams has fallen through the gaps of film history’s indeterminacy.[6] However, as more scholars analyse the period and primary sources become more accessible, we are able to reconstruct a more accurate representation of the industry, the individuals and the era during which Adams was active, and the picture that emerges is one dominated by the women’s movement. Women made up at least 83% of the cinema audience at a time “when women’s voices were particularly valued” (Stamp 2012, 6). Women’s history scholars note that from the late 1910s “young women could not help but be influenced by the currents of the age” (Banner 1974, 151). As a young woman in New York City during the First World War, Adams was surrounded by the opportunities, debates and challenges confronting the first generation of young women free to pursue a career as a choice rather than a necessity. She was amongst the target audience for orators and activists such as Margaret Sanger promoting the birth control movement and key figures campaigning for Women’s Suffrage, which would not pass into legislation until 1920. We have no evidence that Claire Adams identified herself as an early feminist. However, through her career choices, she reflects the values and ideologies synonymous with the movement, a cultural catalyst which grew in parallel with the development of the motion picture industry.

By mainly focussing on two of her early feature films, this article will demonstrate that Claire Adams deserves recognition not only as a well-respected, adventurous and greatly admired actress of her day, but also as a young woman navigating the perils and opportunities opening up for women in the early twentieth century. She is also a representative of the forgotten contributions of countless women working in early cinema. I will examine her work of one hundred years ago, when she stepped away from Edison Studios and the early motion picture industry to train as a nurse, only to return to motion pictures in support of the troops heading off to the battlefields of Europe. By placing these early works within a firm historical context we will gain a more complete understanding of the scale and intersections of her careers, as both nurse and actress, her role at the vanguard of contentious social issues of the mid 1910s and her active participation in landmark moments of the early American film industry.

Real Life Drama

According to the Australian Dictionary of Biography, Claire Adams worked as a nurse for the Red Cross during World War I (Maxwell, 2000). This description conjures evocative images of mud and blood, battlefields and field hospitals. What drove an aspiring young actress away from the movie studio and into a hospital theatre? What data is available to determine how much official nursing she contributed to the war effort one hundred years ago?

Figure 2:  A young ‘Clara’ Adams (centre) with co-stars from left: Marion Weeks, Yale Boss, Elsie McLeod and Gertrude McCoy in The Office Boy’s Birthday (Charles M. Seay, 1913). From Kinetogram, 1 January 1913. Source: Seaver Centre for Western History Research, Natural History Museum of Los Angeles County.

In 1912, Adams began performing for Thomas Edison’s film production company under the stage name “Clara” Adams and appeared in 17 motion picture shorts in two years (Internet Movie Database, 2015). At the outbreak of war in 1914, she was almost 18 years of age. Both her grandfathers had served in the British military in Canada and India before entering business and public service (Letourneau 1982; Manitoba Historical Society 2009). This family tradition and tales of exploits on the battlefield inspired Adams and her brother Gerald as children (Will 1978, 30). As young adults, they were determined to emulate their forebears and sought opportunities to volunteer for the war effort.[7] Together they had grown up in an environment that was conservative and under the influence of Anglo-Victorian social strictures of duty and community service. These strictures included a strong sense of volunteerism, which saw upper- and middle-class young people from across the British Empire come forward to support their King and Country (Quiney 1998, 190). Adams set aside acting to make an active contribution to the war effort.

Like thousands of young women across the British Empire, Adams applied to join the Voluntary Aid Detachment or V.A.D service, a corps of nurses and ambulance drivers trained and managed for the Red Cross by St John’s Ambulance. However, owing to her youth and inexperience, she was turned away (“Claire Adams, Starring in ‘The End of the Road’ Is Real Western Canada Girl”, 1920). She could not change her age but was able to address her skillset. Adams enrolled at Detroit’s Grace Hospital Training School for Nurses, one of the most respected institutions of its kind in North America (“Claire Adams, Starring in ‘The End of the Road’ Is Real Western Canada Girl”, 1920). Grace Hospital was just over the border from Canada on the shores of Lake St Clair, close to Toronto where she was living with her father Stanley Adams (Sweet & Stephens transcript). By 12 June 1915, Adams was listed on a ferry passenger manifest as an eighteen-year-old student nurse travelling between the two cities (, 2009).

Unfortunately the enrolment records for the Grace Hospital Training School for Nurses from this time period do not survive, but the institution’s prospectus does, specifying a strict and rigorous lifestyle for their live-in student nurses.[8] This training would equip the young ladies to assist senior staff with both male and female patients in the hospital’s surgery, children’s ward and the obstetrics wing. Other extant records from Grace Hospital include the lists of graduates, and Adams’s name does not appear there either.[9] However this would not have prevented her from gaining employment as a nursing assistant, especially at a time when people with any medical or first aid experience were in high demand. Many students enrolled and failed to make it through the rigors of training or were unsuited to the prohibitive lifestyle required of those living in (Kathleen E. Schmeling, pers. comm.).

The Canadian Red Cross website describes how anyone working as a volunteer at a hospital during the Great War was referred to as a Red Cross nurse. Unfortunately, records of the thousands of people who gave their time and energies to help have either been destroyed or did not exist in the first place.[10] What do still exist, though, are the Canadian census records taken in 1916 (, 2009). They record that Adams was staying with her mother in Winnipeg and working as a “doctor’s assistant”. As happened across the Empire, she was most likely one of the thousands of women who volunteered to work in their local hospital. Deer Lodge was one of many country houses and hotels across Canada converted to military hospitals dedicated to convalescent soldiers returning from the front (Deer Lodge Centre, 2014). The English Government was short of both funds and infrastructure to deal with the number of wounded generated by the new means of mechanised warfare. Deer Lodge was established as a repatriation hospital in Winnipeg in October 1915 following a request made by the English Government to Commonwealth nations contributing troops to the war. Each country or dominion of the British Empire became responsible for the convalescent care of their own when casualty numbers exceeded British expectations and resources.[11]

Moving back to Winnipeg to reside with her mother Lillian, a piano teacher, also offered Adams the chance to participate in another historical event. In January of 1916, the women of Manitoba Province were the first in Canada to be granted the right to vote in municipal elections and hold public office. This occurred 2 years before most of their countrywomen and 4 years ahead of women in the United States (Jackal & Millette, 2015). Even during times of war, this would have been something worth celebrating and hard to ignore.

Following the Battle of the Somme in July 1916, the number of wounded Canadian troops requiring convalescent care rose from 2,620 in July 1916 to 11,981 by the year’s end (Veterans Affairs Canada, 2015). Adams may not have experienced the action on the front, but she would have witnessed the consequences it had on soldiers. In whatever capacity she was serving, she would have been expected to meet the demands of a near five-fold increase in the number of shattered patients, some approximately her own age, in the makeshift over-crowded wards ill-equipped to deal with them.

In 1916, Adams was still only 19 years old. The physical toll of nursing accompanied by emotional exhaustion were the reasons Adams later gave to explain her return to motion pictures, admitting that she “collapsed finally under the strain of her work” (Goldbeck 1920, 27). Although equipped with some training and experience, she may not have been capable of maintaining an emotional distance from the suffering she confronted or of remaining objective about it. Well before the Armistice was declared in November 1918, she was back with the Edison Company performing for the cameras. In doing so, she continued contributing to the war effort, but it was on a scale that exceeded anything she could have achieved had she remained a doctor’s assistant in Winnipeg.

Returning to the screen in 1917 with the stage name “Peggy”, Adams appeared in 8 Edison Productions. These included Barnaby Lee (Edward H. Griffith, 1917), her first feature film running over four reels (48 minutes), and the first of several in which she was directed by Edward H. Griffith.[12] She starred in Your Obedient Servant (Edward H. Griffith, 1917), the first adaptation of Anna Sewell’s classic novel, Black Beauty (see Figure 3). Her next production, Scouting for Washington (Edward H. Griffith, 1917), was a melodrama set during the American Revolution. Wild Arnica and Shut Out in the Ninth (both Edward H. Griffith, 1917) show a very young Adams, all dimples and dark curls, revelling in simplistic comedy.[13]


Figure 3: Claire Adams with Pat O’Malley in Your Obedient Servant, (Edward H. Griffith, 1917). Edison Conquest Pictures. 1917. Source: UC Irvine Special Collections and Archives.

The following year, Adams would make the pivotal films of her early career. They started with a short vignette, one of a series of social satires titled Girls You Know. It would be the last time she appeared in a motion picture produced by the Edison Company. Each sketch “featured a different type of American girl personally selected” by James Montgomery Flagg, the prodigious illustrator, cartoonist and photographer (“Edison Releases Flagg Social Satire Series.” 1917). The First World War was something of a career peak for Flagg. He’d had film experience, was a widely read social satirist and artist and produced recruitment and patriotic propaganda posters for the US government. Flagg’s most recognised image was his creation of Uncle Sam, the grizzled old patriarch on the recruitment posters who “Needs You for US Army”. [14]

Flagg was commissioned to provide the “one-reel photo-sketches … depict(ing) the grace, charm, foibles and frailties of ‘Girls You Know’, the likable types” (“Recollections of ‘Girls You Know’”, 1918). These “likeable types” featured such stereotypical characters as “The Art Bug”, the “Spoiled Child”, the “Bride” and “The Artist’s Model”. Adams was cast as The Man-eater (Jack Eaton, 1918). The influence of the war in this film is obvious. Adams’s character Nina bids a tearful farewell to her uniformed fiancé and dreams of medals pinned to his broad manly chest. Nina is a sprightly little flirt travelling to a picnic with friends, harmless but annoying in her need for male attention. She is eventually “doused off”, pushed into the river, pulled ashore then scolded like a half-drowned kitten.[15]

The Girls You Know vignettes were produced as light and satirical entertainment for the troops and for those on the home front. They also reflect the restrictive caricatures of young women in the early motion picture industry, characterisations of women indulgently pursuing or experimenting with a career (“The Artist’s Model” and “The Art Bug”). Others were defined by their relationships with parents or male partners (“The Spoiled Child”, “The Bride” and “The Man-eater”). As “the Man-eater”, shunned and dunked for her sexually assertive manner, Adams’s character suffers the entrenched social consequences as punishment for her morally dubious behaviour. The cast of the series were promoted as having been “all selected from (Flagg’s) models – the most famous in New York”, though featuring leading women “with no experience but clever ideas and no actresses of note” (“Recollections of ‘Girls You Know’”, 1918). Adams, however, was by this stage a well-established performer, and three of her four “Girls” colleagues also had previous motion picture experience.[16] Flagg, the producers and indeed audiences, may have been more attracted to a fresh pretty face than a talented individual with an identity, reflecting the low opinion many still held of the acting profession.[17] There is no small irony that she was at this time living against tradition, earning a respectable wage and supporting herself independently as a professional young woman.[18]

The Spirit of The Red Cross, (Jack Eaton, 1918)

When the United States entered the European conflict in 1917, the US chapter of the International Red Cross was charged by the government to raise funds and volunteers to maintain an active presence on the battlefields of Europe. The subsequent rise of American humanitarianism and the specific formation of a nation-wide Red Cross campaign coincided with a period when “sensationalistic mass media began to dominate American culture”. These forces, dominated by new motion picture technologies, were responsible for “reshaping American ways of seeing, feeling, and responding to suffering by treating violence and pain as pleasure-producing commodities” (Rozario 2003, 426).

The Red Cross recognised the timeliness of motion pictures as an effective and expedient medium for stimulating potential donors and volunteers across the country (Rozario 2003, 429). The National Association of the Motion Picture Industry (NAMPI), chaired by famed producer Jesse Lasky, was appointed to the American Red Cross to promote and “interpret the organization for the American People” through the production of a suitable motion picture (“Spirit of the Red Cross”, 1918). Flagg was engaged to write the script. Jack Eaton, also of the Girls You Know series, was brought in to direct. Adams was cast as the lead, a nurse named Ethel, the image and embodiment of the International Red Cross. Lasky’s production expertise ensured that the resulting film, The Spirit of the Red Cross (Jack Eaton, 1918), presented “some of the best work done in motion picture making” (“Spirit of the Red Cross”, 1918). With such an accomplished and professional team involved, this 2-reel film was certain not only to hit but to exceed the patriotic, inspirational and philanthropic targets for which all parties were aiming.

Adams’s lead character Ethel was the sweetheart of “an American youth” Sammy, played by Ray McKee. When Sammy enlists with the army and sails for France, Ethel volunteers as a nurse. The film portrays the work of the Red Cross in the field, assisting refugees, nursing and caring for the wounded. The New York Times described Sammy heading for the battlefield, envisioning “Ethel in her white uniform, watching over him”, until:

After a charge, he lies on the ground with a bullet in his chest, half conscious. The vision of Ethel awakens him, just as a German comes forward slaying the wounded. Sammy grips his revolver and shoots the enemy. Later, removed to base hospital, Ethel finds him and nurses him back to health. [19]

The Red Cross were not averse to engaging exaggerated propagandist images for the benefit of their cause.[20] Their tactics owed much to the pulp media, sensationalist newspapers and war magazines, aiming to provoke the outraged and patriotic public into “acts of benevolence” (Rozario 2003, 430). What makes this campaign remarkable is that it was the first time such techniques had been mobilised so effectively on a broad national scale, utilising the emotive mass cultural medium of motion pictures.

The American Red Cross expected their national fund raising drive to receive “considerable impetus” from the release of The Spirit of the Red Cross (“Spirit of the Red Cross”, 1918). Newspaper listings from across the country noted that cinema proprietors were donating all takings from the screening to the cause.[21] The New York Times quoted that the goal of the organisation was to raise one hundred million dollars, or around $1.8 billion in US dollars today (Measuring Worth, 2014). By the time the Armistice was declared in November 1918, only 7 months after the film’s release, the Red Cross had raised approximately four times that sum (American Red Cross, 2014). The film is credited with playing a major role in the campaign’s success. By the war’s end, one-third of the US population, approximately 39 million people, were either contributing members of the Red Cross or serving as volunteers.[22] At the cessation of hostilities, war movies were the most successful types of films in the country and “spectacle was in the ascendant” (Rozario 2003, 439).

The promotional poster Flagg created for The Spirit of the Red Cross depicts the diaphanous vision of Adams’s character floating over the battlefield. Ethel valiantly directs stretcher-bearers toward the unseen wounded. The plaintive caption declares: “Not one shall be left behind” (See fig. 4). For the many, perhaps the millions, who attended the film or saw the poster, Adams represented the thousands of nurses serving in the field, saving the fallen and tending the refugees, even though she was not entirely happy with her performance (Lake 1922, 51). She may not have been directly ministering to the wounded any longer. However, through her contribution to this film, Adams achieved more for the war effort as a nurse on screen than would have been possible had she remained one in life. Well into her old age it was commonly accepted and reiterated in biographical entries that Adams had been a nurse for the Red Cross during the First World War. Having taken part in such a high profile and hugely successful campaign, it is fair to say that Claire Adams was not simply “a nurse” for the Red Cross: she was The Nurse, the face on the movie poster, the selfless guardian of the wounded and indeed the embodiment of The Spirit of the Red Cross.

Figure 4: The Spirit of the Red Cross (James Montgomery Flagg, 1918). Source: World War 1 Posters from the Elizabeth Ball Collection, Ball State University Archives and Special Collections. Ball State University, 2011. All rights reserved.

Adams was well positioned to maximise the opportunities offered by both the emancipation of women in general and the freedoms and opportunities offered by the motion picture industry. At the dawn of the Twentieth Century, middle-class women in North America were still expected to abide by the traditional Victorian roles of marriage, motherhood and marital obedience. As the 1920s loomed, however, “a woman’s alternatives to marriage were not only possible but exciting” (Banner 1974, 48). This social shift was influenced by women’s experiences backfilling male positions in the workforce and volunteering for the war effort (Quiney 1998, 193). By the time Armistice was declared, Adams was twenty-two years old with an independent motion picture career, an exciting new option embraced by increasing numbers of young women in the West. As such she exemplifies the symbiotic relationship between two new cultural forces at play, demonstrating that “Cinema history and history cannot be separated from one another”: “In the field of tension between the history of cinema in the 1910s and gender history, a movement of emancipation takes place” (Schlüpmann 2013, 24).

Women were not only welcomed but were highly valued in the early motion picture industry, occupying positions from theatre staff to major writers and directors. Like Adams, despite their prominent positions, most did not publicly identify themselves with the early feminist movement (Slide 1996, 1-3). Crucially, Adams had also proven she had the talent, experience and determination to excel. Having starred in the commendable The Spirit of the Red Cross, she had also gained her family’s acceptance and support (Lake 1922, 51). From 1919 she was known publicly not as “Clara” or “Peggy” but as Claire Adams, the “pet name” favoured by her family. Following The Spirit of the Red Cross, there were several projects to which she would dedicate her talents. One of these was a leading role in Romance and Brass Tacks (Martin Justice, 1918), “the new Flagg comedy starring pretty Peggy Adams, the famous Broadway beauty” (“An Unusual Comedy at Vining Theatre” 1918). The next motion picture Adams found herself starring in was another high-profile government sponsored film, also made for the war effort and proving remarkably popular. However, it also drew the kind of attention that few respectable young ladies from Winnipeg may have welcomed. Adams starred in one of the first major censorship scandals of the motion picture industry.

The End of the Road (Edward H. Griffith, 1919)

Young girls, thrilled with patriotism, sometimes fail to realize that the uniform covers all the kinds of men there are in the world; men of high ideals … or, in the worst instances, men who feel that their own physical appeal must be gratified, no matter who suffers. And so, through ignorance, through emotion, take steps which will lead to bitter regret.

(Katharine Bement Davis from Colwell 1998, 73)

Following The Spirit of the Red Cross and the additional projects with Flagg, there is evidence that Adams returned to theatre. In May 1918, she is named amongst the cast members of a patriotic play called Loyalty, written by George V. Hobart and staged at the Belasco Theatre in Washington DC. Reflecting both Adams’s preference for patriotic pieces and the public’s support for them, Loyalty is described as:

Something more than a mere dramatic entertainment … the play also carries a message of hope to a world suffering from the hardships and horrors of war (“Loyalty Will Make First Appearance At Belasco Tonight” 1918, 14).

The war may have been over but the horror continued on the home front. The “Spanish Flu” evolved in the trenches of Europe from a moderately innocuous virus into something quite lethal. The appearance of the disease in New York City, borne by returning soldiers, was first reported in August 1918. The virus attacked the young and healthy, becoming more infectious and lethal with each mutation. Many industries, including motion picture companies and theatres, were unable to operate normally or in some instances were forced to close. Adams may have been forced to take time off or found work options cancelled or postponed. By the time the city recovered months later, it had lost 20,000 to 24,000 of its six million citizens (Dominus, 2009). However, influenza was not the only infection rife amongst the troops.

In her essay, “The End of the Road: Gender, the Dissemination of Knowledge, and the American Campaign against Venereal Disease during World War 1”, Stacie A. Colwell provides a detailed account of the social and political context which gave rise to Adams’s next project (Colwell 1998, 44-82). As the US entered the war, the number of new recruits infected by venereal diseases appalled doctors conducting medical examinations.[23] As had been customary for generations, many young men of the time were engaging in unprotected pre-marital and extra-marital sex, ignorant or uncaring of the consequences. This practice was intensified during the heightened emotional climate of the war. Young women at the same time were becoming more personally independent, challenging the social order and demanding equal rights and freedoms to men. These included sexual freedoms.[24] They were unfortunately doing so within a vacuum of ignorance, a symptom of the perennial code of dual morality encouraging young men to explore their sexuality while young women were kept sexually innocent, unaware of their body’s functions and frailties. Many, deeply smitten, unknowing or unable to say no, found themselves bearing the physical and social consequences of sex with strangers or sweethearts they feared they’d never see again. The result was a perfect storm of ignorance, class-based bigotry and misbegotten morality, and Adams stepped right in the middle of it.

Eugenics, racism and class tensions exacerbated both the fear and apportioning of blame for the spread of venereal diseases (Schaefer 1999, 21). At a time when eugenics was still considered to be a viable scientific theory, some believed that the spread of sexually transmitted diseases (STDs) such as syphilis and gonorrhoea served a deliberate political or religious agenda.[25] Others saw the epidemic as an insidious weapon activated by the working masses, crowding into industrialised urban centres to breed out the richer, more refined classes (Colwell 1998, 72). The front line troops in this intimate means of class warfare were believed to be not simply prostitutes (who had a surprisingly low level of infection due to regulatory frameworks and inspections of brothels), but liberated “diseased and promiscuous” women seducing innocent men of means who then passed the infection onto their wives (Colwell 1998, 72). In turn, these ‘pure’ middle- and upper-class women would be driven mad, infertile, or both, by the ravages of the disease.[26] With many changes being made to the social order and a popular culture rife with inflammatory propaganda regarding invaders from abroad, there’s little wonder that this kind of imbalanced fear and paranoia infected the home front, like another disease.

The U.S. War Department, ever the pragmatists, saw the issue quite clearly – sick soldiers were bad for morale, an additional expense and made for a weakened army (Colwell 1998, 47). The challenge of addressing this issue fell to the War Department’s Committee on Training Camp Activities. This committee was confronting disciplinary issues relating to groups of young, independent and assertive New Women moving to towns and areas surrounding the training facilities. Schaefer implies that some were probably entrepreneurial prostitutes. Others were most likely “the khaki mad girls”, opportunists who simply loved a man in uniform and weren’t afraid to show it (Colwell 1998, 73). The troops themselves, predictably, were considered relatively blameless.

The momentum generated by the war to stem the spread of STDs provided an opportunity for diverse community organizations with similar concerns to affiliate with the War Department. A consortium of public health and volunteer organizations was incorporated under the banner of the American Social Hygiene Association (ASHA). Combined, they represented “a hybrid of social purity and sex education movements” (Colwell 1998, 46). These organizations pressed for a means of educating the population about the facts of life, the evils of STDs and pre-marital (or extra-marital) sex. Yes, it was bad for the War Effort. However, for ASHA, the spread of venereal diseases was due to ignorance within a paradoxical culture. On the one hand young people had been loosened from the corset-like confines of the Victorian moral order, yet on the other were still bound by obscenity laws that made it illegal to disseminate medical or preventative information regarding STDs.[27] A “conspiracy of silence” reigned: people did not speak openly of such things and had few available sources of information (Schaefer 1999, 21). The volatility of the subject was amplified by the controversies surrounding the birth control movement. As Margaret Sanger and her supporters discovered, it was still considered obscene and illegal in the United States to distribute materials addressing the use of prophylactics and other forms of contraception.[28] Some considered it more socially repellent to talk of venereal disease in public than to actually contract a case and not mention it (Schaefer 1999, 21). ASHA concluded that a range of sensitive and educational, yet frank and fearless, vehicles were required, aimed at both men and women to address the facts of life and some of their indelicate consequences. As the Red Cross had realised before them, the military and their new allies embraced the power of motion pictures to educate and inform the public of America. A motion picture offered an additional bonus: in the darkness of a movie theatre, no one can see you blush.

Their strategy involved dividing the audience along gender lines. Fit to Win (Edward H. Griffith, 1919) was crafted to appeal to the needs and experiences of young male recruits.[29] The End of the Road (Edward H. Griffith, 1919) was aimed at young women who, out of curiosity, ignorance or early “girl power” gone awry, were finding themselves in physically and morally perilous positions. The film was to be screened with a pre-show lecture by a medical practitioner in “Ladies Only” sessions at public cinemas. In order to make the subject matter more palatable for a “delicate” female audience, the educational aspects were wrapped up in a charming love story.[30] Conveniently, director Edward H. Griffith had been drafted into the War Department’s Committee on Training Camp Activities. Previously engaged by the Thomas Edison Company, and with acting experience of his own, he played a major role in the production of several films supporting the war effort. During his time at Edison, Lt Griffith had made four films with Claire Adams. She was cast as Mary, the heroine of this sensitive story, opposite renowned actor Richard Bennett, who had enjoyed an expansive and celebrated theatre career before taking to the screen.[31]

A key figure within ASHA, Katharine Bement Davis was brought in to write The End of the Road. Davis described her contribution to the project as having “been most carefully worked out in consultation with physicians on the side of fidelity to medical fact, and with teachers as to the psychological effect” (Colwell 1998, 48). Davis had an established career in education, penal reform and public health, involving intensive research into the social causes of delinquency, the efficacy of reform programs for female prisoners, as well as surveys into the nature of women’s sexuality.[32] Her work had inspired John D. Rockefeller Jr. to establish the Bureau of Social Hygiene as part of his newly established foundation (Encyclopædia Britannica Online, 2014).

The film was produced under the supervision of the Surgeon General of the US Army, a protective political arm supporting the enterprise. Additional collaborators included the National War Work Council, the YWCA, Rockefeller’s Bureau of Social Hygiene and the Famous Players-Lasky Corporation (American Film Institute Catalogue, 2015). This was a high-powered group of influential stakeholders with an interest in the film’s success. There was more than the usual amount of pressure and expectation placed on the actress playing Mary, the leading lady in this high-stakes drama. The following synopsis is from the American Film Institute Catalogue:

Mary, whose mother has instructed her about love, marriage and sex, leaves her boyfriend Paul to become a nurse in a New York hospital. Vera, encouraged by her mother to marry a rich man, takes an apartment from a young millionaire who promises marriage but only gives her syphilis. Mary and her doctor treat Vera and show her examples of the ravages of the disease. Mary meets other suffering women: an Irish servant girl betrayed by a chauffeur who dies after her baby is born; a garment worker who contracted syphilis from a soldier’s forced kiss; the invalid wife of a wealthy man whose philandering caused her condition, the blindness of her child and the suicide of another of his conquests. Paul, about to enlist, suggests that he and Mary have sex before he goes. Disappointed in him, Mary also rejects the proposal of her doctor, but later in Europe after seeing the kind of man he is, she accepts him.

On all fronts, this was a brave role for Adams. Some of the scenes featured actual patients scarred and suffering from venereal diseases, filmed on location at the women’s wards of Blackwell’s Island, New York City.[33] In her role as the nurse Mary, Adams is seen conversing with these obviously frail patients, gently supporting them as they present for the camera.[34] Most of the film was shot at the Rockefeller estate at Pocantico Hills in Mount Pleasant, New York State (Colwell 1998, 63). As a resident of New York City at this time, it would have been difficult for Adams not to be aware of the campaigns for and against women’s suffrage and access to birth control, including the incarceration and exile of Margaret Sanger, all of which received widespread publicity between 1916 and 1918 (Banner 1974, 103). There is also the possibility that she may have had first-hand experience with the physical, social and psychological impacts of these diseases during her training and duties in the obstetrics ward at Grace Hospital in Detroit. That possibility becomes more probable when considering her contact with repatriated soldiers in Winnipeg. Perhaps this experience galvanised her willingness to participate, in spite of the risks. Her participation in The End of the Road reflects the influence of her family’s military and public service background as well as her own compulsion to help others. She would require the strength of the first and her faith in the latter to confront the controversy this film was about to generate.

Figure 5: Claire Adams as the nurse Mary Lee with a patient in The End of the Road (Edward W. Griffith, 1918). Source: UC Irvine Special Collections and Archives.

The spark that ignited a furore was the sudden end of the war. This timing meant everything. Audiences and the industry had had enough of both war and sex.[35] The military units and programs that sponsored the film’s production were disbanded. Several key elements and references of the storyline were no longer relevant or threatened to offend peace-time sensibilities. Davis, however, successfully argued that the need to educate young adults, particularly young women, about their bodies remained, and that this justified public screenings. The project was permitted to “continue until a clear opinion has developed concerning the desirability or undesirability of its being shown through commercial channels” (Colwell 1998, 66). The ASHA gained control of the copyright of the film. The distribution and exhibition remained (for the time being) under the auspices of the Surgeon General, with safeguards in place to protect the film “from sensational exploitation”. Women’s groups and other officials were invited to preview screenings to judge its value and suitability for local audiences (Colwell 1998, 64).

The film premiered on 16 February 1919 in Syracuse, New York. There were fifteen hundred people at opening night and it went on to capacity crowds in a number of other cities (Schaefer 1999, 28). Most of the film’s key personnel were not there to see it. By opening night, Griffith had gone west. Davis sailed to post-war Europe pursuing her new role as General Secretary of Rockefeller’s Bureau of Social Hygiene. There was only one participant left with any skin in the game who made herself available to face down the critics and attend the film’s premiere. Colwell notes: “For the record, actress Claire Adams, who played Mary Lee, did travel to both Syracuse and Pittsburgh to defend the film she had starred in” (66). The New York Times reviewed the film in March 1919 and enthused that The End of the Road was “the most valuable motion picture of its kind yet produced.” The picture is not pleasant or euphemistic. Neither is the subject with which it deals. It is unpleasant, however, only to the degree necessary for force, and plain spoken only to the extent necessary for clearness. It is never morbid. One feels clean after seeing it (“Opening the Road” 1919). Acknowledging that audience responses would depend upon their own moral convictions, The New York Times also reassured readers: “No film would be effective without competent acting and directing and it is an important virtue of ‘The End of the Road’ therefore, that it had both in its making … Claire Adams and Joyce Fair, in the two leading female roles, were attractive in appearance and intelligent in their interpretation of their characters” (“Opening the Road” 1919).

Despite positive support in the mainstream media, critics of the film began protesting as soon as it was released. As The New York Times forewarned, the acceptance of the film was divided along the lines of the individual’s moral view and tolerance of the story’s unique premise: it was not just “patriotic prostitutes”, “army flappers” or “camouflage dames” responsible for the spread of STDs. The film dared suggest that decent middle class men played an active part.[36] The End of the Road was the first film to present the case that STDs did not always originate in women of the lower classes, demonstrating that “syphilis and gonorrhoea are equal opportunity diseases” (Schaefer 1999, 33). This perspective contributed to establishing a new set of social and political battlelines, just as the Great War ended.[37]

The pendulum swung back to a more conservative side of society, basking in victory and longing to normalise culture and behaviour (Schaefer 1999, 34). As a government-sponsored project, The End of the Road drew intense criticism, more so than other sensational films that had preceded it (Schaefer 1999, 29). According to critics in Moving Picture World, copies of The End of the Road and Fit to Win had also “fallen into the hands of individuals who are allegedly exhibiting it to mixed audiences composed of men, women, boys and girls” (“Association Goes After ‘Fit to Win’”, 1919, 1141). Prominent church figures denounced the film, led predictably by the Catholic Church (Schaefer 1999, 32). By July 16, 1919, it was banned in Philadelphia. The National Association of the Motion Picture Industry (NAMPI), so successful in their production of The Spirit of the Red Cross, succeeded by the end of 1919 in having the film banned across the United States. This reflects what Löhrer describes as “a specific American moral panic at play” (Löhrer 2011).

Although the film could no longer be shown in the US, The End of the Road travelled abroad.[38] In May 1920 it was exhibited in Canada under the auspices of the Canadian National Council for Combating Venereal Diseases and the principals of local colleges. Demonstrating her ongoing commitment to the work, Adams reached out to her fellow Canadians with a personal message printed in The Vancouver Sun, subsequently reprinted across Canada. She is an unbowed advocate, certain “that every girl in Canada could see this play”:

The message it conveys is one that society must learn, and I feel that this powerful drama is the most wonderful method of telling the important story to girls everywhere. I played my part with that thought in my mind, and in my heart the feeling that at least to the best of my ability I was performing a real service to womanhood (“Message to Girls Through Film By Claire Adams” 1920).

The End of the Road found welcoming and respectful audiences even further afield. There is evidence that the film was used as an educative tool in military training campaigns as far away as Vladivostok (McMaster 2014, 5). Over the next decade The End of the Road was also shown extensively across Australia, where Adams would spend the second half of her life. Promoted in The Sunday Times in October 1920 with the US plan in place to segregate audiences along gender lines, the film opened at the Sydney Town Hall on Saturday 6 November 1920 and played for 5 weeks. The Evening News described how it had been shown the previous week at a private screening for representatives from the clerical, medical and legal professions, and was granted approval to be screened by the Minister for Womanhood. A search of Australia’s digitised newspaper archives shows that the film travelled through major metropolitan and regional centres across the country, from Muswellbrook NSW to Charters Towers QLD, Clare SA to Katanning WA. There is little criticism of the film evident: rather it received endorsement from civic leaders, educators and the press across the country.[39] Adams received rich praise. In Melbourne it was shown at the Palace Theatre on Bourke Street where the Table Talk reviewer declared in March 1921, “Nothing better has been shown on the screen than the perfectly moulded features of Claire Adams.” When The End of the Road reached Hobart in June 1921, The Mercury reported that over 5,000 had already seen it in Launceston. The reviewer from The Advertiser in Adelaide, who had seen the film at the Adelaide Town Hall, enthused in July 1921 that “The picture defies description. There is a touch of genius in it. The screen has never disclosed a purer or more impressive lesson.” The following week, the same paper discussed how several members of the public showed great interest in Adams, enquiring after her background and identity. 
They make much of her youth, talent and beauty, but pay particular attention to her patriotic service during the war and her nursing experience at Grace Hospital in Detroit.

The film continued across Australia until July 1928, when The Goulburn Evening Post reported that the reels had been stolen in Broken Hill. By that time, however, Adams had retired from her successful Hollywood screen career. One decade later, Claire Adams migrated permanently to Australia, stepping off the ship in Melbourne on Valentine’s Day in 1938, where there may still have been more than a few moviegoers who remembered her work. There is evidence to suggest that Claire Adams Mackinnon had a copy of both The Spirit of the Red Cross and The End of the Road with her when she came to Australia. It is not known, however, if they were shown beyond the bounds of her new home, or if those who saw them during one of her movie nights had any idea of the films’ impact during and in the aftermath of the First World War.[40]

Figure 6: Claire Adams as Justyn Reed, the fiancée of Jim Apperson, played by silent screen idol John Gilbert in The Big Parade (King Vidor, 1925). Image ©Warner Bros. Source: Author’s private collection.

Adams’s performance in The End of the Road drew the attention of producers in Los Angeles and New York, where they were deep in debate regarding proposed censorship measures within the motion picture industry. In 1920, independent producer Benjamin B Hampton, who had also seen her work in The Spirit of the Red Cross, invited her to Hollywood to star in his productions (Lake 1922, 51). They would marry in 1924. By the time she retired in 1928 she had starred in over forty feature films alongside some of the most popular and defining artists of the era. These included Tom Mix, Rin Tin Tin, Jean Hersholt, Lon Chaney and Clara Bow. Adams also appeared opposite John Gilbert in his breakthrough role in The Big Parade (King Vidor, 1925), noted in the AFI catalogue as being “frequently described as the most successful silent film of all time”. Adams played Justyn Reed to Gilbert’s doughboy Jim Apperson. Justyn is a blithe ingénue, somewhat reminiscent of Adams’s flirtatious character Nina from The Man-eater, gaily encouraging her fiancé to enlist so she can see how handsome he’d look in uniform. These characters, both Nina and Justyn, could well have been constructed from characteristics of the sexually liberated yet innocent/ignorant young women who were the target audience for The End of the Road. As such, The Big Parade is a strangely coalescent bookend for her career, which practically began and ended with blockbusting war films.


The value of the Claire Adams films discussed here is lost when they are removed from their historical context. They are very much a product of, and reflective of, the era in which they were produced, which was both defining and tumultuous. The confluence of war, the women’s independence movement and the development of the motion picture industry offered women a new degree of independence and a more active role within their own lives as well as within their culture, both as consumers and producers. This relatively small period from Adams’s career encapsulates those opportunities, as well as the threats and challenges facing women in general and within the motion picture industry in particular. She is a mediator between the realities and the fictions; a nurse playing a nurse addressing current issues both on and off-screen, lending more than a modicum of verisimilitude to the productions. Her characters in The Spirit of the Red Cross and The End of the Road were strong female protagonists, created to inspire audience members and advocate for action, encouraging all women to change their lives and in doing so play an active role in changing their world.

Amidst the international commemorations of the centenary of The Great War, it is timely to assess Adams’s life of service in front of the camera and behind the scenes, not only in Australia, but also for our allies in the conflict – Canada and the United States of America. She was, after all, a proud citizen of each country at different stages of her life. That her achievements have been forgotten is also symptomatic of the times. The role of women in the motion picture industry has been undervalued and overlooked for most of the previous century, so the “loss” of Claire Adams from film history is not unusual. The scale of public awareness generated around the superstars of Silent Hollywood, fed by the scandals, stereotyping and celebrity culture that engulfed the industry, has diminished the memory of her career, especially if it is measured and evaluated in terms of current and historic memory. My biography of Adams (in progress) explores her significance, the difference she made as an actress and as a private citizen, and how she never lost the sense of public service first demonstrated in the choice of roles discussed here. Indeed, Adams continues to support community causes, including the Australian Red Cross, via a substantial trust established after her death in 1978.[41] By avoiding the superficiality and pressures of the studio system, Adams may have missed out on Hollywood immortality. However, her personal and professional choices demonstrate that she was not overly concerned by the vagaries of popular opinion. Her priorities were always much closer to home.




[1] Adams’s petition for US citizenship document confirms that she was a subject of the British Dominion of Canada and that her last foreign address was in Toronto. From there she entered the US permanently at Detroit on 12 June, 1915. She first declared her intention to become a US citizen in September 1922, two years before she married Hampton. Her petition for citizenship was granted in Poughkeepsie on 4 February 1931, having resided in Pawling, NY since 1 September 1928.

[2] Willis Goldbeck in Motion Picture Classic (1920) described Adams as “instinctively Britishly reserved, a person whom one cannot hope to know in a day, or a month.”

[3] As of this date, I have yet to find Claire Adams on the cover of either fan or trade magazines, though several profile interviews and letters to editors indicate she had a fan base. Without a studio publicity department behind her, and as part of a company that promoted all-star casts, she was able to enjoy a more private life and manage her own profile accordingly, hence my concept of her as an independent artist.

[4] Originally published as A History of the Movies, by Covici, Friede, New York, 1931, Hampton’s work was reissued in 1971 as History of the American Film Industry from its beginnings to 1931, edited with an introduction by film historian Richard Griffith.

[5] In his introduction to the second edition of Hampton’s book, Griffith asserted “That it is the best history of the movie business to date there can be no doubt”, and positioned it in terms of authenticity and objectivity above Terry Ramsaye’s A Million and One Nights (1926) and Lewis Jacobs’ The Rise of the American Film (1939).

[6] As described by Monica Dall’Asta and Victoria Duckett in “Kaleidoscope: Women and Cinematic Change from the Silent Era to Now.” 2013, 8.

[7] At only fifteen years of age, Gerald ran away to join the Scottish Highlanders, but was dragged back to London by his Great Aunt Mabel Adams with whom he was staying. As soon as he was old enough he enlisted, but had only just completed training when the war ended. He never saw any action on the front. Nancy Miley, “Gerald Drayson Adams (1900-1988)” from Adams Family History Notes, Unpublished Manuscript.

[8] This prospectus was written late in the 19th century. The Grace Hospital School of Nursing (GHSN) offered two years of vocational (if somewhat rudimentary) instruction during which time students would be under constant supervision, on and off duty. Students were permitted one afternoon off per week and were encouraged for the sake of their own health to spend one hour each day in the open air. They were allowed two weeks leave per year, one evening off per week and required permission from the school’s Principal to be out later than midnight. In return, the students would receive training in elements of hospital work, namely the dressing of wounds and burns, applications of fomentations and poultices, making beds and methods for avoiding bedsores.

[9] Adams’s name on all official documents was her birth name, Beryl Vere Nassau Adams, but this does not show up on the list of graduates either.

[10] According to the Red Cross Canada website, “Red Cross nurse” was a term applied to anyone volunteering in a hospital in any capacity, including letter writing and visiting. Millions of young women contributed to this international effort and there was simply not the capacity to establish or maintain complete records of every volunteer. “About the Canadian Red Cross”, Accessed 15 August 2014,

[11] A New Set of Needs, from the website of Veterans Affairs Canada, describes the inadequate hospital infrastructure, and venues converted to repatriation hospitals following England’s requests to the Dominions to take care of their own wounded. Accessed 11 January 2015,

[12] According to the American Film Institute Catalogue entry, Barnaby Lee was probably not widely released.

[13] Scouting for Washington, Wild Arnica and Shut Out in the Ninth were viewed by the author at the Library of Congress, Washington D.C.

[14] Flagg asserted that his Uncle Sam became “the most famous poster in the world”. As quoted on the American Treasures of the Library of Congress website, accessed 13 February 2015,

[15] The Man-eater was viewed by the author at the Library of Congress, Washington D.C.

[16] The other lead roles in the “Girls You Know” series were played by Dorothy Wallace, Mary Arthur, Martha Mansfield and Peggy Hopkins. According to IMDb, Hopkins had appeared in two Columbia productions. Dorothy Wallace had one previous film credit. Martha Mansfield had four previous motion picture credits and seemed destined for a successful career. In 1920 she starred opposite John Barrymore in Dr Jekyll and Mr Hyde (John S. Robertson, 1920). Mansfield died in horrific circumstances in 1923 after her period costume caught fire on set. See IMDb

[17] Banner suggests that it was considered “a disgrace for a woman’s name to appear in public print” though this began to change before the war (20).

[18] According to Nancy Milley, her father (Claire’s uncle, Ernest Adams) and her grandmother did not approve of Claire’s profession, even into the 1920s when she was at the height of her career. “Ernest Dupin Adams” from Adams Family History Notes, unpublished manuscript, author’s private collection.

[19] Several scenes were shot in the Jackson Barracks in New Orleans, which was in use as a training and processing facility for the United States Army. This suggests that recruits may have been used as extras in the film, perhaps to add a layer of verisimilitude to the scenes of battle and troop movements. From website Hollywood on the Bayou, Louisiana Film History by Parish,

[20] The Red Cross War Council subsequently established a “Bureau of Pictures” in early 1918 “to tell upon the screen the splendid story of the Red Cross”. These single reel shorts featured such titles as “Broken Lives”, “Victorious Serbia” and “Russia: a Land Worth Saving”, reflecting their ongoing work in Europe and with veterans up to and following the Armistice. From “Splendid Work of American Red Cross Society Graphically Told in Series of Single Reel Motion Pictures Now Ready for Screen”. 1918. Exhibitors’ Trade Review. December 7. Vol 5. No. 1.

[21] In the El Paso Herald, for example, the proprietors of the Grecian Theatre placed a large advertisement, paid for by another local business, outlining a special “Red Cross War Fund Benefit”, from which they would donate all proceeds from the evening’s entertainment to the cause. Patrons were encouraged to “make your quarters jingle” at the box office as “Lives Over There” depended on their support. Wednesday 15 May 1918, 5.

[22] The release of the film was also accompanied by a nationwide door-knock appeal (American Red Cross, 2014).

[23] The World War 2 US Medical Research Centre estimates that during the Great War, venereal disease “had caused the Army lost services of 18,000 servicemen per day.” Accessed 29 January 2015.

[24] As described by Banner, early feminists saw male sexuality as being at the heart of female oppression. Even before the war, early feminist Inez Milholland wrote that “we are learning to be frank about sex … and through all this frankness runs a definite tendency toward an assault on the dual standard of morality and an assertion of sex rights on the part of women” (116).

[25] According to Schaefer, “Discourses on venereal diseases and eugenics were so tightly intertwined as often to be inseparable” (21).

[26] The irony is that throughout the nineteenth century, prostitutes were seen as a necessary evil in order to protect the virtue of pure women. The North American sex industry operated in relatively structured or semi-licensed conditions where the workers were regularly checked for STDs (Banner 1974, 76). The change came in the early twentieth century when sexually active young women identified sex with strangers as a possible way out or momentary escape from the drudgery of their working class lives. If they accepted payment for sex, they could earn two to three times more than they could as a domestic servant or salesgirl (Banner 1974, 81).

[27] Schaefer laid out the confluence of concerns that surrounded the spread of STDs in American society, which included Eugenics, Industrialization, Abstinence and Prohibition. The constituents incorporated under the ASHA banner represented most of these concerns. Progressives were concerned about the political and social boundaries being blurred by the urban, industrial and technological developments of the early Twentieth Century (18 – 23).

[28] According to Katz, Sanger had already experienced exile from the US in 1914 when in 1916 she was arrested and imprisoned for opening the nation’s first birth control clinic. In 1917 she had also made a motion picture titled “Birth Control” which had been confiscated by New York authorities.

[29] Fit to Win was shown at training camps where a lecturer would be available to explain some of the finer points of the storyline and address any medical questions.

[30] The film’s writer, Katherine Bement Davis, had already conducted extensive research into the possible impact of sex education on young women, which contradicted prevailing Freudian based theories that too much knowledge of sexuality and its consequences could permanently damage young women (Colwell 1998, 60).

[31] In 1914 Bennett made his movie debut with Damaged Goods, a film adapted from the theatre script that “pictures the terrible consequences of vice and the physical ruin that follows the abuse of moral law.” Bennett had married his co-star from Damaged Goods, Adrienne Morrison, with whom he had three daughters, two of whom would go on to major Hollywood careers: Joan and Constance Bennett.

[32] Banner describes Davis as a leading figure of the Progressive movement, instigating major studies and reforms in the fields of urban poverty, women’s prisons and sex education (Banner 1974, 98). The New York Times ran a major profile of Davis when she was named New York’s first female Commissioner of Correction in January 1914. See Edward Marshall’s “New York’s First Woman Commissioner of Correction”, The New York Times, 11 January 1914.

[33] The location where these scenes were shot is mentioned in the notes section for the catalogue entry for The End of the Road in the American Film Institute Catalogue, accessed 29 August 2014. Schaefer describes how many of the more confronting scenes of open wounds were cut in order to please the censors and keep the film in theatres, but some scenes were later reintroduced by exploitative distributors (29-30).

[34] The End of the Road was viewed by the author at the Library of Congress, Washington D.C.

[35] The issues facing the film were compounded by the film industry’s wish to appear respectable and beyond moral reproach in the face of a growing number of sex scandals and exploitation films (Schaefer, 30). There was also an acceptance of sex education taking place within schools, making the role of sex education films in public cinemas largely redundant (Colwell, 69).

[36] According to Colwell (71), there were around ten other films addressing the spread of STDs; however, they depicted women as the source of infection and transmission. The End of the Road differs significantly by drawing attention to the complicity of men.

[37] Schaefer notes that Damaged Goods (Thomas Ricketts, 1914) also starring Richard Bennett was the first to have established this premise on film. Damaged Goods reinforced the claim that venereal diseases were the scourge of the lower classes and inflicted upon young men of means and family during moments of weakness caused by drunkenness or deliberate seduction by fallen women from the lower classes (23).

[38] The copy viewed by the author in the Library of Congress was in Dutch, indicating it was intended for audiences in the Netherlands and other Dutch-speaking colonies.

[39] This section could not have been conducted without the use of Trove, the National Library of Australia’s online database of Australia’s newspapers. The articles directly referenced are noted in the above reference section.

[40] The Mooramong Buildings and Structures: Conservation Analysis Report for the National Trust of Australia (Victoria) of July 1989 contains a list (Appendix A) of 48 motion pictures “compiled from film reels held at Mooramong”. The fate of these films is unknown.

[41] According to Geoff Hone, the Scobie and Claire Mackinnon Trust has donated approximately AU$2 million to the Royal Children’s Hospital as well as other significant amounts over the years to the Australian Red Cross.


Bio: Heather L. Robinson is a Research Associate & PhD Candidate in the School of Humanities and Creative Arts, Flinders University. She is also an Honorary Research Associate (History) at the Los Angeles Natural History Museum.

Days of YouTube-ing Days of Heaven: Participatory Culture and the Fan Trailer – Kyle R. McDaniel

Abstract: This study analyzes the aesthetic content and user-generated feedback of fan-appropriated film trailers exhibited on the Internet. The aim of this research is to gauge participatory culture’s involvement in the transformation of promoting archival motion pictures on the Internet. This research study looks to fan trailers as unique media entities that exist as visually empowered narratives created through specific acts of fandom. Specifically, this study investigates the audiovisual and discursive elements of competing trailers for Terrence Malick’s Days of Heaven (1978). The findings suggest that fan trailers are capable of generating myth and nostalgia for aging motion picture properties through user-generated acts. The broader goal of this project is to understand the relationship between participatory film cultures and studio-controlled motion picture content available on video streaming and sharing media channels.

Fig. 1: The memorable and Biblically referential swarm of locusts in the film Days of Heaven.

Introduction: Trailers at a Turning Point

A YouTube video by an unknown director can suddenly blow up on the marketplace, and there will be three studios bidding for it. (Without having yet met the director!)…Maybe execs are busy watching YouTube instead of hearing pitches. Our work is virtual.   

-Lynda Obst, Sleepless in Hollywood (2013, 27).

In April 2014, an online user released a high-definition film trailer on YouTube for David Fincher’s forthcoming thriller, Gone Girl (YouTube 2014a). Several hours after the trailer’s debut, an impressive 186,000 fans had accessed the content with 276 of that number contributing written feedback to the message board on the webpage. While film fans were sharing interest and excitement for the trailer on YouTube, News Corp., the media entity that financed Gone Girl through 20th Century Fox, perceived a threat of digital piracy. The following day, the conglomerate removed the trailer and the fan commentary. In the absence of this content, News Corp. left a statement reading, “FOX has blocked [the trailer] on copyright grounds” (YouTube 2014b). This incident is representative of the contemporary state of affairs between media conglomerates with a controlling interest in motion pictures and film fans in online spaces. The presence of film trailers on the Internet presents a specific set of issues for both parties as well, especially in relation to film marketing and promotion, in addition to content ownership and control over copyright.

This study engages with how film fans interact with once-profitable motion picture properties through fan trailers on the Internet. Here, the fan trailer is defined as the act of re-editing and re-exhibiting abridged film content through online channels. Fan trailers are realized through specific and largely collective acts of user-participation, and have the potential to revitalize interest in aging film properties. This article explores the audiovisual and content-related aspects of fan trailers in comparison to a distributor-owned trailer for Days of Heaven (Terrence Malick 1978). Furthermore, the feedback or commentary on message boards is also investigated as part of this research project to locate how such discourse speaks to the collective memory of Hollywood archives. In order to understand the issues surrounding the emergence and popularity of the broad spectrum of Internet trailers, this study looks to literature on the evolution of fan involvement with digital cinema and new media, as well as scholarship on the history of film trailers and film promotion and advertising. The findings from this article suggest that fan trailers play a crucial role in continuing the lifespan of aging Hollywood properties or archival films. The proliferation of fan trailers through video streaming and sharing websites as well as the message board commentary suggests that fan participation is instrumental to building relationships between film and viewer. In turn, participatory cultures that interact with older film titles in online channels incorporate aspects of their public and private selves as part of this creative process.
The following research questions are designed to further explore this relationship between film fans inhabiting online spaces and the evolving state of fan trailers in digital cinema: What are the content-related (i.e., audiovisual) similarities and differences between the distributor-controlled, official trailer and the fan trailers under study? And what role(s) does user-generated commentary or feedback play for these trailers?

Fig. 2: A black-and-white still of Terrence Malick on the set of the film.

Film Promotion in the Digital Age: New Strategies, New Rules

For much of the 2000s, Hollywood was reluctant to promote film content through online channels for fear of losing theatrical and home video revenue (Sickels 2011a). The film industry seemed confused by the ever-growing presence of the Internet and related online technologies for film exhibition. But to effectively reach a global audience, the studios and their parent media conglomerates were eventually forced to adapt to the changing media landscape. As Sickels (2011) stated: “Deals with Netflix and the like are only going to delay the inevitable…Audiences don’t want to wait, and they certainly won’t when their only reason for having to do so is an artificial time structure concocted by the studios…”(145). By the second decade of the century, the industry’s fears had become a reality, with on-demand film and television viewing radically altering the industry.

Scholars have pointed to the different complexities of film marketing in the digital age and the associated challenges for the U.S. film industry (e.g., Cunningham and Silver 2013). In Perren’s (2010) words, “A wide range of economic, cultural, political, and formal factors are at play; different entities have distinctive stakes in online distribution” (77). In other words, films with a greater potential to appeal to a global audience receive preferential treatment from media conglomerates, as well as promoters, marketers, and distributors. With video-on-demand (VOD) revenue climbing steadily since 2010, the studios are looking to different methods for advertising motion pictures beyond the more traditional formats, which include one-sheets of film posters and theatrical and television spots (Roxborough 2013). Film trailers on the Internet are a viable option in this evolving landscape. The Internet Movie Database and YouTube are the most frequently visited websites supporting online film trailers, with both entities supporting numerous trailers for new releases and older Hollywood titles. In effect, the spectrum of film trailers on the Internet presents a number of potential issues for the film industry. Trailers, historically controlled by studios for advertising and publicity purposes, are increasingly pirated by outside entities. One scholar argues that film industry insiders are the ones largely responsible for leaking studio-controlled content online, with the availability of illegal anti-encryption and watermarking software to bypass copyright restrictions playing a role as well (Bettig 2008, 200-201). Since the release of the DVD De-Content Scramble System (DeCSS) in 2002, film content has been descrambled and decoded for public access and use, despite the studios’ efforts to control motion picture content (Litman 2002).

Film fans, however, have argued that such laws overwhelmingly favor those with a financial stake in motion picture properties, thereby inhibiting individual and collective acts of creative expression (Boyle 2008). As such, studio-backed restrictions have resulted in more frequently cited instances of pirated motion pictures as well as an upsurge in websites devoted to streaming and downloading studio-owned film content (Sterbenz 2014). Scholars and journalists reporting on the film industry have addressed some of these issues in relation to film trailers. For instance, Rothman (2014) discussed how theatrical trailer standardization discourages user interactivity. Tolson (2010) reported that fan participation with film content suggests an increase in technological “play” that disrupts the traditional model of media production to consumption. Others have looked at how trailer “mobility” is encouraged in a cross-platform media environment, and the effects of contemporary trailer length and message on the viewer (see Franich 2013; Johnston 2008). While many of the issues surrounding film promotion in online spaces remain unresolved, trailers continue to serve as a primary marketing tool for motion picture studios and their parent conglomerates. Fan involvement with film trailers is a burgeoning area of contemporary film marketing and new media, but scholarship on this subject is lacking. Therefore, how participatory cultures connect to older film titles in online spaces through the fan trailer remains an unexplored avenue of study for cinema and media scholars.

Fig. 3: The film’s main titles are appropriately positioned in the concluding seconds of the Paramount Movie’s YouTube-exhibited trailer.

Trailers in Transition: A Brief History and Contemporary Definitions

The most time-honored marketing strategy for film promotion is the movie trailer, commonly referred to as the “preview.” Kernan (2009) traced the genealogy of film trailers to 1919, citing the National Screen Service (NSS) as the first unified company responsible for creating these advertising spots. The author asserts that the evolution of the film industry during the 20th century affected changes in the types of motion pictures produced, thereby altering the aesthetics and meta-messages of trailers in the ensuing decades. A transition in film marketing occurred during the 1970s, and then again in the 1980s, with a rise in independent filmmaking, an upsurge of art-house theaters, and eventually, the summer blockbuster. During these decades, film trailers debuted on network television in thirty-second spots, visually supported by moments lifted from the film, and complete with the now-familiar and once-prominent voice-of-God narration. By the contemporary era, trailers had become “unique form[s] of narrative film exhibition, wherein promotional discourse and narrative pleasure are conjoined (whether happily or not)” (Kernan 2009b, 1). In essence, this period saw the rise of distinct promotional film advertisements alongside the audience’s familiarity and ability to detect such media forms.

Scholars regard the modern film trailer as both a complex and historically shifting media type. A leading scholar on the history and transition of motion picture trailers suggests that these forms are specifically targeted, easily recognizable visual media that are created to capture, direct, and guide viewer attention (Wyatt 1994). Today, both media entities and online film fans aid in determining trailer standards and audiovisual elements. Trailers are guided by audiovisual messages through structured narratives to connect with the largest number of viewers through multi-platform distribution. Some have argued that film trailers in the digital era are defined by their dynamic if fleeting presence, asserting that contemporary trailers are forced to compete with other media forms to encourage audience-driven participation or feedback (see Rombes 2009a). Smartphones and digital tablets indicate an increase in trailer mobility and interactivity on behalf of audiences, who are receiving different media in shorter, eye-catching bursts (Grainge 2011). Scholars have also argued that the efforts of fans on the Internet extend film capital beyond traditional home video or cable and network replay through film mashups or distributing abridged content (e.g., Sickels 2011c; Hoyt 2010a). Tyron (2009) traced the inception of the digital movie trailer to a fan preview for The Shining (Stanley Kubrick 1980) that gained Internet traction the same year as the inception of YouTube. According to the author, the fan trailer was an outgrowth of DVD culture “that allowed viewers to recognize that texts were ready to be ripped apart and reassembled in playful new ways” (151). Lazzarato (2006) described these types of fan creations as influential because they are “activities involved in defining and fixing cultural and artistic standards, fashions, tastes, consumer norms, and, more strategically, public opinion” (132).
In sum, film fans use popular film properties to engage with and further promote such content to a wider range of consumers.

Re-appropriating and exhibiting film content is oftentimes understood as a group effort. Rose (2012a) argues that the cyclical discourse that occurs in online social networks is what encourages users to interact with film properties. Citing Avatar (James Cameron 2009) and The Lord of the Rings (Peter Jackson 2001-2003) trilogy as examples, the author maintains that a strong and relatable narrative or story is the key to fan involvement. According to Rose, online visual narratives must be able to entertain as well as challenge participant-viewers, thereby encouraging individuals to take part in the creative act (233). Through user-participation and online media channels, the modern film trailer appears in transition. In an environment increasingly dominated by new media platforms and social networking, video-sharing websites are stimulating the development of relationships among social actors.

Defining Participatory Cultures and Digital Cinema

Participation raises the question of whose story is it? And, the answer I think is, it’s all of ours. In order to really identify with the story, in some way we have to make it our own.

-Frank Rose, The Art of Immersion (2012b).


Online users are now affecting many aspects of the motion picture industry and most recently, have turned to collaborative involvement with film trailers. Through an increasing number of video streaming and sharing websites, fans are producing and exhibiting short and hybrid motion picture forms from existing film content. Jenkins (1992) defined networked individuals who engage with and repurpose existing media materials as members of participatory cultures. These persons “speak from a position of collective identity, forge an alliance with a community of others in defense of tastes which…cannot be read as totally aberrant or idiosyncratic” (23). The author attributed the roots of this phenomenon to fan communities that built up around popular television programs, such as Star Trek, and who communicated and bonded through sharing information at conventions and fan clubs. More recently, Jenkins (2006a) has adapted his definition to include new media and social networking. Although optimistic about the endeavors of participatory cultures, Jenkins has noted the drawbacks of these communities as well, including the shifting power dynamics of group members and the involvement of corporate entities. In addition, the author has described the illegal activities of some members of participatory cultures, specifically those parties who undermine media conglomerates through acts of digital piracy and copyright infringement. Jenkins (2006b) has also commented on the burgeoning relationship between participatory cultures and digital cinema:

[I see] media fans as active participants…seeing their cultural products as an important aspect of the digital cinema movement. If many advocates of digital cinema have sought to democratize the means of cultural production and distribution to a broader segment of the general public then the rapid proliferation of fan-produced Star Wars films may represent a significant early success story for that movement (551-552).

In other words, the upsurge in digital cinema is dependent on fans in much the same way that fans are dependent upon interacting with cinematic creations. Digital cinema, as such, is oftentimes described as an outgrowth of online fan participation. Rombes (2009b) claims that collective acts of nostalgia, personal expression, and the adaptation of new technologies play a role in shaping digital cinema. Beginning with the rise of digital video and cinematography in the mid-1990s, the author discusses an additional factor in the relationship between digital cinema and the actions of participatory cultures: “There is a tendency in digital media – and cinema especially – to reassert imperfection, flaws, an aura of human mistakes to counterbalance the logic of perfection that pervades the digital” (Rombes 2009c, 2). In keeping with Rose’s (2012) insistence on powerful storytelling, Rombes argues that digital cinematic forms are generated and desirable because of factors such as pixelation and noise, which appear to mirror human imperfections. While fan intervention in existing film content raises questions for the future of digital cinema and a general understanding of what constitutes motion picture archives, participatory cultures have contributed to film marketing and promotion since the late 1990s. According to Erickson (2009a), who studied Internet film campaigns for The Blair Witch Project (Daniel Myrick and Eduardo Sanchez 1999) and others, studios appropriate fan-based advertising strategies if fan efforts prove financially successful. This article is concerned with how participatory cultures repurpose and interact with the content of older motion picture titles. The entrance of fan trailers through online video streaming platforms suggests new territory for digital cinema, as well as the possible extension of the lifespan for archived film properties.

Fig. 4: A still image from the opening titles of a student-generated video essay for Days.

Case Study Film: Days of Heaven

Since it was first released, “Days of Heaven” has gathered legends to itself…[it] is above all one of the most beautiful films ever made. Malick’s purpose is not to tell a story of melodrama, but one of loss. His tone is elegiac. He evokes the loneliness and beauty of the limitless Texas prairie.

-Roger Ebert (1997a).

In the contemporary media marketplace, conglomerates and studios overseeing film distribution and exhibition pay close attention to the role of technologies in film promotion and branding. This is also true when considering how older film titles are released, with potential revenue gained from cable and network television broadcasts, DVD rentals and sales and most recently, VOD. Those with a financial stake in film archives oftentimes publicize and rerelease only a select number of dated film titles per year, with those properties having the most commercial potential regarded as particularly valuable in the marketplace. While some noteworthy and popular motion picture titles are available for little-to-no pay through video-sharing online services, media conglomerates use Netflix, Hulu, Amazon, and iTunes, for instance, to promote their most commercially viable films. It is here that the role of participatory culture and the evolution of the fan trailer in the archival value of film properties must be taken into consideration. Days of Heaven is significant because of its longstanding popularity amongst fans, its continual re-emergence in the public arena, and its location in cinematic history. Malick’s film arrived at a turning point in the New Hollywood of the 1970s. The competition between fledgling studio productions and a burgeoning independent film movement marked much of the decade’s releases (see Thompson and Bordwell 2010; Biskind 1998, et al.). “But by the late 1970s,” Thomson (2012) writes, “there began to be fewer grown-up pictures meant to disturb and provoke” (459).

Before and after its release, Days of Heaven was considered an oddity for Paramount Pictures, a none-too-profitable feature that rested on the short reputation of its filmmaker.[1] Malick spent his early years in Hollywood penning several projects for other directors until his first feature-length film, Badlands (Terrence Malick 1973), gained traction from both audiences and critics, garnering a reputation as the second Bonnie and Clyde (Arthur Penn 1967). Patterson (2007) said Malick’s film offered the director the chance to “work outside more conventional parameters” (28). The filmmaker’s follow-up, however, was grander in scope and presented to audiences as a thematic American period piece. Set in the Great Plains of the 1910s, the narrative focused on a romantic triangle among two migrant workers and a land baron. Morrison and Schur (2003) described Days as “wed[ding] Whitman’s poetic ideal of the democratic vista to the interior landscapes of Henry James, with a plot that evokes The Wings of the Dove and ends with a quasi-biblical plague of locusts” (23) [Fig. 1]. Indeed, the locusts were memorable, as was a lengthy scene in which wildfire spreads rapidly across the grasslands, scorching a vast swath of farmland. But much of the film’s storyline involved the happenings of Malick’s starring quartet – Richard Gere, Brooke Adams, Linda Manz, and Sam Shepard – with the characters’ muted emotions drawn out in close-ups paired with character voiceover.

Fig. 5: Gere and Adams’s characters traveling atop a railcar with other migrants in the film.

Much of the film’s legend was only realizable years after its release. For one, Malick departed from filmmaking for two decades after Days, leaving a questionable legacy for a motion picture whose long-term stability rested on the director’s reputation and the film’s much-discussed cinematography. Over time, those perfectly composed images of man and nature, or what Kehr (2011) glowingly referred to as, “aesthetic shock effects [that] create vast, harmonious wholes,” were responsible for keeping the film in the minds of journalists and cinephiles (23-24). The film’s cinematography eventually became something of Hollywood lore [Fig. 2]. Ebert (1997b) detailed the infighting between credited director of photography, Nestor Almendros, and his predecessor, the notoriously cantankerous Haskell Wexler, in his “Great Movie” review of the film. Over the years, rumblings over credit for the look and feel of the film have led to a reconsideration of the man responsible for capturing such well-regarded images. In the years since its release, Malick returned to filmmaking and has garnered generally favorable reviews and some commercial success.[2] No fewer than ten book-length volumes are dedicated to the filmmaker’s resurgence, including The Terrence Malick Handbook (Smith 2012), and a number of academic and trade journal entries have surfaced on the canonical worthiness of Days (e.g., Crofts 2001; Woessner 2011; Koehler 2013, et al.). Not surprisingly, praise and frustration for the film reign on the Internet as well. The number and popularity of video clips available on video streaming and sharing websites suggest additional reinforcement of the scholarly and journalistic discourse devoted to the film. While Days remains a much-debated and discussed film more than 35 years after its theatrical release, the role of trailers for the film on the Internet deserves attention in the era of cross-platform film promotion.

Selection of Trailer Case Studies: The Presence of Days of Heaven Online

The “Paramount Movies” channel on YouTube, overseen by Viacom, offers an original trailer for Days of Heaven [Fig. 3]. The Criterion Collection, responsible for marketing and distributing the Blu-ray and HD-DVD versions of the film, also displays an official trailer on its homepage for Days.[3] Mysteriously, Paramount’s trailer has received few visitors on YouTube while Criterion’s showcases an impressive 153 user-generated comments. The seeming randomness of attracting viewers to trailer content in online spaces is represented in this brief comparison, which appears to cross over to fan trailers as well (YouTube 2014c; The Criterion Collection 2014). The volume and popularity of fan trailers and video clips of Days showcased on YouTube overshadow this corporately controlled material in several ways as well. For one, the power of the video sharing website’s status as a social networking outlet is immediately evident. The “WorleyClarence” YouTube channel, for instance, has reposted an official version of Paramount’s trailer with an astonishing 360,000 views and 97 message board posts.[4] “JokerTreePictures,” described as an umbrella channel for three student filmmakers, has created a seven-minute video essay for Days that has gathered significant attention [Fig. 4]. Another YouTube user offers a promotional video compiled from scenes from Days matched with the music of Rod Stewart’s pop single, “Broken Arrow.” The sum of this content, which includes fan-exhibited interviews with the cast and crew as well as scenes lifted from the film, is evidence of the film’s presence on the Internet (YouTube 2014d).

For this study, three trailers were chosen as individual case studies based on the following criteria: 1) the recognizable differences in their audiovisual content, 2) the number of online views (i.e., “hits”), and 3) the number of message board posts or available online feedback. Two fan-appropriated trailers exhibited on YouTube were selected based on these requirements, as was the aforementioned trailer available through The Criterion Collection. The purpose of the trailer selection process was to compare and contrast elements of fan trailers with an official trailer approved by a media outlet in an effort to answer the research questions for this study. Many trailers were not selected because they failed to meet the research criteria, owing to factors such as conflicting content with the selected trailers, a lack of available user-generated discourse on message boards, and/or the number of recorded views or hits online. After completing the selection process, trailers were coded A (“WorleyClarence” YouTube Channel), B (“cnharrison” YouTube Channel), and C (The Criterion Collection), respectively. The researcher conducted individual and comparative audiovisual analyses on trailers A, B, and C and made notes on narrative structure and trailer content. This was followed by a qualitative content analysis of the online commentary or feedback on the message boards for each trailer’s webpage. In effect, the trailer selection process and resulting analyses were guided by the research questions for this study: What are the content-related (i.e., audiovisual) similarities and differences between the distributor-controlled, official trailer and the fan trailers under study? And what role(s) does user-generated commentary or feedback play for these trailers?

YouTube. 2008. “Days Of Heaven – Trailer (1978).” Last modified April 17, 2008.

YouTube. 2013. “Days of Heaven – Trailer.” Last modified on April 13, 2013.

YouTube. 2013. “Days of Heaven–A Video Essay.” Last modified on October 16, 2013.

Fig. 6: Adams and Shepard photographed in silhouette, with the symbolic farmhouse looming in the background.

Days of (Online) Fan Trailer Heaven

Trailer A opens with an image of Paramount Pictures’ trademark logo. The studio’s signature emblem fades into an image of brooding clouds looming over a wind-worn prairie. Thunder bellows on the soundtrack, and a shot of a bird of prey morphs into a backlit figure of a man standing in the grasslands at sunset. “In 1916, America was changing,” the narrator says in the trailer’s opening seconds. An image of a railcar passing over a bridge fades into a scene of factory workers digging through heaps of coal, followed by another wide frame of an empty sunbaked wheat field. The viewer is then swept into close-ups of the rough-hewn faces of the film’s stars – Gere, Shepard, and Adams – amidst passing railcars and horse-drawn carriages en route to the barren frontier [Fig. 5]. One minute and fifteen seconds into Trailer A, the serene mood and tone of the narrative changes abruptly. The narrator’s voice states that the film is “the story of a man who had nothing…the woman who loved him…and the man who would give her everything for a share of that love” (YouTube 2014e). With these words, the imagery moves away from the thematic scope of the land and its inhabitants and into the romantic dilemma at the heart of the film. A scene in which Gere’s field hand runs from law enforcement on horseback is juxtaposed with a quieter moment of his character embracing Adams in a quiet meadow. The next shot is an extreme close-up of Shepard’s watchful gaze, as if overseeing these scenes from afar.

As the narrative for Trailer A moves towards its conclusion, Adams and Shepard are photographed in silhouette inside the latter’s large estate, while the bedraggled face of Gere’s character peers up at the duo through a windowpane from below. This moment is framed from Gere’s perspective, with the actor and the encompassing field bathed in the deep blues of a Midwestern dusk, suggesting the loneliness his character will face with the coming of night. The film’s title appears over this closing shot, foreshadowing a troubled outcome for the trio. Trailer A presents much of the entire film’s narrative in under two minutes; what begins as a broad glimpse of turn-of-the-century westward expansion in the U.S. evolves into a minor tale of lost love [Fig. 5]. Thematically, the trailer’s primary audiovisual message suggests a heightening of nostalgia for both the American West and the Hollywood of the late 1970s, with the mythic qualities of innocence and utopia highlighted in the cinematography and production design [Fig. 6]. The professionalism of the editing in Trailer A, including the pairing of shots and sequence evolution, provides a seamless story arc. Thus, the inclusion of Paramount’s introductory logo, the ‘70s-era voice-of-God narration, and the production elements suggest that this user-exhibited fan trailer was re-appropriated without revising the original trailer’s content. Therefore, Trailer A is most likely an original trailer for the film repurposed by one or more online fans. Trailer B also provides a visually compelling narrative to signal nostalgia and romanticism for the American West. But here, the viewer is immediately transplanted into the lives of the film’s primary characters without the broader introduction of the land and its inhabitants as witnessed in Trailer A [Fig. 7].

Fig. 7: The film’s use of natural light to emphasize dramatic elements is also highlighted within the trailers.

The opening shot in Trailer B, a striking low-angle image of Gere, Adams, and the younger Manz running to catch a moving train, introduces the film’s predominant family dynamic.[5] Next is a shot of moving railcars topped with migrant travelers that segues into multiple close-ups of these characters’ hardened faces. Already, the viewer is guided toward the themes of travel and migration. The following image shows the Gere, Adams, and Manz trio atop one of the railcars, amidst the masses, fleeing the East for better opportunities. The rest of Trailer B’s running time focuses on the romantic triangle that ensues. Several important elements in Trailer B suggest a greater degree of user repurposing. Manz’s tinny backwoods drawl, taken from the film’s narration, guides the trailer’s audio track for much of the running time, and is backed by a second musical track of delicately plucked guitar strings. In addition, the caption for Trailer B, located just below the video player on YouTube, states, “Bill, Abby, and sis arrive on the panhandle,” a sentiment only marginally correlated with the majority of the trailer’s visual narrative (YouTube 2014f). Another item that speaks to user re-appropriation is the individual shot duration, which moves at a more leisurely pace here, and seems to have been edited mostly to match Manz’s voiceover.

Further suggestive of fan involvement with Trailer B’s content is the abrupt segue from Manz’s voice and the guitar string audio tracks to the ambient sounds of trotting horses and rolling wagon wheels. Visually, the nonprofessional editing is emphasized at this point as well, with a sequence in which Gere’s character is propositioned for work by a land baron, a moment that is abruptly interrupted by a long shot of migrants moving en masse across the prairie. Throughout the two-and-a-half-minute running time for Trailer B, the mood and tone shift in favor of different scenes from the film that drive the trailer towards a questionable conclusion. Marketing and film promotion are immediately evident on the webpage for Trailer C [Fig. 8]. The Criterion Collection offers viewers the option of purchasing several DVD versions of the film, reading a written essay on the film’s historical significance, a list of DVD special features, and links to related films from the company in addition to the trailer.

The trailer itself, however, is constructed from film content not included in Trailers A and B. In this much-abridged version, the guitar audio track preceding Manz’s narration is audibly fragmented and disassociated from any cohesive visible narrative. As such, the film’s primary visual content is made up of close-ups of the nondescript faces of migrants overlooking a land of grazing crows and antelope on the abandoned prairie. Here, Manz’s brief narration serves to introduce the film’s quiet mood and leisurely pacing. The aforementioned scene of Gere interacting with the land baron is cut prematurely in Trailer C, presumably for purposes of keeping the trailer’s length under the running time of one minute. In this version, the scene that introduces the bullhorn-gripping farm owner is interrupted by an establishing crane shot that places the viewer in the midst of migrants scampering towards the opportunity of work. Each of these moments takes up several seconds’ worth of running time, and Criterion’s trailer closes abruptly with a surprising fade-to-black.

Fig. 8: Criterion’s webpage for Days of Heaven offers visitors a number of options to interact with the film.

Whereas the finales of both fan-appropriated trailers on YouTube are classically structured to mirror the resolutions found in many trailers of the 1970s, the transition to a black frame in Trailer C suggests a different kind of closure. The trailer concludes by returning to a still frame of six farmhands standing in awe of an insect downpour, a somewhat iconic image from the famous “locust scene” in the film. This visual placeholder is representative of Criterion’s idyllic version of the film’s significance. As such, this striking still image speaks directly to curating the memory of Days, arguably more so than the totality of the narrative for Trailer C. Although the design of the distributor’s webpage is simultaneously content-heavy and visually arresting, this emblematic still frame stands apart, begging the visitor to click, watch or re-watch and possibly, purchase the film from the distributor.

Feedback on Heaven: The Online Discourse of Cinematic Aesthetics & Nostalgia

The contents of three hundred user-generated message board posts for Trailers A, B, and C were analyzed for this study. Most of this feedback was found to praise Days, with many of the user-posts lauding the film’s cinematography. The discourse on Criterion’s webpage for the film was overwhelmingly positive and found to reflect the distributor’s marketing intentions. “A beautiful spectral and view of the early 1900s mid-western America,” Mike Santoro wrote on the message board. “I love Malick’s brilliant direction in this [film]” (The Criterion Collection 2014b). Other commentators on this webpage used specific discourse that intertwined aspects of their real-world lives with the film’s history and nostalgia. “My first Malick movie, discovered when I was watching every movie on’s ‘101 Movies To See Before You Die,’” Taylor P. stated. Bennett Duckworth wrote, “…thanks Dad for introducing this movie to me.” And mimicking Manz’s drawl in the character’s narration, Arthur Mhoyan said, “There were people sufferin’ in pain and hunger. Some people their tongues were hangin’ out of their mouths” (The Criterion Collection 2014c).

While single-word and somewhat elusive statements, such as “Breathtaking” and “Beautiful,” were found on the Criterion message board as well, much of the feedback was more detailed and descriptive. The lack of negative comments on the message board is further indicative of Criterion’s approach to online publicity and distribution for the film. In turn, the majority of user-feedback for Trailers A and B on YouTube was specifically targeted at the film’s cinematography. Equal parts excitement and praise for the film’s imagery were evident on both message boards, suggesting that the film’s visual approach is endorsed through fan-recall on these video-streaming webpages. For example, the “GregF” channel wrote, “…all 5 [of] Malick’s movies are beautiful but there are no words to describe Days Of Heaven…pure magic.” The “44eelz” channel posted, “i haven’t seen this movie yet but the cinematography looks amazing.” The “ErikHutt” channel added that “[Days] was shot in Alberta,” and the “MrKeepitunderyourhat” channel said, “To be honest, I’d say that the most famous aspect of the entire film is its magic hour cinematography” (YouTube 2014g).

The similarities in the content and tone of the statements analyzed across all three webpages suggest that fans are fond of the film’s historical significance and imagery. The cause-effect nature of this discourse also acts as an effort to keep the film in memory while promoting it to others. The content of this rhetoric also signifies the film’s ability to evoke an era in Hollywood history in which aesthetic power swayed and captivated audience members. In sum, much of this online discourse speaks to how film fans in online spaces curate the myth and nostalgia of aging mainstream film properties. Many of these statements reflect a sincere familiarity with Malick’s production design and the aesthetic properties of the cinematography. The statements under analysis, therefore, speak to the role of message boards in film advertising as well as the intricacies of fan-generated promotional feedback.

Promoting Hollywood Through the Fan Trailers: The Archive in Transit


This article investigated how participatory cultures use fan trailers to engage with aging Hollywood titles in online spaces. The findings suggest that online film fans utilize fan trailers to interact with others while drawing attention to archival film properties. In effect, the findings from this study demonstrate several ways in which trailer repurposing and exhibition on the Internet aids in developing fan support around older motion pictures. An upsurge in fan trailers on the Internet is a burgeoning avenue of marketing for Hollywood studios and film distributors. Through new media platforms, fan trailers have the potential to reach global audiences and encourage social networking and commentary. In this study, the number of fan trailer views and user-generated message board posts was found to play a role in supporting interest in online film content. The audiovisual elements of both fan trailers for this study were generated from existing film content and repurposed to varying degrees. Specifically, the fan-edited trailer content was found to draw attention to the emotive properties of the film text. Collectively, the trailer narratives for this study presented an overwhelmingly favorable image of the case study film, as well as its historical significance and nostalgic qualities. The textual or written discourse analyzed in message boards on the webpages under investigation was found to shape the collective memory of the case study film as well. The content from this portion of the analysis also helped in preserving a positive view of the film itself, with much of the user-generated feedback positioned to promote the film’s cinematography and production design.

The composite findings indicate that fan trailers play an instrumental role in reviving older studio properties. The unintended consequences of these actions suggest a new avenue for media conglomerates and/or film distributors in marketing older motion pictures in the digital era. With Hollywood making fewer “midrange films [with] distinctly American subject matter,” such as Days of Heaven, smaller production companies and independent channels are taking over this once-profitable market (Goldstein 2012). The role(s) taken on by members of participatory cultures, as well as the long-term effects of their interventions in online spaces, remain to be seen. For aging Hollywood films, fan trailers appear to offer one example of a promotional tool for film distribution and archiving. In June 2015, more than 88 million viewers had accessed 107 mock fan trailers through Honest Trailers, the YouTube-hosted channel by Screen Junkies (YouTube 2015). As Erickson (2009b) suggested, “with rapidly evolving technological features and equipment, tomorrow may yield an entirely new approach to using the Internet in a film promotion campaign” (51). As technological advancements in cinema and digital media continue to unfold, new online platforms and Web channels are creating an increasing number of spaces for participatory cultures and motion pictures. While many of these changes are on the horizon, scholars have predicted a continuous stream of content-related interruptions from tech-savvy film fans, as well as an evolution in the blending of virtual selves with cinematic information in cyberspace (e.g., Hansen 2006; Hardt and Negri 2004). Although the art of re-appropriating film content on the Internet has ballooned into a truly mass phenomenon, the future and direction of the fan trailer will depend on the negotiated balance between online cinephiles and digital control of motion picture properties.


Avatar. 2009. Directed by James Cameron. USA: 20th Century Fox.

Bonnie and Clyde. 1967. Directed by Arthur Penn. USA: Warner Brothers.

Days of Heaven. 1978. Directed by Terrence Malick. USA: Paramount Pictures. 

Gone Girl. 2014. Directed by David Fincher. USA: 20th Century Fox.

The Blair Witch Project. 1999. Directed by Daniel Myrick and Eduardo Sanchez. USA: Haxan Films.

The Lord of the Rings: The Fellowship of the Ring. 2001. Directed by Peter Jackson. USA: New Line Cinema.

The Lord of the Rings: The Two Towers. 2002. Directed by Peter Jackson. USA: New Line Cinema.

The Lord of the Rings: The Return of the King. 2003. Directed by Peter Jackson. USA: New Line Cinema.

YouTube. 2008. “Days Of Heaven – Trailer (1978).” Last modified on April 17, 2008.

YouTube. 2009. “Days of Heaven – Terrence Malick (1978).” Last modified on November 9, 2009.

YouTube. 2013. “Days of Heaven – Trailer.” Last modified on April 13, 2013.

YouTube. 2013. “Days of Heaven–A Video Essay.” Last modified on October 16, 2013.

YouTube. 2015. “Honest Trailers.” Accessed February 11, 2015.

[1] Days of Heaven’s 1978 box-office gross was $3.5 million nationwide. Compare this figure to other mainstream studio releases of 1978 that received Oscar attention and critical acclaim, such as Heaven Can Wait ($81.6 million) (Beatty 1978), The Deer Hunter (roughly $49 million) (Cimino 1978), and Midnight Express ($35 million) (Parker 1978) (BoxOfficeMojo 2014).

[2] At the time of this writing, three Malick-directed films are in various stages of development, with his next feature, Knight of Cups, scheduled for wide release in 2015.

[3] The one-hour, thirty-three minute feature film is also available for rent or purchase on YouTube.

[4] Paramount Pictures’ YouTube channel displays fewer than 4,000 posts.

[5] This image is also used near the end of Trailer A, primarily to symbolize the passage of time for migrants moving from urban to rural areas.


Kyle R. McDaniel is a doctoral candidate in the School of Journalism and Communication at the University of Oregon. His research interests include the intersections between American cinema and digital culture in the 21st century. His forthcoming dissertation focuses on the usage and repetition of visual effects in contemporary documentary film.


“Children should play with dead things”: transforming Frankenstein in Tim Burton’s Frankenweenie – Erin Hawley

Abstract: In this paper, I explore the possibility of retelling Mary Shelley’s novel Frankenstein in a children’s media text.  Like most material within the horror genre, Frankenstein is not immediately accessible to children and its key themes and tropes have traditionally been read as articulations of “adult” concerns.  Yet Frankenstein is also a tale with surprisingly child-centric themes.  With this in mind, I consider how the Frankenstein tale has been transformed within the constructed space of a child’s worldview in Tim Burton’s 2012 animated film Frankenweenie.  I argue that the film neither simplifies nor expresses great fidelity to Shelley’s novel, but instead cultivates a sense of curiosity and cultural literacy regarding the Frankenstein tale and the horror genre itself.

Sparky the dog. Frankenweenie (Tim Burton, 2012)


The horror genre has long been considered “off limits” to children.  From the rewriting of fairytales to erase their violent and scary content (Zipes 1993) to the literal defacement of eighteenth century children’s literature to remove traces of the Gothic (Townshend 2008), efforts to disentangle children’s texts from horror have given rise to the notion that children cannot derive the same sort of pleasure from “being scared” that adults can.  Recent scholarship has suggested, however, that children can and do take pleasure in horror material.  In her work on child cinema audiences in Britain, Sarah Smith has found that horror films in the 1930s were “extremely popular with children” due to the “mixed feelings of fear and fun” they evoked (2005, 58).  Writing of James Whale’s film Frankenstein (1931), Smith observes that children were “fascinated by its appeal and attended in droves” (2005, 70).  Similarly, David Buckingham’s research into children as horror viewers reveals that, while fright reactions to horror material can be powerful and long-lived, child audiences also take pleasure in the conventions of the horror text – they enjoy watching “evil destroyed” but also watching it “triumph”; they enjoy the feeling of fear itself and, like adult viewers, find pleasure in horror’s momentary destabilisation of societal norms (1996, 112-116).

The pleasures of horror from a child’s perspective have also been explored by Neil Gaiman (2006), who tells an interesting story about his daughter’s fascination with James Whale’s The Bride of Frankenstein (1935).  “My daughter Maddy loves the idea of The Bride of Frankenstein,” he writes: “she’s ten”.  Such fascination leads to dress-ups and play, and eventually to young Maddy and her friend watching the horror classic under Gaiman’s supervision.  When confronted with the movie itself, however, the enthusiasm wanes: the kids don’t get it.  As Gaiman observes, “They enjoyed it, wriggling and squealing in all the right places. But once it was done, the girls had an identical reaction. ‘Is it over?’ asked one. ‘That was weird,’ said the other, flatly. They were as unsatisfied as an audience could be”.

To some extent, this reaction is not surprising.  The Bride of Frankenstein is based on Mary Shelley’s Gothic novel Frankenstein, a text that – like most material within the horror genre – is usually read as an articulation of decidedly adult concerns.  From the original novel to its more recent manifestations in popular media, the Frankenstein tale is peppered with depictions of violence and violation, murder and misogyny; across the long history of its remaking in popular culture it has been interpreted as a story about genetic manipulation (Waldby 2002, 29), sexual transgression (Mellor 2003, 12-13), and post-partum depression (Johnson 1982, 6), to name just a few of its more adult-centric resonances.

Yet Frankenstein is also a tale with surprisingly child-centric themes.  At its heart, it is a story about what it means to be an outsider and what it means to encounter, experience, and negotiate otherness; these are themes that have more recently been explored by writers of children’s and young adult fiction from Roald Dahl to Stephenie Meyer.  As Barbara Johnson has pointed out, Frankenstein is also essentially a story about parent/child relationships: with its themes of monstrosity and technology, Johnson tells us, Shelley’s novel explores “the love-hate relation we have toward our children” (1982, 6).  Building on Johnson we can suggest that by offering us a glimpse of the world through the monster’s eyes the novel also briefly presents this “love-hate relation” from the child’s perspective, and that decades of Frankenstein movies continue this by offering the misunderstood monster as an icon of all that is unruly, confused, and frightening about childhood itself.

The story Gaiman tells about his daughter’s fascination with The Bride of Frankenstein and her reaction – “that was weird” – to the movie itself is a lovely articulation of the way children may be simultaneously drawn to and locked out of the Frankenstein tale.  It is interesting to note that Gaiman’s daughter and her friend were not frightened by the film or put off by its horror elements (indeed, they seemed to enjoy this aspect of the movie, “wriggling and squealing in all the right places”); instead, it was a certain indefinable strangeness that informed their ultimately “unsatisfied” reaction.  All this suggests that children can engage meaningfully and pleasurably with material in the horror genre, especially if that material is rewritten with a child’s perspective in mind.

In this article, I explore the relationship between Frankenstein and young audiences and consider the possibility of retelling Shelley’s novel in a children’s media text.  My analysis is inspired by the recent appearance of characters from Shelley’s novel and its various adaptations in three children’s animated films: Frankenweenie (Tim Burton, 2012), in which a boy named Victor Frankenstein reanimates his dog Sparky after a tragic car accident; Igor (Anthony Leondis, 2008), in which a hunch-backed laboratory assistant brings a female monster to life; and Hotel Transylvania (Genndy Tartakovsky, 2012), in which the Frankenstein monster and his Bride join Dracula and a host of other characters from the horror genre.  This trend towards engaging the Frankenstein myth in children’s media raises the question: how have such texts made Shelley’s tale accessible to young audiences, and with what degree of success?

Below, I take up this question with specific reference to Tim Burton’s Frankenweenie.  Not only is Burton’s film (as we shall see) the most highly regarded and in some senses the most successful of these three texts, it is also the most complex and arguably does not “dumb down” its source material.  My analysis of Frankenweenie will examine how the film constructs a “child’s eye view” and transforms the Frankenstein tale so that its characters, themes, and narratives make sense within the imagined space of a child’s world.  I will demonstrate that Burton’s film captures the spirit of its source text without necessarily striving for fidelity.  I will also consider some of Frankenweenie’s extra-textual material, exploring how reviews, product tie-ins, and even the film’s intertextual references contribute to its overall project of transforming but not simplifying the Frankenstein tale for children.

Adaptation, simplification, and transformation

Victor and his dog Sparky from Tim Burton's homage to the Frankenstein story, Frankenweenie (2012).


Frankenweenie is a stop-motion animation inspired by Burton’s earlier live-action film of the same name.  Here, the Frankenstein tale is relocated to one of Burton’s characteristic suburbia-scapes (“New Holland”), complete with manicured lawns, hedge sculptures, and monstrously mediocre residents.  Within this new narrative space, “Victor Frankenstein” is a child: a troubled, creative loner who spends his time tinkering in the attic, playing with his beloved dog Sparky, and making movies.  Tragedy enters Victor’s life when Sparky is killed in a car accident.  Inspired by his science teacher, the delightfully dour Mr Rzykruski, Victor steals Sparky’s body from the pet cemetery, drags the corpse back to the family home, and reanimates him in the attic.  When Victor’s classmates learn his secret, they try to replicate the experiment.  Chaos ensues as pets both living and dead are transformed into monsters who descend on New Holland, leading to a climactic showdown at the windmill overlooking the town.

The relationship between Frankenweenie and its source text, Mary Shelley’s Frankenstein, is complex.  Burton’s film both diverges from and intersects with Shelley’s novel, defining itself through patterns of fleeting fidelity and moments of spectacular transformation.  At the same time, the film makes reference to a plethora of other texts both within and beyond the Frankenstein mythos, thereby demonstrating the ways in which “adaptation” approaches and merges with “intertextuality” (see Elliott 2014; Martin 2009; Leitch 2003).  In other words, Frankenweenie is by no means a “faithful” retelling of Shelley’s Frankenstein.  It should be noted, however, that Shelley’s novel – despite being adapted many times in the centuries since its publication, across different media and in different genres – has not tended to inspire fierce fidelity in adapting authors.  As Albert Lavalley points out, Frankenstein tends to be “viewed by the playwright or the screenwriter as a mythic text, an occasion for the writer to let loose his own fantasies or to stage what he feels is dramatically effective, to remain true to the central core of the myth, [but] often to let it interact with fears and tensions of the current time” (1979, 245).

The notion of “fidelity” to an original text as the means of measuring an adaptation’s success, strength, and value has itself been thoroughly contested and problematised in recent years.  Fuelled particularly by the work of adaptation theorists such as Robert Stam (2005), Thomas Leitch (2003, 2007), and Imelda Whelehan (1999), this problematisation of the fidelity model has been an intervention in established ways of thinking about the relationship between an adaptive text and its source material.  As Will Brooker observes, though, fidelity criticism may be “outmoded and discredited within academia” but it has managed to “retain its currency within popular discourse” (2012, 45); in particular, it still informs the critical reception of films that adapt well-known novels or works of literature.  Even within academia, moreover, fidelity criticism has tended to linger in discussions of children’s media texts, particularly when the texts in question are retellings of classic or literary works.  It is often assumed that such adaptations carry some degree of responsibility for encouraging children to read and connect with the source material (Napolitano 2009, 81); in this way, the issue of fidelity becomes more urgent in the context of children’s media.

Concerns about fidelity in children’s adaptations are compounded by the issue of simplification.  Frankenweenie, for instance, is both an adaptation of a literary text and a reworking of classic horror films within the space of a child’s animation: the question we may immediately wish to ask, then, is “what has been lost in this process?”.  Both Shelley’s novel and the films of James Whale are today held in high regard as cultural classics, while the Frankenstein myth itself is a repository of ideas and cultural conversations about selfhood, embodiment, subjectivity, life, and death.  Potentially, the simplification of this myth for children would involve more than just a strategic removal of violent and sexual content in order to achieve a PG rating: it would be a process of dumbing down, a cleaning up of a story that works best when it is not “clean”.  It would also be a form of commercialisation, a reduction of a complex tale so that it can be packaged and marketed to young audiences.

These problems of simplification, commodification, and the dumbing down of source material are frequently mentioned by analysts of children’s adaptations, especially when the adaptation in question is a Disney product (as is Frankenweenie).  Writing in 1965 for the journal Horn Book, Frances Sayers refers to the “sweet” and “saccharine” nature of Disney adaptations and argues that, in order to both address and construct a child or family audience, Disney texts present life as lacking “any conflict except the obvious conflict of violence” (609).  Her concerns have been echoed by Hastings, who writes of the “conscious effort [by Disney] to produce children’s movies with no alarming moral ambiguities” (1993, 84).  Zipes, in turn, laments the way Disney has “‘violated’ the literary genre of the fairytale and packaged his versions in his name through the merchandising of books, toys, clothing, and records” (1995, 38).  Marc Napolitano’s work on the “Disneyfication of Dickens” is particularly relevant here because it explores the intersection between adult literature and children’s media.  Napolitano argues that the films Oliver & Company and The Muppet Christmas Carol – both Disney texts that retell canonical works by Charles Dickens – are “simplified and sanitized adaptations of Dickens that were marketed to families by the Walt Disney Company” (2009, 80).  In Oliver & Company in particular, Napolitano argues, Disney “lightens the material significantly and uses cute, cuddly animal characters, all of whom would be reproduced as stuffed toys, McDonald’s Happy Meal prizes… and countless other types of child-friendly merchandise to market the film to kids” (2009, 82).

All such criticism of Disney’s treatment of literary material is important, and functions as part of a wider interrogation of the seemingly “apolitical” and “critically untouchable” world of children’s animated film (Bell, Haas, and Sells 1995, 2).  As analysts of Disney products, it is essential that we disentangle ourselves from our own enjoyment of the Disney “magic”; this is part of what Zipes has called “Breaking the Disney Spell” (1995).  At the same time, however, claims about “Disneyfication” can be problematic when they make sweeping assumptions about young audiences, their levels of media literacy, and the ways in which they engage with media texts.  In other words, when accusing a children’s text of simplification we ourselves risk making an overly simplified reading of the child audience.  The charge of simplification becomes especially problematic when it lapses into what Semenza (2008) has termed the “dumbing down cliché”: the notion that adapting a literary text for children must always and automatically involve a process of reduction and commodification.

In the context of these concerns, Frankenweenie provides us with an interesting example because it resists the simplification process and simultaneously encourages its young audience to reconnect with the source material through means other than fidelity.  The film’s refusal to “Disneyfy” the Frankenstein tale is signified by the transformation of the Disney logo in the opening sequence: lightning strikes the familiar Disney castle and the picture turns black and white, a suggestion that there will be no fairies or cute, singing animals in the film that follows; that there will be no attempts to render the Frankenstein tale “safe” and “simple” even as it is opened up for young viewers.  While this transformation of the Disney logo is indicative of the mediating presence of Burton in the adaptation process, and of the supposed clash between the Disney and Burton brands, it can also be read as a resistance to simplification – a suggestion that the film will be “Frankenstein for kids” but not “Frankenstein lite”.

In what follows, I explore how Frankenweenie transforms (rather than simplifies) the Frankenstein tale within the imagined space of a child’s world.  I use the term “transformation” with an awareness of its applicability to studies of animation as an art form, a technology, and a mode of representation.  As both Susan Napier (2000) and Paul Wells (1998) have noted, animation has metamorphic qualities that distinguish it from live-action cinema and that manifest at the levels of story, body, and space.  Wells has also argued that the process of adapting a literary text into animated film can involve “an act of literal transformation which carries with it mythic and metaphoric possibility” (2007, 201).  In this way, the idea of transformation allows us to discuss children’s animated films as adaptations without making assumptions about animation as a medium (for instance, that it is inferior to live-action cinema) or about children as audiences (for instance, that they are incapable of understanding textual and intertextual complexities).

Transformation and the child’s eye view

In her analysis of the filmic adaptation of Maurice Sendak’s beloved picture book Where the Wild Things Are, Sarah Annunziato (2014) explores the construction of a “child’s eye view”, arguing that the film – while drawing attention and public comment for its scariness and mature themes – is appropriate for young audiences because it imagines the world as seen through a child’s eyes.  Similar claims can be made of Frankenweenie, which constructs a child’s view of the world and repeatedly invites its viewers to inhabit this childlike space.  In both these films, the creation of a child’s eye view specifically involves moments of scariness rather than excluding them.  The relationship between Frankenstein and Frankenweenie differs, however, from that between the book and film versions of Where the Wild Things Are because it involves a shift from adult to child audience.  In Burton’s film, therefore, employing a child’s perspective allows for a significant rethinking of the original tale.

In its simplest sense, this child’s eye view is visible in the depiction of New Holland and its residents.  While certainly reminiscent of some of Burton’s other visions of suburbia, New Holland is best described as a small town landscape seen through a child’s eyes: a place of long shadows and neat lines, of fantasy and darkness, of strange children and menacing adults.  The frequent use of low camera angles to depict some of these adult characters (such as the mayor Mr Burgemeister and the science teacher Mr Rzykruski) aligns us with Victor and invites us to adopt a child’s perspective.  While not menacing or imposing, Victor’s parents, too, are adults as seen by children: simplistic to the point of caricature, caught up in trivial or meaningless “grown-up” concerns (Victor’s father talks endlessly about his work as a travel agent; Victor’s mother is repeatedly seen vacuuming the house and/or reading romance novels).  On the other hand, the world of Victor (the child’s world) is depicted as complex, detailed, and intricate.  This is best represented by the attic, a cluttered space of creativity, invention, and play – and a notable contrast with the rest of Victor’s house and suburb, which are neat, sparse, and boring.

This inherent difference between adults and children – and the resultant conflict, always seen from the child’s perspective – is central to the plot of Frankenweenie.  From the opening scenes we learn that Victor is misunderstood by his mostly well-meaning parents, who worry that he spends too much time alone and will “turn out weird”.  His father encourages Victor to take up baseball, which leads inadvertently to Sparky’s death: the little dog meets his doom while chasing a ball hit by Victor.  The subsequent depiction of Victor’s grief is highly moving, all the more so because his parents do not seem to understand the extent of his sadness.  His mother offers clichés and platitudes: “If we could bring him back, we would” and “when you lose someone they never really leave you – they just move into a special place in your heart”, which Victor interprets as hollow and macabre (“I don’t want him in my heart”, he objects, “I want him here with me”).  These early scenes suggest a sense of turmoil beneath the calm surface of even the most loving parent-child relationship: a version, perhaps, of the “love-hate relation” that Johnson (1982, 6) detects within Shelley’s visions of monstrosity.  They also reveal that the world seen through a child’s eyes is not a simple place, even though it may be dominated by fantasies (such as the desire to bring Sparky back to life, which Victor soon fulfills).

It is through these early depictions of conflict, death, and grief that the film captures the thematic spirit of Mary Shelley’s novel.  In Shelley’s text, Victor Frankenstein is driven to create his monster by a desire to suspend mortality and escape the horrors of death and decay: Shelley’s Victor is both haunted and inspired by the death of his mother, Caroline, which leads him to seek out scientific means of “renew[ing] life where death had apparently devoted the body to corruption” (Shelley 1993, 43).  For children, the death of a pet is often a first experience of mortality; thus in Frankenweenie it is the death of the dog, Sparky, that allows Victor to confront the notion of perishability that so horrifies his predecessor in Shelley’s novel.  This experience of death and perishability also precipitates the story events and initiates the move into the horror genre by inspiring Victor’s act of monster-making.

The scene in which Victor reanimates Sparky provides Burton and his team with much opportunity to revel in horror movie history and to pay homage to the films of James Whale, particularly Frankenstein (1931) and its sequel, The Bride of Frankenstein (1935).  Lightning flashes and thunder crashes as Victor sews Sparky’s body back together and fixes bolts to his neck; the body is then covered by a sheet and raised through the roof to receive the life-giving electric charge.  Yet here, too, the child’s eye view is at work.  Attentive viewers will notice that Sparky’s body is laid out on an ironing board, and that toys, appliances, and other household objects form part of the elaborate life-giving apparatus.  Signifiers of “childhood” and “ordinariness” are thus interwoven with the signifiers of life, creation, and monstrosity borrowed from Whale.  Instead of fingers twitching and eyes opening, Sparky’s “alive-ness” is signified by a wagging tail; and instead of proclaiming “It’s alive!” like his predecessor in the Whale films, young Victor Frankenstein says “You’re alive”.  This shift in language reveals that the monster has been created according to a child’s desires and wishes: the moment of creation is framed by Victor’s desire not only for Sparky to still be alive but for the friendship, happiness, and unconditional love that a pet often represents.  Accordingly, the child views the monster as a friend and companion (you) rather than as the product of an experiment (it).

This transformation of Frankenstein to suit a child’s perspective certainly involves a degree of softening, a removal of some aspects of violence and conflict that define the original tale.  For instance, Shelley’s novel and most of its adaptations are constructed around the conflict between monster and maker – this conflict is not present in Frankenweenie.  As we would expect given the film’s target audience, Burton and his screenwriter John August also de-sexualise the Frankenstein tale: another notable absence is Shelley’s sub-plot involving the creation of a mate for the monster, and the resultant murder of Victor’s bride Elizabeth on their wedding night.  This does not mean, however, that Frankenweenie shies away from an exploration of monstrosity and horror.  Indeed, while Sparky himself is not depicted as a true monster, the film is replete with images of monstrosity.  These come particularly in the form of the creatures that Victor’s classmates bring to life: pets and other icons of familiarity, domesticity, and innocence (sea monkeys, a fluffy white cat, a dead hamster) who become snarling, terrifying, rampaging beasts.  The image of these monsters running amok through the fairground of New Holland encapsulates the film’s transformation of its source material.  This scene is only tenuously connected to Shelley’s plot, yet it resounds with Frankensteinian questions and dilemmas, particularly as they might be understood by children: When you have created your monster, what are you going to do with him/her/it?  And what happens if your monster (your game, your story) escapes your control?  While exploring the lighter side of monster-making, then, the film also explores the darker side of play, re-interpreting the Frankensteinian themes of creativity, perishability, and the life/death boundary so that they are seen from a child’s perspective.

Paratexts, intertexts, and the complex world of Frankenweenie

From Frankenweenie: an Electrifying Book. One of the many examples of the process of stop-motion animation and the making of Sparky.

While not a notable box office success, Frankenweenie received a generally positive critical reception.  The film is rated highly – at 87% – on the aggregate review website Rotten Tomatoes, and is frequently described by reviewers as an enjoyable product for both children and adults (see, for instance, Paatsch 2012; Chang 2012; Mazmanian 2012).  Occasionally, charges of simplification are levelled at the film: Peter Bradshaw in The Guardian describes Frankenweenie as “a sentimental kind of retro gothic lite, appearing under the Disney banner” (2012), while A.O. Scott in the New York Times writes that “the movie, a Walt Disney release, also feels tame and compromised” (2012).  Other reviewers found the film dark enough to be entertaining, with many making positive mention of Burton’s ability to balance the sweetness of a children’s story with the darkness of a horror film.  Leigh Paatsch in the Herald Sun, for instance, commends the film for “deftly balancing blatant eeriness with a chipper cheeriness that excuses many a macabre event” (2012).  Lou Lumenick in the New York Post praises the film for its “creepy but basically sweet humor” (2012), as does Matthew Bond in the Daily Mail Australia who describes it as “strange, but also touching and lovely” (2012).  In Time magazine, Richard Corliss addresses the film’s boundary-crossing quality when he notes approvingly that “Frankenweenie’s message to the young” is that “children should play with dead things” (2012).

This positive reception sets Burton’s film apart from other recent children’s films that play with horror tropes and characters, such as the aforementioned Hotel Transylvania and Igor, both of which received lukewarm reviews.  Hotel Transylvania in particular was frequently criticised for its shallow approach to the narratives it draws upon, including Shelley’s Frankenstein (see, for instance, Reynolds 2012; Collin 2012).  L. Kent Wolgamott in the Lincoln Journal Star (2012) observes that while Frankenweenie did not perform as well at the box office as Hotel Transylvania, it is “by far, the superior film” (and he contextualises this comment by urging readers not to “consider box-office returns to be the only measure of a film’s success”, adding that with Frankenweenie Burton has created a “masterpiece”).

Some reviews of Frankenweenie mention the construction of a child’s eye view.  Adam Mazmanian in The Washington Times, for instance, identifies this as the means by which the film “draw[s] in young audiences”, adding that its “knowing winks at horror-movie history will appeal to grown-ups” (2012).  It is interesting that Mazmanian feels the need to separate the film’s audience into these two distinct categories, and that he distinguishes the “adult” and “child” sections of the audience by an ability (or lack thereof) to “get” the film’s intertextual references.  Wolgamott takes this further, praising the film for its references to classic horror movies but adding “that’s not anything the preschool through middle school animation crowd is going to get, or could possibly care about” (2012).  Both critics agree that intertextuality is a means by which Frankenweenie resists simplification and becomes something more than a light and fluffy children’s film.  At the same time, both critics produce distinct readings of the film’s child and adult audiences, and locate the qualities of media literacy and cultural awareness (which might enable the decoding of the film’s intertextuality) squarely within the adult space.

It is certainly true that Frankenweenie is littered with intertextual references: to other texts in the Frankenstein mythos (particularly the films of James Whale), to films in Burton’s oeuvre (such as Edward Scissorhands), and to texts in the horror genre more broadly (such as the Japanese monster movie Gamera).  This is coupled with a playful self-reflexivity that we often see in filmic adaptations of Frankenstein.  As Esther Schor (2003) has pointed out, adaptations of Shelley’s novel – from the early stage productions to the first known Frankenstein film in 1910, and beyond – often depict the monster’s coming-to-life in a spectacular and self-referential way; most filmic versions, in particular, play upon what William Nestrick (1979, 292) has termed the “myth of animation” – a thematic link or bridge between the Frankenstein tale and cinema’s own powers to bring a still image, body, or scene to life.  Frankenweenie, of course, is an animated film, and this brings new meaning to Nestrick’s “myth”.  The technologies of movie-making and, specifically, stop-motion animation are spectacularised in the image of Sparky’s coming-to-life, adding another layer of intertextuality to a film already rich with cultural references.

It may be tempting to assume, as do Mazmanian and Wolgamott, that children are excluded from this intertextual conversation.  Indeed, it has become increasingly common for children’s films to engage in a dual mode of address, enchanting children with stories, songs, and imagery while offering jokes, intertextual references, or clever moments of self-awareness to adults.  The implication is that long-suffering parents should be rewarded for watching films with their children or otherwise lured into the watching process by the promise of adult-centric entertainment.  Burton’s film is somewhat different because the intertextual references are closely bound to the narrative – they are less an amusing aside for adults than part of the film’s very fabric.  They are also, potentially, a means of encouraging audiences to connect with the source material.  While Frankenweenie does not openly strive to generate reverence for (or even awareness of) Mary Shelley, her novel, and the act of reading Frankenstein, it arguably promotes a more complex form of literacy that speaks directly to the process of adaptation itself.  By referencing the movies of James Whale, in particular, Burton positions his film within a web of Frankenstein texts and also destabilises the primacy of Shelley’s novel as source text: adapting Frankenstein, we are told, is a complex business that involves the engagement with already apparent intertextuality rather than the “recovery” of a single source text from out of the depths of adaptation history.

It is likely, furthermore, that many of the children who constitute Frankenweenie’s primary audience are able to decode the film’s intertextual references due to their familiarity with the horror genre and its tropes, characters, and conventions.  As noted above, children have traditionally been locked out of the horror genre; in recent years, however, encounters between young audiences and horror have been initiated through a plethora of child-friendly horror texts: as well as Hotel Transylvania and Igor, these include the films ParaNorman (Sam Fell and Chris Butler, 2012) and Monsters vs Aliens (Rob Letterman and Conrad Vernon, 2009), the video game Plants vs Zombies (PopCap Games, 2009), the books and television series Grossology (Sylvia Branzei, 1992-1997; Nelvana Limited, 2006-2009), and Chris Riddell’s Goth Girl books (2013-2015), as well as older but still relevant texts such as The Simpsons (which frequently lampoons the genre through its “Treehouse of Horror” episodes).  Meanwhile, imagery and tropes from the Frankenstein tale have been so pervasively circulated in popular culture for such a long time that children are likely to have some degree of familiarity with the tale even if they do not connect it to its original source.  Indeed, it is not uncommon for children’s texts to make passing reference to the tale and its characters (for instance, an episode of the cartoon Spongebob Squarepants is entitled “Frankendoodle”, while Dav Pilkey’s Captain Underpants books contain a character named “Frankenbooger”).

It is also likely that children today are familiar with complex levels of intertextuality and are adept at negotiating intersecting currents of media; thus Cathlena Martin writes of the “overlapping intertextual nature of children’s culture” (2009, 86).  In her analysis of the transmedia adaptation of the novel Charlotte’s Web, Martin claims that an enjoyment and understanding of intertextuality may come more naturally to today’s children, who “experience transmedia stories on a regular basis” and therefore “no longer view the printed text as the only way to experience [a literary classic such as] Charlotte’s Web”, whereas adults are more likely to “resist multi-media adaptation, relying on the supremacy of print text as ‘high art’” (2009, 88).  This returns us to the concept of “fidelity” to an original text, and suggests that in discussions of adaptation for children fidelity is likely to be a concept imposed by adult readers and critics rather than something inherently understood or valued by children.  If this raises concerns over the disappearance of “the book” as a cultural object, it also demonstrates that “simplicity” is not a concept that sits well with the highly interconnected, transmedia quality of children’s culture today.

The promotional release of the free Frankenweenie: an Electrifying Book. The e-book explores the production of Frankenweenie: readers are given access to production photographs, original artwork, and interviews. This is a promotional mock poster for a film titled "Return of the Vampire Cat".

Interestingly, the promotional material for Frankenweenie played upon this ability in young audiences to understand and enjoy intertextuality.  Elliott reminds us that “[t]ie-in merchandise produces and distributes the culture of Disney beyond the cinema” (2014, 195); yet the marketing campaign for Frankenweenie took a very different route from the usual toys, games, and Happy Meals associated with Disney and with the process of Disneyfication.  Instead, the film was promoted through such unusual means as the release of six mock B-movie posters each featuring one of the child characters together with the monster he/she creates (including Night of the Were-Rat: a Tale of Terror featuring “Edgar E. Gore” and Return of the Vampire Cat featuring “Weird Girl”).  These promotional texts not only foreground the film’s child protagonists (as opposed to its adult characters) but serve to locate “childhood” within the parameters of the horror genre and within monster-movie history.  If entryway paratexts guide and instruct our viewing of a media text, as Gray (2010) suggests, these posters invite us to connect childhood with monstrosity in a way that “preps” us for the viewing of Frankenweenie itself (whether we are adults or children).  They also underscore the overall playfulness of the film and relatedly its resistance to the processes of Disneyfication and simplification.  Due to the foregrounding of the child characters, furthermore, the posters specifically address child audiences and clearly include them in the film’s intertextual conversation.

Another key aspect of the film’s promotion was the release of a free e-book entitled Frankenweenie: an Electrifying Book.  Designed for audiences of all ages, the e-book explores the production of Frankenweenie: readers are given access to production photographs, original artwork, and interviews, with particular emphasis on the process of stop-motion animation and the making of Sparky (who we can view as a sketch, a 3-D model, and a finished “product”).  In this way, the e-book allows children access to Nestrick’s “myth of animation” and to the idea of animation as a “bridge” between the narrative and the technology of Frankenweenie.  The e-book also makes the film’s intertextuality more evident.  It begins, for instance, with a foreword by actor Martin Landau accompanied by an image of the character he voices (Mr Rzykruski); Landau discusses his previous collaboration with Tim Burton, the film Ed Wood, and his role in this film as Bela Lugosi, star of the horror classic Dracula (Tod Browning, 1931).  A pop-up button informs us that Landau’s character is also “a nod to Vincent Price, the late actor known for his iconic roles in various horror films” (Disney Book Group, 2012).  The e-book thus enables or enhances the ability of any audience member (including children) to decode the film’s intertextual references – and even, arguably, leads young audiences back to the various source texts that inspired Frankenweenie.

In this way, the film (together with its promotional material) both assumes and encourages a level of cultural literacy regarding the Frankenstein tale and, more broadly, the horror genre itself.  As this analysis has demonstrated, the film’s intertextuality works together with its paratexts to cultivate an awareness of what lies beyond its own textual boundaries.  Frankenweenie thus imagines and constructs its audience to be a media-literate and curious child.


Prior to the release of Tim Burton’s Frankenweenie, the thought of an animated film based on Mary Shelley’s Frankenstein and released under the Disney banner might have horrified literary purists and fans of horror cinema alike.  An animated Frankenstein, in which darkness and moral conflict are replaced by cute animal side-kicks and catchy songs, may well have been taken as a sign of Disney’s cultural domination and its ability not just to appropriate literary material but to colonise sites of literary and cultural meaning.  Burton’s film, however, demonstrates that “Disneyfication” is not the only route to adapting a literary classic for children, and that the transformation of such a tale within the space of a child’s worldview need not involve a simplification process.  As noted above, we can contextualise Frankenweenie within a recent trend in media and popular culture that has seen the horror genre re-imagined for young audiences; yet Burton’s film can be read not just as an example of “horror for kids” but as a startlingly successful transformation of a previously inaccessible tale in line with the concerns that define a child’s world.  Importantly, Frankenweenie’s most powerful images are not cartoonish renditions of monsters and mad scientists – they are the images of Victor grieving for Sparky, and of the neighbourhood kids struggling to control the monsters they have unleashed.  These themes of loss, and of losing control, are central to the film’s re-imagining of a classic horror tale according to a child’s eye view.  In this way, Frankenweenie makes Frankenstein accessible to children and also gives adult viewers a sense of what horror, otherness, and monstrosity could mean to a child.


Erin Hawley teaches in the Journalism, Media, and Communications program at the University of Tasmania.  Her current research interests include children's media culture, adaptation, and media education.

Read, Watch, Listen: A commentary on eye tracking and moving images – Tim J. Smith


Eye tracking is a research tool that has great potential for advancing our understanding of how we watch movies. Questions such as how differences in the movie influence where we look and how individual differences between viewers alter what we see can be operationalised and empirically tested using a variety of eye tracking measures. This special issue collects together an inspiring interdisciplinary range of opinions on what eye tracking can (and cannot) bring to film and television studies and practice. In this article I will reflect on each of these contributions with specific focus on three aspects: how subtitling and digital effects can reinvigorate visual attention, how audio can guide and alter our visual experience of film, and how methodological, theoretical and statistical considerations are paramount when trying to derive conclusions from eye tracking data.



I have been obsessed with how people watch movies since I was a child. All you have to do is turn and look at an audience member’s face at the movies or at home in front of the TV to see the power the medium holds over them. We sit enraptured, transfixed and immersed in the sensory patterns of light and sound projected back at us from the screen. As our physical activity diminishes our mental activity takes over. We piece together minimal audiovisual cues to perceive rich otherworldly spaces, believable characters and complex narratives that engage us mentally and move us emotionally. As I progressed through my education in Cognitive Science and Psychology I was struck by how little science understood about cinema and the mechanisms filmmakers used to create this powerful experience.[i] Reading the film literature, listening to filmmakers discuss their craft and excavating gems of their craft knowledge, I started to realise that film was a medium ripe for psychological investigation. The empirical study of film would further our understanding of how films work and how we experience them, but it would also serve as a test bed for investigating complex aspects of real-world cognition that were often considered beyond the realms of experimentation. As I (Smith, Levin & Cutting, 2010) and others (Anderson, 2006) have argued elsewhere, film evolved to “piggy back” normal cognitive development and use basic cognitive tendencies such as attentional preferences, theory of mind, empathy and narrative structuring of memory to make the perception of film as enjoyable and effortless as possible. By investigating film cognition we can, in turn, advance our understanding of general cognition. But to do so we need to step outside of traditional disciplinary boundaries concerning the study of film and approach the topic from an interdisciplinary perspective. This special issue represents a highly commendable attempt to do just that.

By bringing together psychologists, film theorists, philosophers, vision scientists, neuroscientists and screenwriters this special issue (and the Melbourne research group that most contributors belong to) provides a unique perspective on film viewing. The authors included in this special issue share my passion for understanding the relationship between viewers and film but this interest manifests in very different ways depending on their perspectives (see Redmond, Sita, and Vincs, this issue, for a personal journey into eye tracking similar to that presented above). By focussing on viewer eye movements the articles in this special issue provide readers from a range of disciplines a way into the eye tracking investigation of film viewing. Eye tracking (as comprehensively introduced and discussed by Dyer and Pink, this issue) is a powerful tool for quantifying a viewer’s experience of a film, comparing viewing behaviour across different viewing conditions and groups as well as testing hypotheses about how certain cinematic techniques impact where we look. But, as is rightly highlighted by several of the authors in this special issue, eye tracking is not a panacea for all questions about film spectatorship.

Like all experimental techniques it can only measure a limited range of psychological states and behaviours and the data it produces does not say anything in and of itself. Data requires interpretation. Interpretation can take many forms[ii] but if conclusions are to be drawn about how the data relates to psychological states of the viewer this interpretation must be based on theories of psychology and ideally confirmed using secondary/supporting measures. For example, the affective experience of a movie is a critical aspect which cognitive approaches to film are often wrongly accused of ignoring. Although cognitive approaches to film often focus on how we comprehend narratives (Magliano and Zacks, 2011), attend to the image (Smith, 2013) or follow formal patterns within a film (Cutting, DeLong and Nothelfer, 2010), several cognitivists have focussed in depth on emotional aspects (see the work of Carl Plantinga, Torben Grodal or Murray Smith). Eye tracking is the perfect tool for investigating the impact of immediate audiovisual information on visual attention but it is less suitable for measuring viewer affect. Psychophysiological measures such as heart rate and skin conductance, neuroimaging methods such as fMRI or EEG, or even self-report ratings may be better for capturing a viewer’s emotional responses to a film, as has been demonstrated by several research teams (Suckfull, 2000; Raz et al., 2014). Unless the emotional state of the viewer changes where they look or how quickly they move their eyes, the eye tracker may not detect any differences between two viewers with different emotional states.[iii]

As such, a researcher interested in studying the emotional impact of a film should either choose a different measurement technique or combine eye tracking with another more suitable technique (Dyer and Pink, this issue). This does not mean that eye tracking is unsuitable for studying the cinematic experience. It simply means that you should always choose the right tool for the job and often this means combining multiple tools that are strong in different ways. As Murray Smith (the current President of the Society for Cognitive Studies of the Moving Image; SCSMI) has argued, a fully rounded investigation of the cinematic experience requires “triangulation” through the combination of multiple perspectives including psychological, neuroscientific and phenomenological/philosophical theory and methods (Smith, 2011) – an approach taken proudly across this special issue.

For the remainder of my commentary I would like to focus on certain themes that struck me as most personally relevant and interesting when reading the other articles in this special issue. This is by no means an exhaustive list of the themes raised by the other articles or even an assessment of the importance of the particular themes I chose to select. There are many other interesting observations made in the articles that I do not focus on below, but given my perspective as a cognitive scientist and my current interests I decided to focus my commentary on these specific themes rather than make a comprehensive review of the special issue or tackle topics I am unqualified to comment on. Also, I wanted to take the opportunity to dispel some common misconceptions about eye tracking (see the section ‘Listening to the data’) and empirical methods in general.

Reading an image

One area of film cognition that has received considerable empirical investigation is subtitling. As Kruger, Szarkowska and Krejtz (this issue) comprehensively review, eye tracking is, in their view and mine, the perfect tool for investigating how we watch subtitled films. The presentation of subtitles divides the film viewing experience into a dual task: reading and watching. Given that the medium was originally designed to communicate critical information through two channels, the image and the soundtrack, introducing text as a third channel of communication places extra demands on the viewer’s visual system. However, for most competent readers serially shifting attention between these two tasks does not lead to difficulties in comprehension (Kruger, Szarkowska and Krejtz, this issue). Immediately following the presentation of the subtitles gaze will shift to the beginning of the text, saccade across the text and return to the centre of interest within a couple of seconds. Gaze heatmaps comparing the same scenes with and without subtitles (Kruger, Szarkowska and Krejtz, this issue; Fig. 3) show that the areas of the image fixated are very similar (ignoring the area of the screen occupied by the subtitles themselves) and, rather than distracting from the visual content, the presence of subtitles seems to actually condense the gaze behaviour on the areas of central interest in an image, e.g. faces and the centre of the image. This illustrates the redundancy of a lot of the visual information presented in films and the fact that under non-subtitle conditions viewers rarely explore the periphery of the image (Smith, 2013).

My colleague Anna Vilaró and I recently demonstrated this similarity in an eye tracking study in which the gaze behaviour of viewers was compared across four versions of an animated film, Disney’s Bolt (Howard & Williams, 2008): the original English audio version, a Spanish language version with English subtitles, an English language version with Spanish subtitles and a Spanish language version without subtitles (Vilaró & Smith, 2011). Given that our participants were English speakers who did not know Spanish, these conditions allowed us to investigate not only where they looked under the different audio and subtitle conditions but also what they comprehended. Using cued recall tests of memory for verbal and visual content we found no significant differences in recall for either type of content across the viewing conditions except for verbal recall in the Spanish-only condition (not surprisingly given that our English participants couldn’t understand the Spanish dialogue). Analysis of the gaze behaviour showed clear evidence of subtitle reading, even in the Spanish subtitle condition (see Figure 1) but no differences in the degree to which peripheral objects were explored. This indicates that even when participants are watching film sequences without subtitles and know that their memory will be tested for the visual content their gaze still remains focussed on central features of a traditionally composed film. This supports arguments for subtitling movies over dubbing: whilst subtitling places greater demands on viewer gaze and imposes a heightened cognitive load, there is no evidence that it leads to poorer comprehension.

Figure 1: Figure from Vilaró & Smith (2011) showing the gaze behaviour of multiple viewers directed to own language subtitles (A) and foreign language/uninterpretable subtitles (B).

The high degree of attentional synchrony (Smith and Mital, 2013) observed in the above experiment and during most film sequences indicates that all visual features in the image and areas of semantic significance (e.g. social information and objects relevant to the narrative) tend to point to the same part of the image (Mital, Smith, Hill and Henderson, 2011). Only when areas of the image are placed in conflict through image composition (e.g. depth of field, lighting, colour or motion contrast) or staging (e.g. multiple actors) does attentional synchrony break down and viewer gaze divide between multiple locations. Such shots are relatively rare in mainstream Hollywood cinema or TV (Salt, 2009; Smith, 2013) and when they are used the depicted action tends to be highly choreographed so that attention shifts between the multiple centres of interest in a predictable fashion (Smith, 2012). If such choreographing of action is not used the viewer can quickly exhaust the information in the image and start craving either new action or a cut to a new shot.

Hochberg and Brooks (1978) referred to this as the visual momentum of the image: the pace at which visual information is acquired. This momentum is directly observable in the saccadic behaviour during an image’s presentation, with frequent short duration fixations at the beginning of a scene’s presentation interspersed by large amplitude saccades (known as the ambient phase of viewing; Velichkovsky, Dornhoefer, Pannasch and Unema, 2000) and less frequent, longer duration fixations separated by smaller amplitude saccades as the presentation duration increases (known as the focal phase of viewing; Velichkovsky et al., 2000). I have recently demonstrated the same pattern of fixations during viewing of dynamic scenes (Smith and Mital, 2013) and shown how this pattern gives rise to more central fixations at shot onset and greater exploration of the image and decreased attentional synchrony as the shot duration increases (Mital, Smith, Hill and Henderson, 2011). Interestingly, the introduction of subtitles to a movie may have the unintended consequence of sustaining visual momentum throughout a shot. The viewer is less likely to exhaust the information in the image because their eyes are busy saccading across the text to acquire the information that would otherwise be presented in parallel to the image via the soundtrack. This increased saccadic activity may increase the cognitive load experienced by viewers of subtitled films and change their affective experience, producing greater arousal and an increased sense of pace.
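
The ambient/focal distinction described above can be made concrete with a short sketch. The Python below is illustrative only and is not the analysis used in the cited studies: the `Fixation` record, the function names, and the two thresholds (180 ms and 5 degrees of visual angle) are assumptions chosen for the example. It labels each fixation by its duration and by the amplitude of the saccade that follows it, then reports the share of ambient fixations within a time window after shot onset.

```python
from dataclasses import dataclass
from typing import List

# Illustrative thresholds only (assumptions, not values taken from the text).
SHORT_FIXATION_MS = 180.0   # fixations shorter than this count as "short"
LARGE_SACCADE_DEG = 5.0     # saccades larger than this count as "large"


@dataclass
class Fixation:
    onset_ms: float          # time since shot onset
    duration_ms: float       # how long the eye stayed still
    next_saccade_deg: float  # amplitude of the saccade that follows


def classify(fix: Fixation) -> str:
    """Label a fixation as ambient, focal, or mixed."""
    short = fix.duration_ms < SHORT_FIXATION_MS
    large = fix.next_saccade_deg > LARGE_SACCADE_DEG
    if short and large:
        return "ambient"   # brief fixation followed by a big jump
    if not short and not large:
        return "focal"     # long fixation followed by a small jump
    return "mixed"         # any other combination


def ambient_share(fixations: List[Fixation], start_ms: float, end_ms: float) -> float:
    """Proportion of ambient fixations with onset in [start_ms, end_ms)."""
    window = [f for f in fixations if start_ms <= f.onset_ms < end_ms]
    if not window:
        return 0.0
    return sum(classify(f) == "ambient" for f in window) / len(window)
```

If the pattern described above holds, the ambient share computed on real data should be higher shortly after shot onset than later in the shot. Any thresholds would need to be justified for the particular eye tracker and stimulus.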

For some filmmakers and producers of dynamic visual media, increasing the visual momentum of an image sequence may be desirable as it maintains interest and attention on the screen (e.g. Michael Bay’s use of rapidly edited extreme close-ups and intense camera movements in the Transformers movies). In this modern age, multiple screens fight for our attention when we are consuming moving images (e.g. mobile phones and computer screens in our living rooms and even, sadly, increasingly at the cinema). If the designers of this media are to ensure that our visual attention is focussed on their screen over the other competing screens they need to design the visual display in a way that makes comprehension impossible without visual attention. Feature films and television dramas often rely heavily on dialogue for narrative communication and the information communicated through the image may be of secondary narrative importance to the dialogue, so viewers can generally follow the story just by listening to the film rather than watching it. If producers of dynamic visual media are to draw visual attention back to the screen and away from secondary devices they need to increase the ratio of visual to verbal information. A simple way of accomplishing this is to present the critical audio information through subtitling. The more visually attentive mode of viewing afforded by watching subtitled film and TV may partly explain the growing interest in foreign TV series (at least in the UK), as seen in the popularity of Nordic Noir series such as The Bridge (2011) and The Killing (2007).

Another way of drawing attention back to the screen is to constantly “refresh” the visual content of the image by either increasing the editing rate or creatively using digital composition.[iv] The latter technique is wonderfully exploited by Sherlock (2010) as discussed brilliantly by Dwyer (this issue). Sherlock contemporised the detective techniques of Sherlock Holmes and John Watson by incorporating modern technologies such as the Internet and mobile phones and simultaneously updated the visual narrative techniques used to portray this information by using digital composition to playfully superimpose this information onto the photographic image. In a similar way to how the sudden appearance of traditional subtitles involuntarily captures visual attention and draws our eyes down to the start of the text, the digital inserts used in Sherlock overtly capture our eyes and encourage reading within the viewing of the image.

If Dwyer (this issue) had eye tracked viewers watching these excerpts she would likely have observed this interesting shifting between phases of reading and dynamic scene perception. Given that the appearance of the digital inserts produces sudden visual transients and that the inserts are highly incongruous with the visual features of the background scene, they are likely to involuntarily attract attention (Mital, Smith, Hill & Henderson, 2011). As such, they can be creatively used to reinvigorate the pace of viewing and strategically direct visual attention to parts of the image away from the screen centre. Traditionally, the same content may have been presented either verbally as narration or heavy-handed dialogue exposition (e.g. “Oh my! I have just received a text message stating….”) or as a slow and laboured cut to a close-up of the actual mobile phone so we can read it from the perspective of the character. Neither takes full advantage of the communicative potential of the whole screen space or our ability to rapidly attend to and comprehend visual information and audio information in parallel.

Such intermixing of text, digital inserts and filmed footage is common in advertisements, music videos, and documentaries (see Figure 2) but is still surprisingly rare in mainstream Western film and TV. Short-form audiovisual messages have recently experienced a massive increase in popularity due to the internet and direct streaming to smartphones and mobile devices. To maximise their communicative potential and increase their likelihood of being “shared” these videos use all audiovisual tricks available to them. Text, animations, digital effects, audio and classic filmed footage all mix together on the screen, packing every frame with as much info as possible (Figure 2), essentially maximising the visual momentum of each video and maintaining interest for as long as possible.[v] Such videos are so effective at grabbing attention and delivering satisfying/entertaining/informative experiences in a short period of time that they often compete directly with TV and film for our attention. Once we click play, the audiovisual bombardment ensures that our attention remains latched on to the second screen (i.e., the tablet or smartphone) for its duration and away from the primary screen, i.e., the TV set. Whilst distressing for producers of TV and Film who wish our experience of their material to be undistracted, the ease with which we pick up a handheld device and seek other stimulation in parallel to the primary experience may indicate that the primary material does not require our full attention for us to follow what is going on. As attention has a natural ebb-and-flow (Cutting, DeLong and Nothelfer, 2010) and “There is no such thing as voluntary attention sustained for more than a few seconds at a time” (p. 421; James, 1890), if modern producers of Film and TV want to maintain a high level of audience attention and ensure it is directed to the screen they must either rely on viewer self-discipline to inhibit distraction, reward attention to the screen with rich and nuanced visual information (as fans of “slow cinema” would argue of films like those of Bela Tarr) or utilise the full range of postproduction effects to keep visual interest high and maintained on the image, as Sherlock so masterfully demonstrates.

Figure 2: Gaze Heatmaps of participants’ free-viewing a trailer for Lego Indiana Jones computer game (left column) and the Video Republic documentary (right column). Notice how both make copious use of text within the image, as intertitles and as extra sources of information in the image (such as the head-up display in A3). Data and images were taken from the Dynamic Images and Eye Movement project (DIEM; Mital, Smith, Hill & Henderson, 2010). Videos can be found here ( and here (

A number of modern filmmakers are beginning to experiment with the language of visual storytelling by questioning our assumptions of how we perceive moving images. At the forefront of this movement are Ang Lee and Andy and Lana Wachowski. In Ang Lee’s Hulk (2003), Lee worked very closely with editor Tim Squyres to use non-linear digital editing and after effects to break apart the traditional frame and shot boundaries and create an approximation of a comic book style within film. This chaotic, unpredictable style polarised viewers and was partly blamed for the film’s poor reception. However, it cannot be argued that this experiment was wholly unsuccessful. Several sequences within the film used multiple frames, split screens, and digital transformation of images to increase the number of centres of interest on the screen and, as a consequence, increase the pace of viewing and the arousal experienced by viewers. In the sequence depicted below (Figure 3) two parallel scenes depicting Hulk’s escape from a containment chamber (A1) and this action being watched from a control room by General Ross (B1) were presented simultaneously by presenting elements of both scenes on the screen at the same time. Instead of using a point of view (POV) shot to show Ross looking off screen (known as the glance shot; Branigan, 1984) followed by a cut to what he was looking at (the object shot), both shots were combined into one image (F1 and F2) with the latter shot sliding in from behind Ross’ head (E2). These digital inserts float within the frame, often gliding behind objects or suddenly enlarging to fill the screen (A2-B2). Such visual activity and use of shots-within-shots makes viewer gaze highly active (notice how the gaze heatmap is rarely clustered in one place; Figure 3).
Note that this method of embedding a POV object shot within a glance shot is similar to Sherlock’s method of displaying text messages as both the glance, i.e., Watson looking at his phone, and the object, i.e., the message, are shown in one image. Both uses take full advantage of our ability to rapidly switch from watching action to reading text without having to wait for a cut to give us the information.

Figure 3: Gaze heatmap of eight participants watching a series of shots and digital inserts from Hulk (Ang Lee, 2003). Full heatmap video is available at

Similar techniques have been used in Andy and Lana Wachowski’s films, most audaciously in Speed Racer (2008). Interestingly, both sets of filmmakers seem to intuitively understand that packing an image with as much visual and textual information as possible can lead to viewer fatigue and so they limit such intense periods to only a few minutes and separate them with more traditionally composed sequences (typically shot/reverse-shot dialogue sequences). These filmmakers have also demonstrated similar respect for viewer attention and the difficulty in actively locating and encoding visual information in a complex visual composition in their more recent 3D movies. Ang Lee’s Life of Pi (2012) uses the visual volume created by stereoscopic presentation to its full potential. Characters inhabit layers within the volume as foreground and background objects fluidly slide around each other within this space. The lessons Lee and his editor Tim Squyres learned on Hulk (2003) clearly informed the decisions they made when tackling their first 3D film and allowed them to avoid some of the issues most 3D films experience such as eye strain, sudden unexpected shifts in depth and an inability to ensure viewers are attending to the part of the image easiest to fuse across the two eye images (Banks, Read, Allison & Watt, 2012).

Watching Audio

I now turn to another topic featured in this special issue, the influence of audio on gaze (Robinson, Stadler and Rassell, this issue). Film and TV are inherently multimodal. Both media have always existed as a combination of visual and audio information. Even early silent film was almost always presented with either live musical accompaniment or a narrator. As such, the relative lack of empirical investigation into how the combination of audio and visual input influences how we perceive movies and, specifically how we attend to them is surprising. Robinson, Stadler and Rassell (this issue) have attempted to address this omission by comparing eye movements for participants either watching the original version of the Omaha beach sequence from Steven Spielberg’s Saving Private Ryan (1998) or the same sequence with the sound removed. This film sequence is a great choice for investigating AV influences on viewer experience as the intensity of the action, the hand-held cinematography and the immersive soundscape all work together to create a disorientating embodied experience for the viewer. The authors could have approached this question by simply showing a set of participants the sequence with audio and qualitatively describing the gaze behaviour at interesting AV moments during the sequence. Such description of the data would have served as inspiration for further investigation but in itself can’t say anything about the causal contribution of audio to this behaviour as there would be nothing to compare the behaviour to. Thankfully, the authors avoided this problem by choosing to manipulate the audio.

In order to identify the causal contribution of any factor you need to design an experiment in which that factor (known as the Independent Variable) is either removed or manipulated and the significant impact of this manipulation on the behaviour of interest (known as the Dependent Variable) is tested using appropriate inferential statistics. I commend Robinson, Stadler and Rassell’s experimental design as they present such a manipulation and are therefore able to produce data that will allow them to test their hypotheses about the causal impact of audio on viewer gaze behaviour. Several other papers in this special issue (Redmond, Sita and Vincs; Batty, Perkins and Sita) discuss gaze data (typically in the form of scanpaths or heatmaps) from one viewing condition without quantifying its difference to another viewing condition. As such, they are only able to describe the gaze data, not use it to test hypotheses. There is always a temptation to attribute too much meaning to a gaze heatmap (I too am guilty of this; Smith, 2013) due to their seemingly intuitive nature (i.e., they looked here and not there) but, as with all psychological measures, they are only as good as the experimental design within which they are employed.[vi]

Qualitative interpretation of individual fixation locations, scanpaths or group heatmaps is useful for informing initial interpretation of which visual details are most likely to make it into later visual processing (e.g. perception, encoding and long term memory representations) but care has to be taken not to assume falsely that fixation equals awareness (Smith, Lamont and Henderson, 2012). Also, the visual form of gaze heatmaps varies widely depending on how many participants contribute to the heatmap, which parameters you choose to generate the heatmaps and which oculomotor measures the heatmap represents (Holmqvist, et al., 2011). For example, I have demonstrated that, unlike during reading, visual encoding during scene perception requires over 150ms during each fixation (Rayner, Smith, Malcolm and Henderson, 2009). This means that if fixations with durations less than 150ms are included in a heatmap it may suggest parts of the image have been processed which in actual fact were fixated too briefly to be processed adequately. Similarly, heatmaps representing fixation duration instead of just fixation location have been shown to be a better representation of visual processing (Henderson, 2003). Heatmaps have an immediate allure but care has to be taken about imposing too much meaning on them, especially when the gaze and the image are changing over time (see Smith and Mital, 2013; and Sawahata et al, 2008 for further discussion). As eye tracking hardware becomes more available to researchers from across a range of disciplines we need to work harder to ensure that it is not used inappropriately and that the conclusions that are drawn from eye tracking data are theoretically and statistically motivated (see Rayner, 1998; and Holmqvist et al, 2013 for clear guidance on how to conduct sound eye tracking studies).
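To make the two heatmap choices just discussed concrete, the following is a minimal sketch in Python, not the code used in any of the cited studies. It drops fixations shorter than the 150ms figure mentioned above and weights the remaining fixations by their duration. The function name, the tuple format for fixations and the coarse grid cell size are all assumptions made for illustration; a published heatmap would normally also apply spatial smoothing, which is omitted here.

```python
from collections import defaultdict

MIN_FIXATION_MS = 150  # threshold taken from the discussion above


def fixation_map(fixations, cell_px=100, min_ms=MIN_FIXATION_MS,
                 weight_by_duration=True):
    """Accumulate fixations into a coarse grid (hypothetical helper).

    fixations: iterable of (x_px, y_px, duration_ms) tuples.
    Returns a dict mapping (column, row) grid cells to accumulated weight.
    """
    grid = defaultdict(float)
    for x, y, duration_ms in fixations:
        if duration_ms < min_ms:
            continue  # fixated too briefly to count as adequately processed
        cell = (int(x // cell_px), int(y // cell_px))
        # Weight by duration, or count each fixation once (location only).
        grid[cell] += duration_ms if weight_by_duration else 1.0
    return dict(grid)


# Illustrative, made-up fixations: two of the four are shorter than 150ms.
example = [(120, 80, 90), (130, 85, 310), (640, 360, 200), (650, 370, 140)]
print(fixation_map(example))
```

Comparing the output with and without `weight_by_duration`, or with `min_ms=0`, shows how much the apparent distribution of attention can change with these parameter choices alone.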

Given that Robinson, Stadler and Rassell (this issue) manipulated the critical factor, i.e., the presence of audio, the question now is whether their study tells us anything new about the AV influences on gaze during film viewing. To examine the influence of audio they chose two traditional methods for expressing the gaze data: area of interest (AOI) analysis and dispersal. By using nine static (relative to the screen) AOIs they were able to quantify how much time the gaze spent in each AOI and utilise this measure to work out how distributed gaze was across all AOIs. Using these measures they reported a trend towards greater dispersal in the mute condition compared to the audio condition and a small number of significant differences in the amount of time spent in some regions across the audio conditions.
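A static AOI analysis of this general kind can be sketched as follows. This is an illustration only: the 3 x 3 grid layout, the function names and the use of Shannon entropy as the summary of how distributed gaze is across AOIs are all assumptions, since the exact layout and dispersal formula of the study under discussion are not given here. Gaze samples are assumed to be on-screen and recorded at equal time intervals, so that the proportion of samples in an AOI approximates the proportion of time spent there.

```python
import math
from collections import Counter


def aoi_index(x, y, screen_w, screen_h, cols=3, rows=3):
    """Map an on-screen gaze sample to one of cols*rows static AOIs."""
    col = min(int(x / screen_w * cols), cols - 1)
    row = min(int(y / screen_h * rows), rows - 1)
    return row * cols + col


def dwell_proportions(samples, screen_w, screen_h):
    """Proportion of gaze samples falling in each static AOI."""
    counts = Counter(aoi_index(x, y, screen_w, screen_h) for x, y in samples)
    total = sum(counts.values())
    return {aoi: n / total for aoi, n in counts.items()}


def entropy_bits(proportions):
    """Shannon entropy of the dwell proportions (an assumed dispersal index)."""
    return -sum(p * math.log2(p) for p in proportions.values() if p > 0)


# Illustrative, made-up samples on a 1280x720 screen.
samples = [(640, 360), (640, 360), (640, 360), (10, 10)]
props = dwell_proportions(samples, 1280, 720)
print(props, entropy_bits(props))
```

Because the AOIs are fixed to screen coordinates, these numbers describe where on the screen gaze fell, not which objects were being looked at.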

However, the conclusions we can draw from these findings are seriously hindered by the low sample size (only four participants were tested, meaning that any statistical test is unlikely to reveal significant differences) and the static AOIs that did not move with the image content. By locking the AOIs to static screen coordinates their AOI measures express the deviation of gaze relative to these coordinates, not to the image content. This approach can be informative for quantifying gaze exploration away from the screen centre (Mital, Smith, Hill and Henderson, 2011) but in order to draw conclusions about what was being fixated the gaze needs to be quantified relative to dynamic AOIs that track objects of interest on the screen (see Smith and Mital, 2013). For example, their question about whether we fixate a speaker’s mouth more in scenes where the clarity of the speech is reduced by background noise (i.e., their “Indistinct Dialogue” scene) has previously been investigated in studies that have manipulated the presence of audio (Võ, Smith, Mital and Henderson, 2012) or the level of background noise (Buchan, Paré and Munhall, 2007) and measured gaze to dynamic mouth regions. As Robinson, Stadler and Rassell correctly predicted, lip reading increases as speech becomes less distinct or the listener’s linguistic competence in the spoken language decreases (see Võ et al, 2012 for review).

Similarly, by measuring gaze dispersal using a limited number of static AOIs they are losing considerable nuance in the gaze data and have to resort to qualitative description of unintuitive bar charts (figure 4). There exist several methods for quantifying gaze dispersal (see Smith and Mital, 2013, for review) and even open-source tools for calculating this measure and comparing dispersal across groups (Le Meur and Baccino, 2013). Some methods are as easy, if not easier to calculate than the static AOIs used in the present study. For example, the Euclidean distance between the screen centre and the x/y gaze coordinates at each frame of the movie provides a rough measure of how spread out the gaze is from the screen centre (typically the default viewing location; Mital et al, 2011) and a similar calculation can be performed between the gaze position of all participants within a viewing condition to get a measure of group dispersal.
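The two calculations described above can be sketched in a few lines of Python. This is a rough illustration rather than the implementation used in the cited work: the function names are hypothetical, and taking the mean of all pairwise distances between viewers is only one assumed way of turning "a similar calculation" into a group dispersal score.

```python
import math
from itertools import combinations


def distance_from_centre(gaze, screen_w, screen_h):
    """Euclidean distance (pixels) between one gaze point and the screen centre."""
    cx, cy = screen_w / 2, screen_h / 2
    return math.hypot(gaze[0] - cx, gaze[1] - cy)


def mean_distance_from_centre(gaze_points, screen_w, screen_h):
    """Average distance from the screen centre across viewers for one frame."""
    distances = [distance_from_centre(g, screen_w, screen_h) for g in gaze_points]
    return sum(distances) / len(distances)


def group_dispersal(gaze_points):
    """Mean pairwise Euclidean distance between viewers' gaze on one frame."""
    pairs = list(combinations(gaze_points, 2))
    return sum(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in pairs) / len(pairs)


# Illustrative, made-up gaze positions of three viewers on one 1280x720 frame.
frame = [(640, 360), (643, 364), (646, 368)]
print(mean_distance_from_centre(frame, 1280, 720))  # 5.0
print(group_dispersal(frame))
```

Computing either value for every frame gives a time series per viewing condition, which can then be compared across conditions with inferential statistics rather than by eye.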

Using such measures, Coutrot and colleagues (2012) showed that gaze dispersal is greater when you remove audio from dialogue film sequences and they have also observed shorter amplitude saccades and marginally shorter fixation durations. However, I have recently shown that a non-dialogue sequence from Sergei Eisenstein’s Alexander Nevsky (1938) does not show significant differences in eye movement metrics when the accompanying music is removed (Smith, 2014). This difference in findings points towards interesting differences in the impact that diegetic (within the depicted scene, e.g. dialogue) and non-diegetic (outside of the depicted scene, e.g. the musical score) audio may have on gaze guidance. It also highlights how some cinematic features may have a greater impact on other aspects of a viewer’s experience, such as physiological markers of arousal and emotional states, than on those measurable by eye tracking. This is also the conclusion that Robinson, Stadler and Rassell come to.

Listening to the Data (aka, What is Eye Tracking Good For?)

The methodological concerns I have raised in the previous section lead nicely to the article by William Brown, entitled There’s no I in Eye Tracking: How useful is Eye Tracking to Film Studies (this issue). I have known William Brown for several years through our attendance of the Society for Cognitive Studies of the Moving Image (SCSMI) annual conference and I have a deep respect for his philosophical approach to film and his ability to incorporate empirical findings from the cognitive neurosciences, including some references to my own work into his theories. Therefore, it comes somewhat as a surprise that his article openly attacks the application of eye tracking to film studies. However, I welcome Brown’s criticisms as it provides me with an opportunity to address some general assumptions about the scientific investigation of film and hopefully suggest future directions in which eye tracking research can avoid falling into some of the pitfalls Brown identifies.

Brown’s main criticisms of current eye tracking research are: 1) eye tracking studies neglect “marginal” viewers or marginal ways of watching movies; 2) studies so far have neglected “marginal” films; 3) they only provide “truisms”, i.e., already known facts; and 4) they have an implicit political agenda to argue that the only “true” way to study film is a scientific approach and the “best” way to make a film is to ensure homogeneity of viewer experience. I will address these criticisms in turn but before I do so I would like to state that a lot of Brown’s arguments could generally be recast as an argument against science in general and are built upon a misunderstanding of how scientific studies should be conducted and what they mean.

To respond to Brown’s first criticism that eye tracking “has up until now been limited somewhat by its emphasis on statistical significance – or, put simply, by its emphasis on telling us what most viewers look at when they watch films” (Brown, this issue; 1), I first have to subdivide the criticism into ‘the search for significance’ and ‘attentional synchrony’, i.e., how similar gaze is across viewers (Smith and Mital, 2013). Brown tells an anecdote about a Dutch film scholar whose data had to be excluded from an eye tracking study because they did not look where the experimenter wanted them to look. I wholeheartedly agree with Brown that this sounds like a bad study as data should never be excluded for subjective reasons such as not supporting the hypothesis, i.e., not looking as predicted. However, exclusion due to statistical reasons is valid if the research question being tested relates to how representative the behaviour of a small set of participants (known as the sample) is of the overall population. To explain when such a decision is valid and to respond to Brown’s criticism about only ‘searching for significance’ I will first need to provide a brief overview of how empirical eye tracking studies are designed and why significance testing is important.

For example, if we were interested in the impact sound had on the probability of fixating an actor’s mouth (e.g., Robinson, Stadler and Rassell, this issue) we would need to compare the gaze behaviour of a sample of participants who watch a sequence with the sound turned on to a sample who watched it with the sound turned off. By comparing the behaviour between these two groups using inferential statistics we are testing the likelihood that these two viewing conditions would differ in a population of all viewers given the variation within and between these two groups. In actual fact we do this by performing the opposite test: testing the probability that the two groups belong to a single statistically indistinguishable group. This is known as the null hypothesis. If a difference at least as large as the one observed would occur less than 5% of the time were the null hypothesis true, we reject the null hypothesis and conclude that the difference is statistically significant, i.e., that another sample of participants presented with the same two viewing conditions would be likely to show similar differences in viewing behaviour.
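The logic of testing against the null hypothesis can be illustrated with a permutation test, sketched below in Python. A permutation test is used here only because it makes the reasoning visible in a few lines; it is a stand-in chosen for illustration and not necessarily the test any of the studies discussed would use. The function name and the per-participant scores (proportion of time fixating the mouth) are invented for the example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference between group means.

    Returns the estimated probability of a difference at least as large as
    the observed one if group labels were arbitrary (the null hypothesis).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign participants to conditions at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up proportions of time spent fixating the mouth, one per participant.
sound_on = [0.10, 0.12, 0.11, 0.13, 0.09, 0.10, 0.12, 0.11]
sound_off = [0.40, 0.42, 0.41, 0.39, 0.43, 0.40, 0.38, 0.41]
p = permutation_test(sound_on, sound_off)
print(p, "significant at .05" if p < 0.05 else "not significant at .05")
```

With only a handful of participants per group the number of distinct relabellings is small, which is one way of seeing why very small samples rarely yield significant results.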

In order to test whether our two viewing conditions belong to one or two distributions we need to be able to express this distribution. This is typically done by identifying the mean score for each participant on the dependent variable of interest, in this case the probability of fixating a dynamic mouth AOI, then calculating the mean for this measure across all participants within a group and their variation in scores (known as the standard deviation). Most natural measures produce a distribution of scores looking somewhat like a bell curve (known as the normal distribution) with most observations near the centre of the distribution and an ever decreasing number of observations as you move away from this central score. Each observation (in our case, each participant) can be expressed relative to this distribution by subtracting the mean of the distribution from its score and dividing by the standard deviation. This converts a raw score into a normalized or z-score. Roughly ninety-five percent of all observations will fall within two standard deviations of the mean for normally distributed data. This means that observations with an absolute z-score greater than two are highly unrepresentative of that distribution and may be considered outliers.
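The z-score calculation described above is short enough to sketch directly. The Python below is illustrative only: the function names and the participant scores are invented, and the sample standard deviation is used as an assumption about how the variation would be estimated from a small group.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Express each score relative to the group mean in standard deviations."""
    m = mean(scores)
    sd = stdev(scores)  # sample standard deviation (n - 1 denominator)
    return [(s - m) / sd for s in scores]


def flag_outliers(scores, threshold=2.0):
    """Indices of participants whose absolute z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(scores)) if abs(z) > threshold]


# Made-up per-participant probabilities of fixating the mouth AOI.
scores = [0.30, 0.32, 0.28, 0.31, 0.29, 0.30, 0.33, 0.27, 0.31, 0.75]
print(flag_outliers(scores))  # [9]
```

Flagging a participant in this way only identifies a score worth inspecting; it does not by itself justify removing that participant from the analysis.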

However, being unrepresentative of the group mean is insufficient motivation to exclude a participant. The outlier still belongs to the group distribution and should be included unless there is a supporting reason for exclusion such as measurement error, e.g. poor calibration of the eye tracker. If an extreme outlier is not excluded it can often have a disproportionate impact on the group mean and make statistical comparison of groups difficult. However, if this is the case it suggests that the sample size is too small and not representative of the overall population. Correct choice of sample size given an estimate of the predicted effect size, combined with minimising measurement error, should mean that subjective decisions do not have to be made about whose data is “right” and who should be included or excluded.

Brown also believes that eye tracking research has so far marginalised viewers who have atypical ways of watching film, such as film scholars either by not studying them or treating them as statistical outliers and excluding them from analyses. However, I would argue that the only way to know if their way of watching a film is atypical is to first map out the distribution of how viewers typically watch films. If a viewer attended more to the screen edge than the majority of other viewers in a random sample of the population (as was the case with Brown’s film scholar colleague) this should show up as a large z-score when their gaze data is expressed relative to the group on a suitable measure such as Euclidean distance from the screen centre. Similarly, a non-native speaker of English may have appeared as an outlier in terms of how much time they spent looking at the speaker’s mouth in Robinson, Stadler and Rassell’s (this issue) study. Such idiosyncrasies may be of interest to researchers and there are statistical methods for expressing emergent groupings within the data (e.g. cluster analysis) or seeing whether group membership predicts behaviour (e.g. regression). These approaches may have not previously been applied to questions of film viewing but this is simply due to the immaturity of the field and the limited availability of the equipment or expertise to conduct such studies.

In my own recent work I have shown how viewing task influences how we watch unedited video clips (Smith and Mital, 2013), how infants watch TV (Wass and Smith, in press), how infant gaze differs to adult gaze (Smith, Dekker, Mital, Saez De Urabain and Karmiloff-Smith, in prep) and even how film scholars attend to and remember a short film compared to non-expert film viewers (Smith and Smith, in prep). Such group viewing differences are of great interest to me and I hope these studies illustrate how eye tracking has a lot to offer to such research questions if the right statistics and experimental designs are employed.

Brown’s second main criticism is that the field of eye tracking neglects “marginal” films. I agree that the majority of films that have so far been used in eye tracking studies could be considered mainstream. For example, the film/TV clips used in this special issue include Sherlock (2010), Up (2009) and Saving Private Ryan (1998). However, this limit is simply a sign of how few eye tracking studies of moving images there have been. All research areas take time to fully explore the range of possible research questions within that area.

I have always employed a range of films from diverse film traditions, cultures, and languages. My first published eye tracking study (Smith and Henderson, 2008) used film clips from Citizen Kane (1941), Dogville (2003), October (1928), Requiem for a Dream (2000), Dancer in the Dark (2000), Koyaanisqatsi (1982) and Blade Runner (1982). Several of these films may be considered “marginal” relative to the mainstream. If I have chosen to focus most of my analyses on mainstream Hollywood cinema this is only because they were the most suitable exemplars of the phenomena I was investigating such as continuity editing and its creation of a universal pattern of viewing (Smith, 2006; 2012). This interest is not because, as Brown argues, I have a hidden political agenda or an implicit belief that this style of filmmaking is the “right” way to make films. I am interested in this style because it is the dominant style and, as a cognitive scientist I wish to use film as a way of understanding how most people process audiovisual dynamic scenes.

Hollywood film stands as a wonderfully rich example of what filmmakers think “fits” human cognition. By testing filmmaker intuitions and seeing what impact particular compositional decisions have on viewer eye movements and behavioural responses I hope to gain greater insight into how audiovisual perception operates in non-mediated situations (Smith, Levin and Cutting, 2012). But, just as a neuropsychologist can learn about typical brain function by studying patients with pathologies such as lesions and strokes, I can also learn about how we perceive a “typical” film by studying how we watch experimental or innovative films. My previous work is testament to this interest (Smith, 2006; 2012a; 2012b; 2014; Smith & Henderson, 2008) and I hope to continue finding intriguing films to study and further my understanding of film cognition.

One practical reason why eye tracking studies rarely use foreign language films is the presence of subtitles. As has been comprehensively demonstrated by other authors in this special issue (Kruger, Szarkowska and Krejtz, this issue) and earlier in this article, the sudden appearance of text on the screen, even if it is incomprehensible leads to differences in eye movement behaviour. This invalidates the use of eye tracking as a way to measure how the filmmaker intended to shape viewer attention and perception. The alternatives would be to either use silent film (an approach I employed with October; Smith and Henderson, 2008), remove the audio (which changes gaze behaviour and awareness of editing; Smith & Martin-Portugues Santacreau, under review) or use dubbing (which can bias the gaze down to the poorly synched lips; Smith, Batten, and Bedford, 2014). None of these options are ideal for investigating foreign language sound film and until there is a suitable methodological solution this will restrict eye tracking studies to experimental films in a participant’s native language.

Finally, I would like to counter Brown’s assertion that eye tracking investigations of film have so far only generated “truisms”. I admit that there is often a temptation to reduce empirical findings to simplified take-home messages that only seem to confirm previous intuitions such as a bias of gaze towards the screen centre, towards speaking faces, moving objects or subtitles. However, I would argue that such messages fail to appreciate the nuance in the data. Empirical data correctly measured and analysed can provide subtle insights into a phenomenon that subjective introspection could never supply.

For example, film editors believe that an impression of continuous action can be created across a cut by overlapping somewhere between two (Anderson, 1996) and four frames (Dmytryk, 1986) of the action. However, psychological investigations of time perception revealed that our judgements of duration depend on how attention is allocated during the estimated period (Zakay and Block, 1996) and will vary depending on whether our eyes remain still or saccade during the period (Yarrow et al, 2001). In my thesis (Smith, 2006) I used simplified film stimuli to investigate the role that visual attention played in estimation of temporal continuity across a cut and found that participants experienced an overlap of 58.44ms as continuous when an unexpected cut occurred during fixation and an omission of 43.63ms as continuous when they performed a saccade in response to the cut. As different cuts may result in different degrees of overt (i.e., eye movements) and covert attentional shifts these empirical findings both support editor intuitions that temporal continuity varies between cuts (Dmytryk, 1986) whilst also explaining the factors that are important in influencing time perception at a level of precision not possible through introspection.

Reflecting on our own experience of a film suffers from the fact that it relies on our own senses and cognitive abilities to identify, interpret and express what we experience. I may feel that my experience of a dialogue sequence from Antichrist (2010) differs radically from a similar sequence from Secrets & Lies (1996) but I would be unable to attribute these differences to different aspects of the two scenes without quantifying both the cinematic features and my responses to them. Without isolating individual features I cannot know their causal contribution to my experience. Was it the rapid camera movements in Antichrist, the temporally incongruous editing, the emotionally extreme dialogue or the combination of these features that made me feel so unsettled whilst watching the scene? If one is not interested in understanding the causal contributions of each cinematic decision to an audience member’s response then one may be content with informed introspection and not find empirical hypothesis testing the right method. I make no judgement about the validity of either approach as long as each researcher understands the limits of their approach.

Introspection utilises the imprecise measurement tool that is the human brain and is therefore subject to distortion, human bias and an inability to extrapolate the subjective experience of one person to another. Empirical hypothesis testing also has its limitations: research questions have to be clearly formulated so that hypotheses can be stated in a way that allows them to be statistically tested using appropriate observable and reliable measurements. A failure at any of these stages can invalidate the conclusions that can be drawn from the data. For example, an eye tracker may be poorly calibrated, resulting in an inaccurate record of where somebody was looking, or it could be used to test an ill-formed hypothesis such as how a particular film sequence caused attentional synchrony without having another film sequence to compare the gaze data to. Each approach has its strengths and weaknesses and no single approach should be considered “better” than any other, just as no film should be considered “better” than any other film.


The articles collected here constitute the first attempt to bring together interdisciplinary perspectives on the application of eye tracking to film studies. I fully commend the intention of this special issue and hope that it encourages future researchers to conduct further studies using these methods to investigate research questions and film experiences we have not even conceived of. However, given that the recent release of low-cost eye tracking peripherals such as the EyeTribe[vii] tracker and the Tobii EyeX[viii] has moved eye tracking from a niche and highly expensive research tool to an accessible option for researchers in a range of disciplines, I need to take this opportunity to issue a word of warning. As I have outlined in this article, eye tracking is like any other research tool in that it is only useful if used correctly, its limitations are respected, its data is interpreted through the appropriate application of statistics and conclusions are only drawn that are based on the data in combination with a sound theoretical base. Eye tracking is not the “saviour” of film studies, nor is science the only “valid” way to investigate somebody’s experience of a film. Hopefully, the articles in this special issue and the ideas I have put forward here suggest how eye tracking can function within an interdisciplinary approach to film analysis that furthers our appreciation of film in previously unfathomed ways.



Thanks to Rachael Bedford, Sean Redmond and Craig Batty for comments on earlier drafts of this article. Thank you to John Henderson, Parag Mital and Robin Hill for help in gathering and visualising the eye movement data used in the Figures presented here. Their work was part of the DIEM Leverhulme Trust funded project. The author, Tim Smith, is funded by EPSRC (EP/K012428/1), Leverhulme Trust (PLP-2013-028) and BIAL Foundation grant (224/12).



[ii] An alternative take on eye tracking data is to divorce the data itself from psychological interpretation. Instead of viewing a gaze point as an index of where a viewer’s overt attention is focussed and a record of the visual input most likely to be encoded into the viewer’s long-term experience of the media, researchers can instead take a qualitative, or even aesthetic approach to the data. The gaze point becomes a trace of some aspect of the viewer’s engagement with the film. The patterns of gaze, its movements across the screen and the coordination/disagreement between viewers can inform qualitative interpretation without recourse to visual cognition. Such an approach is evident in several of the articles in this special issue (including Redmond, Sita, and Vincs, this issue; Batty, Perkins, and Sita, this issue). This approach can be interesting and important for stimulating hypotheses about how such patterns of viewing have come about and may be a satisfying endpoint for some disciplinary approaches to film. However, if researchers are interested in testing these hypotheses, further empirical manipulation of the factors that are believed to be important, together with statistical testing, would be required. During such investigation current theories about what eye movements are and how they relate to cognition must also be respected.

[iii] Although, one promising area of research is the use of pupil diameter changes as an index of arousal (Bradley, Miccoli, Escrig and Lang, 2008).

[iv] This technique has been used for decades by producers of TV advertisements and by some “pop” serials such as Hollyoaks in the UK (thanks to Craig Batty for this observation).

[v] This trend in increasing pace and visual complexity of film is confirmed by statistical analyses of film corpora over time (Cutting, DeLong and Nothelfer, 2010) and has resulted in a backlash and increasing interest in “slow cinema”.

[vi] Other authors in this special issue may argue that taking a critical approach to gaze heatmaps without recourse to psychology allows them to embed eye tracking within their existing theoretical framework (such as hermeneutics). However, I would warn that eye tracking data is simply a record of how a relatively arbitrary piece of machinery (the eye tracking hardware) and associated software decided to represent the centre of a viewer’s gaze. There are numerous parameters that can be tweaked to massively alter how such gaze traces and heatmaps appear. Without understanding the psychology and the physiology of the human eye, a researcher cannot know how to set these parameters, or how much to trust the equipment they are using or the data it is recording, and as a consequence may over-attribute interpretation to a representation that is not reliable.

[vii] (accessed 13/12/14). The EyeTribe tracker is $99 and is as spatially and temporally accurate (up to 60Hz sampling rate) as some science-grade trackers.

[viii] (accessed 13/12/14). The Tobii EyeX tracker is $139, samples at 30Hz and is as spatially accurate as the EyeTribe although the EyeX does not give you as much access to the raw gaze data (e.g., pupil size and binocular gaze coordinates) as the EyeTribe.



Dr Tim J. Smith is a senior lecturer in the Department of Psychological Sciences at Birkbeck, University of London. He applies empirical Cognitive Psychology methods including eye tracking to questions of Film Cognition and has published extensively on the subject both in Psychology and Film journals.


Politicizing Eye tracking Studies of Film – William Brown


This essay puts eye tracking studies of cinema into contact with film theory, or what I term film-philosophy, so as to distinguish film theory from specifically cognitive film theory. Looking at the concept of attention, the essay explains how winning and keeping viewers’ attention in a synchronous fashion is understood by eye tracking studies of cinema as key to success in filmmaking, while film-philosophy considers the winning and keeping of attention by cinema to be a political issue driven by economics and underscored by issues of control. As such, film-philosophy understands cinema as political, even if eye tracking studies of film tend to avoid engagement in political debate. Nonetheless, the essay identifies political dimensions in eye tracking film studies: the legitimization of the approach, its emphasis on mainstream cinema as an object of study and its emphasis on statistical significance all potentially have political connotations/ramifications. Invoking the concept of cinephilia, the essay then suggests that idiosyncratic viewer responses, as well as films that do not synchronously capture attention, might yield important results/play an important role in life in an attention-driven society.

In this essay, I wish to put eye tracking studies of film into dialogue with a more political approach to film, drawn from film theory, or what, for the benefit of distinguishing film theory from cognitive film theory, I shall term film-philosophy. In doing so, I shall draw out what for film-philosophy are some of the limitations of eye tracking, including its emphasis on statistical significance, or what most viewers look at when they watch films. I shall argue that we might learn as much, if not more, about cinema by paying attention not only to statistically significant and shared responses to films (what most viewers look at), but also to those viewers whose responses to a film do not form part of the statistically significant group, and/or to films that may not induce in viewers statistically significant and shared responses. In effect, we may find that there are insights to be derived from those who look at the margins of the cinematic image, rather than at the centre, even if those viewers are themselves ‘marginal’ in the sense that they are pushed to the margins of most/all eye tracking studies of film viewers. There is perhaps also value to be found in looking at ‘marginal’ films. In this way, we might find that idiosyncratic responses to a film or films are as important as the shared response. I shall also argue that there is a politics to the idiosyncratic response, especially when it is put into dialogue with film theoretical/film-philosophical work on cinephilia, and that as a result there is also a politics to eye tracking and its emphasis on statistical significance. I shall start, however, by looking at the state of eye tracking film research today.

On 29 and 30 July 2014, the Academy of Motion Picture Arts and Sciences (AMPAS) – the same American academy that distributes the so-called Oscars – held two events under the combined title of ‘Movies in Your Brain: The Science of Cinematic Perception’. The events included contributions from neuroscientists Uri Hasson, Talma Hendler and Jeffrey M. Zacks, psychologist James E. Cutting, directors Darren Aronofsky and Jon Favreau, editor Walter Murch and writer-producer Ari Handel. The host of the first evening was psychologist Tim J. Smith, whose eye tracking studies of cinema have arguably become the best known and most influential over recent years (see, inter alia, Smith 2012a; Smith 2013; Smith 2014). Through these events, as well as through coverage of these events in fashionable journals like Wired (Miller 2014a; Miller 2014b), we can see how eye tracking – together with the study of film using brain scanning technologies such as functional Magnetic Resonance Imaging (fMRI) – is clearly becoming important for our understanding of how films work. This in turn means that such studies are surely important to film studies.

For a detailed history and overview of eye tracking, explaining how it works and what it tells us about film, I cannot do better than to guide readers to the afore-mentioned work by Smith. Smith has soundly demonstrated, and with great clarity, how the human eye moves via small movements called saccades, and that in between saccades the human eye fixates. It is during fixations that humans take in visual information, with fixations being linked therefore to attention and to working memory; we tend to remember objects from our visual field upon which we have fixated, or to which we have paid attention. Clearly this is important to the study of film, since viewers typically attend only to parts of the movie screen at any given time, and not necessarily to others or to the whole of the screen (and the surrounding auditorium). Can/do filmmakers exert influence over where we look, for how long, and thus what we remember about a film – with those memories themselves lasting for greater or lesser periods of time? And if filmmakers do influence such things, how much influence do they exert and through which techniques? These are the questions that eye tracking technology can help to answer – and scholars like Smith do so with great skill and eloquence.

My aim, however, is not simply to reproduce findings by Smith and others who have used eye tracking devices to study film. In order to construct a theoretical argument concerning the importance of the idiosyncratic, or ‘cinephilic’, response to a film or films in general, as well as the importance of a filmmaker not necessarily ‘controlling’ where a viewer looks, but instead allowing/encouraging viewers precisely to look idiosyncratically, cinephilically, or where they wish, I need instead to bring the scientific and ‘apolitical’ use of eye tracking devices into a political discourse concerning the nature of cinema, power, hegemony and the issue of cinematic homogeneity and/or heterogeneity. This is a controversial maneuver – in that it will bring together two areas of film studies that often seem to stand in ‘opposition’ to each other, namely cognitive film theory and a film theory that still plies its trade using Continental philosophy, or what for the sake of simplicity I shall term film-philosophy. My desire is not simply to be controversial, however. Rather it is to engage with what eye tracking means to film studies, both currently and potentially in the future.

To begin to bring eye tracking studies of film into the ‘political discourse’ mentioned above, I shall relate an anecdote. A semi-regular response from colleagues in film studies, when I tell them about eye tracking studies of film viewing, is that eye tracking doesn’t tell us anything about films that we didn’t already know. Is it a surprise that we tend to look more often at the center of the screen? Is it a surprise that we typically attend more to brightly illuminated parts of the screen than to dimly lit ones? Is it a surprise that we tend to direct our attention towards human faces when watching a film that features human characters? Anyone who has consciously thought about what they do while watching a film will be able to tell from memory alone that these things are all true. As a result, eye tracking studies of film can sometimes be filled with what, at least to the film student/scholar, are truisms. By way of an example, Paul Marchant and colleagues say that ‘these strategies and techniques… [capture] the audience’s visual attention: focus, camera movement, eye line match, color and contrast, motion of elements within the shot, graphic matching’ (Marchant et al. 2009, 158). On my print-out of Marchant et al.’s essay, my own apostil next to this assertion reads as follows: ‘Do we not know this already (otherwise cinema would not have developed these techniques)?’ Many, if not all, film viewers will know simply from experience that these techniques help to guide their attention, even if they are blissfully unaware of the relationship between eye fixations, attention and memory. Of course, it is pleasing to have our introspective responses to/our intuitive knowledge about cinema ‘scientifically’ confirmed (to a large extent, but not entirely – about which, more later); but essentially, so my colleagues’ argument goes, eye tracking studies tell us what we already know.

Now, even if I myself find some eye tracking studies of film to be ‘truistic’, I nonetheless believe that eye tracking studies of film are of great importance. However, their importance is perhaps in playing a role that is different from the one that eye tracking studies of film seem to give to themselves, which is as a key component of cognitive film theory. Instead, I think that eye tracking studies of film are important for film theory, or what today is termed film-philosophy. I shall explain the distinction between cognitive film theory and film-philosophy presently.

Little in this world is uniform, and so by definition I generalize when I say that the basic tenet of cognitive film theory, with David Bordwell and Noël Carroll’s Post-Theory: Reconstructing Film Studies (1996) serving as its figurehead, is that film studies should move towards a theory of cinema based on the analysis of films themselves, and away from a film theory that uses cinema as a means of confirming or denying a Lacanian understanding of the human and/or an Althusserian/Marxist conception of contemporary capital. In spite of cognitive film theory’s lack of uniformity, eye tracking studies of film are nonetheless part of cognitive film theory’s project to help us to look at cinema ‘as it is’, and not to use cinema as a political football. Conversely, film-philosophy is in general informed by the kinds of Continental philosophers, often though not exclusively Gilles Deleuze, that cognitive film theorists reject, and it engages not just with films ‘as they are’, but with the politics of films.

Now, to claim that we can isolate films and film viewing from a human world that is perhaps always political, and to claim that we can then analyse films ‘as they are’, is perhaps absurd: films ‘as they are’ are part of a political world, and cognitive film theorists are not unaware of this, just as film-philosophers are not incapable of scientific analysis. However, how much politics is allowed into the analysis of films perhaps informs the broad distinction between cognitive film theory and film-philosophy, as I hope to clarify by looking briefly at the role of attention in the work of two scholars, Tim J. Smith and Jonathan Beller. In his ‘Attentional Theory of Cinematic Continuity’ (AToCC), Smith (2012a) uses eye tracking studies to demonstrate how filmmakers capture and maintain viewers’ attention, with certain techniques, mainly those associated with continuity editing, being more successful than others. Meanwhile, in his Cinematic Mode of Production: Attention Economy and the Society of the Spectacle, Beller (2006) suggests that capturing attention is not necessarily an aesthetic, but rather a political project: the more attention a film garners, the more success one will have in monetizing that film, with the making of money becoming the bottom line of cinema. Beller does not appeal to some early cinema that did not attempt to elicit viewers’ attention and thus make money; such an early cinema did not necessarily exist. Rather, Beller argues that cinema has always been part of an economy that is based on attention; indeed, cinema plays a key role in naturalizing this attention economy, meaning that cinema has not always been necessarily capitalist, but that the capitalist world endeavors as much as possible to become cinematic, to capture our attention as much as possible in order to ‘win’ the economic race, since capturing eyeballs means making money. Smith explains how attention is captured; Beller offers an explanation as to why. 
Even though filmmakers rely on natural processes in order to capture attention (Smith), the process of consistently trying to capture our attention (‘cinema’) is not natural, but political and economic (Beller).

James E. Cutting, in commenting on an earlier draft of this paper, says that the results of eye tracking studies of film, which reveal how filmmakers capture attention, are

big news… because almost nothing else does this – not static pictures (photographs, artworks), not class room behavior by teachers, not leaders of business meetings, and often not even spectacles of various kinds (sporting events, rock concerts, etc); even TV is typically not as good as the average narrative, popular movie. (Cutting, signed peer review 2014)

If cinema is indeed better at capturing our attention than these other media, and if in some senses it is better at capturing our attention than those parts of the world that do not feature such media – i.e. if cinema is better at capturing our attention than reality – then cinema, and the making-cinematic of reality in a bid to capture attention, to make money and/or to influence people (Cutting compares cinema in particular to teachers and to business leaders) is profoundly political. It is profoundly political because learning about how to capture attention – learning about how cinema works – is tied to the shaping of our material reality (putting screens everywhere) and to controlling attention (encouraging us to look at those screens, and not at the rest of reality). Cognitive film theory is apolitical; film-philosophy, meanwhile, engages in the very political dimensions of cinema. Eye tracking studies of film tend to position themselves as part of the former; my aim here is to bring them into dialogue with the latter.

If eye tracking studies of film tend to position themselves as part of a would-be apolitical approach to cinema, then in their investigation into cinema, they are nonetheless conducting an investigation into politics, as per Beller’s equation of cinema with politics highlighted above. However, while eye tracking studies of film position themselves as apolitical, politics do creep into eye tracking studies, especially through what I shall call their absences. What is more, these politics do relate to film-philosophy’s ‘political’ approach to film. In order to demonstrate this, I shall begin by analyzing how eye tracking studies of film have sought historically to legitimate themselves.

Early in an essay that gives an overview of eye tracking studies of film, Smith asserts, without naming any, that the hypotheses of film theory ‘generally remain untested’ (Smith 2013, 165). In this almost throwaway comment, we perhaps find important information. For in asserting that eye tracking is what can help us to ‘test’ out some theories of film, as Smith goes on to do in relation to Sergei M. Eisenstein’s writing about his own film, Alexander Nevsky (USSR, 1938), he perhaps overlooks how film theorists often (but perhaps not always) try (though not always with success) to construct their theories based on the films that they have seen, studied and perhaps made, and not the other way around. That is, Smith seems not to consider that watching films is itself a means of testing our theories about films – without the need for eye tracking devices. On a related note, while he does consider filmmakers like Eisenstein, D.W. Griffith, Edward Dmytryk and others as ‘experimentalists’ of sorts (who have tested their own theories), Smith also does not fully acknowledge that the history of cinema can itself be seen as a prolonged ‘test’ in what ‘works’ or ‘does not work’ with audiences – with that which ‘works’ being regularly adopted as either a short- or a long-term strategy by the film industry, be that in terms of re-using storylines, adopting a specific cinematic style, employing bankable film stars, using topical settings, engaging with zeitgeist themes and so on. Instead, it is Smith’s intervention that will validate or otherwise that history of theory and practice, and which will confirm what filmmakers, and perhaps also many audience members, have probably known for a long time, even if putting their knowledge into practice sometimes proves harder than we might imagine (because otherwise films would presumably not have ‘mistakes’ in them).

Now, it’s natural that a (relatively) new approach to studying film would need to legitimize itself in order to gain credibility and a following – and Smith clearly charts the c. 30-year trajectory of eye tracking in film studies from the 1980s onwards (Smith 2014: 90). Nonetheless, if the history of cinema is not ‘test’ enough for Smith, then implicitly a claim is being made here about what constitutes a ‘real’ test, and, by extension, what sort of person can carry out a ‘real’ test. In other words, eye tracking, and the cognitive framework more generally, here legitimizes itself as being a tool for verifying (scientifically) what previously were ‘mere’ and speculative theories (these are my terms) – with the people qualified to carry out these tests being neither filmmakers nor audience members, but psychologists. By justifying eye tracking in this way, Smith is not just making a statement of fact (eye tracking demonstrates that viewers look at the same things at the same time during films made using the continuity editing style), but he is also – I assume unintentionally – making an implicit value judgment that carries political assumptions regarding what constitutes a/the most legitimate framework for learning and knowing about film. If, as per my anecdote above, I can and do know the same things via introspection that eye tracking tells me, then why is introspection not equally legitimate as a framework, even if it involves less visible labor, and certainly less sexy imagery, and thus does not seem to involve any real ‘testing’?

Eye tracking thus seeks ‘politically’ to legitimate itself as a tool for film analysis. To be clear: eye tracking is legitimate, but it is also always already making claims about what constitutes knowledge: introspection is not knowledge, while science is – even if both can lead to the same understanding. Importantly, in producing visible evidence (the afore-mentioned ‘sexy imagery’ of colored clusters of eye-gaze on scenes from films), eye tracking studies are also always already cinematic, by which I mean to say that they affirm a system whereby the visual/the cinematic (here are pictures of attention being captured) is validated above invisible (here, introspective) approaches to the same knowledge. This in turn always already affirms the process of cinema and attention-grabbing as being the (political) system that is most powerful.

If eye tracking affirms a politically cinematic world, in that cinematic forms of knowledge are more valid than invisible, i.e. uncinematic, ones, then within that cinematic world eye tracking might also, and in some respects implicitly does, legitimate some forms of cinema over others. This is suggested by the way in which eye tracking studies look predominantly at Hollywood/mainstream cinema in their analyses of film. For example, in his AToCC, Smith (2012a) cites a diverse range of movies, including L’escamotage d’une dame au théâtre Robert Houdin/The Vanishing Lady (Georges Méliès, France, 1896) and L’année dernière à Marienbad/Last Year at Marienbad (Alain Resnais, France/Italy, 1961), but eye tracking data are given mainly for contemporary Hollywood films, including Blade Runner (Ridley Scott, USA/Hong Kong/UK, 1982), Requiem for a Dream (Darren Aronofsky, USA, 2000) and There Will Be Blood (Paul Thomas Anderson, USA, 2007), with Smith suggesting that continuity editing is the form of cinema best suited to capturing attention.1

The absence of eye tracking data on those other, non-Hollywood films is perhaps telling, as suggested by two respondents to Smith’s essay, who query how his theories would apply to different cinemas, including the avant garde (Freeland 2012, 40-41) and, at least by implication, Japanese cinema (Rogers 2012, 47-48). Eye tracking would of course yield important insights into avant-garde and other forms of cinema, but that information is not offered here.

Furthermore, Smith’s suggestion that continuity editing is the form best suited to capturing attention also prompts Paul Messaris and Greg M. Smith to argue that continuity editing violations, in particular jump cuts, are quite regular and not particularly detrimental to the continuity of the film viewing experience (Messaris 2012, 28-29; Greg M. Smith 2012, 57). Malcolm Turvey, meanwhile, argues that the film viewing experience is always continuous, meaning that the ‘continuity’ of continuity editing ‘is not continuity of viewer attention per se… but rather the manner in which films engage and manage that attention’ (Turvey 2012, 52-53; for Smith’s riposte to these responses and more, see Smith 2012b).

These responses highlight how filmmaking ‘perfection’ (an absence of continuity errors) need not be fetishized too much; audiences are quite happy to watch films with continuity errors (many of which they will not notice). Furthermore, many audiences love what Jeffrey Sconce (1995) might term ‘paracinema’ – i.e. ‘trash’ cinema and ‘bad’ movies – be they intentionally ‘bad’ or otherwise. In other words, it would seem that as long as audiences are primed regarding how they should receive a film (or, in Turvey’s language, as long as their attention is managed and then engaged in the right way), then they need not care about, and can even love, the stylised acting, the ropey mise-en-scène, the unmotivated camera movements, the strange edits and the story loopholes of, say, The Room (Tommy Wiseau, USA, 2003), supposedly the worst film in history. Under the right circumstances (with the right management/preparation), it would seem that audiences can like pretty much anything, including a 485-minute film of the Empire State Building (Empire, Andy Warhol, USA, 1964). And yet, while in his AToCC Smith mentions Méliès and Resnais, and while he engages with Eisenstein and other filmmakers elsewhere, the AToCC puts an emphasis on mainstream Hollywood cinema and its predominant system of continuity editing, since this cinema elicits a synchronicity of response, or control over attention, in that viewers attend to the same parts of the screen at the same time – while also often failing to detect edits done in the continuity editing style (see Smith and Henderson 2008). There is a seeming bias here towards mainstream, narrative filmmaking, the engrossing nature of which is lauded at the expense of other cinemas.

Let us move away from Smith in order to demonstrate how this bias is not his alone. Jennifer Treuting suggests that ‘[t]he use of eye tracking… can help filmmakers and other visual artists refine their craft’ (Treuting 2006: 31). In some respects, this is an innocent comment; I have no doubt that eye tracking can help filmmakers and other visual artists to refine their craft. But suggested in this ‘refinement’ is also the move towards validating the mainstream/continuity style at the expense of its alternatives. A combined eye tracking and fMRI study carried out by Uri Hasson and colleagues also makes this clear: much fuss is made over how work by Alfred Hitchcock elicits greater synchrony (‘inter-subject correlation’) in viewers than does an ‘unstructured’ shot of a concert in Washington Square Park, a film that is simply a ‘point of reference’ and which ‘fails to direct viewers’ gaze’ (Hasson et al. 2008, 13-14; emphasis added). My reference above to Warhol’s Empire here becomes apposite: what Hasson and colleagues dismiss as a ‘point of reference’ and as a ‘failure’ in various respects defines one of the great experimental films. Perhaps ‘marginal’ films like Empire should also be considered successful – but at achieving something different to the work of Hitchcock, and perhaps Hasson’s film is not a ‘point of reference’, but an experimental work that equally inhabits the totality of films in the world that we shall call cinema.

If Hitchcock ‘succeeds’ in controlling viewers’ attention, while Warhol by implication ‘fails’, then eye tracking becomes implicitly/inevitably embroiled in not just what film is, but in what film could or should be – as Treuting’s suggestion that eye tracking might feed back into filmmaking also makes clear. This suggests that there is a politics to eye tracking film studies, particularly in the UK where universities are increasingly relying on ‘impact’, particularly on the economy, in order to survive: such studies do not just observe films, but feed back into how films are, or should be, made, by exploring what is ‘successful’ in terms of eliciting attention, getting bums on seats and thus making money. In some respects, eye tracking in particular and cognitive film theory in general are now dragged back towards the Marxist approach to cinema that cognitive film theory initially sought to reject: they, too, shape or seek to shape cinema, just as Marxist film theory in effect lobbied for alternatives to the mainstream. However, where Marxist film theory lobbied for a rejection of mainstream cinematic techniques, eye tracking studies seem to validate them – and to suggest that filmmakers might ‘refine their craft’ by adopting/intensifying them. Saving the thorny issue of ‘control’ and ‘influence’ for later, there is still a political dimension to this potential validation of mainstream cinema techniques, because it reaffirms the economic hegemony of one style over others and it also validates to some degree a homogeneity of product (and of audience?) – all within a ‘cinematic’ economic system that is itself predicated upon gaining attention. Cinema is both business and art, but if art is one thing it is unique/different, and so a move towards homogeneity is a move towards the reduction of art in favor of business.
If it requires an artist rather than an academic to make this clear, then Darren Aronofsky’s apprehensive response to Hasson’s work at the AMPAS events hopefully serves this purpose: ‘“It’s a scary tool for the studios to have,” Aronofsky said. “Soon they’ll do test screenings with people in MRIs.” The audience laughed, but it didn’t seem like he was joking, at least not entirely’ (Miller 2014b).

I have so far argued that cinema is political, that eye tracking studies have required some political maneuvering in order to legitimate themselves, and that the focus on continuity editing/mainstream cinema by eye tracking studies may also have a political dimension. However, are eye tracking studies themselves without methodological politics, in that they simply report findings? I wish presently to suggest that eye tracking research does have methodological limitations – which is why I asserted above that eye tracking film studies are only to a large extent and not entirely reliable – and that these limitations also have a political dimension. The methodological limitations are not simply a case of potential inaccuracies regarding the type of eye-tracker used, how long the eye needs to be still for a fixation to be counted, what algorithm is used to measure this, or how accurate the eye-tracker is in determining where exactly the eye is looking – all ongoing issues with eye tracking technologies (see, inter alia, Wass et al. 2013; Saez de Urabain et al. 2014). It is also a case of issues of statistical significance and the politics thereof, particularly what I shall call the temporal politics, and to a lesser extent the social politics, of eye tracking. In relation to the latter, many eye tracking studies recruit students as their participants (e.g. Tatler et al. 2010; Võ et al. 2012). As a result, the findings might pertain not universally, but to population members who are of a certain age and, if we can say that university students tend to be from more affluent backgrounds, a certain socioeconomic status.
In relation to statistical significance, meanwhile, all studies tend to discount those viewers who do not look where the researchers want them to look; for example, in a study of where people look when viewing moving faces, only 87 per cent of fixations targeted the face region when shown a moving face with sound, with that figure dropping to 82 per cent when shown a moving face without sound (Võ et al. 2012, 7). Of course, when what one is investigating is where people look when they look at faces, it is correct to discount those 13-18 per cent of fixations that were not directed at the face. But the point is that similar discounts happen all the time, not least in the process of averaging that we see in various experiments, including those mentioned by Marchant et al., Hasson et al., and Smith. And yet, where neuroscience is based in large part upon the study of anomalous brains – from autists to damage sufferers to perceived geniuses – psychologists engaged in eye tracking tend to go with force majeure and report the average, or what most people do. There may however be in human populations a ‘long tail’ (to use the terminology of Chris Anderson, 2006) that may not in any one experiment be statistically significant, but which over a number of experiments might begin to show patterns that could help us to understand vision and attention in a more ‘holistic’ fashion.

To continue by way of another anecdote: a film scholar took part in an eye tracking film study at a leading European university. Upon completion, the colleague conducting the study told the scholar that they looked in completely different places – generally at the margins of the screen – to where most of the other participants looked, and that their participation was therefore useless to the study. If we can say that the film scholar looked (perhaps deliberately) where others do not look, then to what degree is film viewing a matter of, to use Turvey’s language, management and engagement? That is, do film scholars look differently at films, perhaps even at the world? And if so, what can we make of this?

The Russian ‘godfather’ of eye tracking studies, Alfred Yarbus, famously reported in the 1960s that setting viewers different tasks will completely modify where they look at an image (Yarbus 1967; see also Tatler et al. 2010). There is much to extrapolate from this. For while eye tracking studies will use terms like ‘naïve’ to define how participants are unaware of the aims of the study, when it comes to film viewing, humans are rarely naïve at all. Advertising, reviews and other publicity materials are always – at least on an implicit level – telling us how and where to look at films, just as the media and our conspecifics are telling us how and where to look in the real world. Now, it may well be that humans who have never before seen a movie have little trouble understanding Hollywood cinema, as affirmed, inter alia, by both Messaris (2012, 31-33) and Smith (2012b, 74). Nonetheless, our attention is not just managed and engaged in the cinema, but it is also managed and engaged for the cinema, and I have not read any studies where psychologists showed a non-Hollywood film to first-time audiences and in which those audiences had trouble understanding the film. That is, these studies affirm nothing about the comprehension of continuity editing per se, although they might affirm that humans can understand cinema without training – as is presumably affirmed worldwide every day, as the first film shown to children is not a Hollywood film but a Bollywood, Nollywood, Filipino, Chinese or other movie. What is more, the studies perhaps only affirm the cultural hegemony enjoyed by Hollywood, in that psychologists present a Hollywood and not another film to those first-time viewers – and then use that research to affirm Hollywood’s economic primacy as being a result of its filmmaking style and not also as a result of historical and other factors. As Cynthia Freeland reminds us in her response to Smith’s AToCC, James Peterson in Post Theory argued that

a common feature of avant-garde film viewing – one that usually passes without comment: viewers initially have difficulty comprehending avant-garde films, but they learn to make sense of them. Students who take my course in the avant-garde cinema are at first completely confused by the films I show; by the end of term, they can speak intelligently about the films they see. (Peterson 1996, 110; quoted in Freeland 2012, 41)

In other words, as per my assertions re: The Room above, it is quite possible that humans would quite easily watch – and enjoy – all manner of different films, but that they do not because their attention is not ‘managed and engaged’. Again, this is a political issue, because if it is true, then it is about who can afford to use the mass media to manage and engage the attention of the most people in the quest for profit – meaning that alternative approaches to filmmaking are forced either to adopt the same system of filmmaking to compete, or they are pushed to the margins where they struggle to find audiences – because people are not prepped to watch them. The scholar at the European university has had a long education in film, and this potentially helps to manage and engage differently how they attend to films; their ‘statistically insignificant’ response might well be important in helping to demonstrate how we can not just view different/marginal films, but also view mainstream films differently.

Cutting and colleagues suggest that film editing correlates with a 1/f pattern, with 1/f (one over frequency) referring to the ‘natural’ amount of time that humans attend to objects in the real world (Cutting et al. 2010). In other words, the suggestion is that Hollywood editing rhythms reflect human attention spans – ‘evolving toward 1/f spectra… [meaning that] the mind can be “lost”… most easily in a temporal art form with that structure’ (Cutting et al. 2010, 7). Now, since David L Gilden only came up with the 1/f structure in 1995 (Gilden et al. 1995), it remains untested, and untestable without a time machine, as to whether the human attention span itself changes over time, or according to culture. That said, if cinema has always been going at about the pace that human attention was working, and if cinema cutting rates have accelerated since the 1930s and through to the present era, then attention spans may well interact with culture, and even be shaped by our media.

I often ask my students how long they should look at a painting for. It’s a trick question, because of course there is no right or wrong answer. It is my (untested) hypothesis, however, that the amount of time humans look at paintings has been shaped by the media, including films; that is, in galleries, I see people look at paintings for about the average duration of a film shot (four to five seconds) – although recently they have begun to look at a painting for about the amount of time that it takes them to take a photo of that painting with their mobile handheld device.2 Smith, citing Cutting’s work, suggests that

[i]n an average movie theatre with a 40-foot screen viewed at a distance of 35 feet, this region at the centre of our gaze will only cover about 0.19 per cent of the total screen area. Given that the average shot length of most films produced today is less than 4 seconds… viewers will only be able to make at most 20 fixations covering only 3.8 per cent of the screen area. (Smith 2013: 168)

Given that paintings vary in size, one cannot rightly say how long it would take to see a ‘whole’ painting. But if one looks at a cinema-screen sized painting for 4 seconds, then one would, after Smith, fixate on about 4 per cent of that painting. In order to see the whole painting, more time is needed, just as more time is needed to take in our natural, rather than cinematic, environment, since we also only ever see a small proportion of that at any one time.

Relating to film the foregoing foray into painting, we might add that, given that we do not take in visual information while saccading, and given that saccades have a duration of 20-50 milliseconds (Smith 2013, 168), this means that we do not take in visual information for 0.7 seconds during every four-second shot. At 90 minutes in length, there are on average 1,350 shots per film, meaning that we do not take in visual information for 15 minutes and 45 seconds per film – blinks and turning away from the screen for snogging and toilet breaks not included. If spatially we only see 3.8 per cent of the screen during a shot, and if we only see 82.5 per cent of a film’s duration, this means that we see around 3.14 per cent of the average (Hollywood) film (no spooky π references intended).3 To be clear, these statistics apply not just to Hollywood: I would only see 3.14 per cent of Empire if I were to watch it at the cinema, too. But since it is a film comprised of a single-seeming shot and a static frame, Empire clearly encourages viewers to look for longer at the space within the frame, while Hollywood arguably does not give viewers the time to do so, since the content and duration of images is concerned uniquely with story-telling, and not with anything else. This in turn affects for how long we think that we are supposed to look at objects in our everyday lives, if for the sake of argument my gallery hypothesis be allowed to stand. Neither paintings, nor Empire, nor the world itself is organized to be seen ‘cinematically’, even if Empire is undoubtedly a work of cinema. That is, they all invite contemplation, but what they often receive is a shot-length of attention before they become boring (Empire perhaps deliberately so). Neither paintings, nor Empire, nor large swathes of the world itself controls our attention in the way cinema does; there would be much more idiosyncrasy and less synchrony of attention when looking at Empire than at a mainstream film.
If the proliferation of screens featuring cinematic techniques is the making-cinematic of reality in the services of capital, then the refusal to attend to paintings, Empire and the world itself suggests not just that our attention is controlled while watching a film, but that our attention is working at a ‘cinematic’ rhythm – a rhythm that Empire uses the very apparatus of cinema in order to try to break.
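
For readers who wish to check the back-of-envelope figures given above, the arithmetic can be retraced in a few lines of Python. This is only an illustrative sketch: it assumes a mean saccade of 35 milliseconds (the midpoint of the 20-50 millisecond range cited), one saccade for each of the 20 fixations in a four-second shot, and Smith’s figure of 0.19 per cent of the screen per fixation. The variable names are my own and carry no technical weight.

```python
# Retracing the rough viewing estimates discussed in the text.
# Assumptions (illustrative, not measurements): a 35 ms mean saccade
# (midpoint of 20-50 ms) and one saccade per fixation.

FILM_MINUTES = 90
SHOT_SECONDS = 4.0
FIXATIONS_PER_SHOT = 20
FOVEAL_SHARE = 0.19        # per cent of screen covered per fixation
SACCADE_SECONDS = 0.035    # assumed mean saccade duration

film_seconds = FILM_MINUTES * 60
shots = film_seconds / SHOT_SECONDS                    # shots per film
blind_per_shot = FIXATIONS_PER_SHOT * SACCADE_SECONDS  # seconds lost per shot
blind_total = shots * blind_per_shot                   # seconds lost per film
minutes, seconds = divmod(round(blind_total), 60)

time_seen = 100 * (1 - blind_total / film_seconds)     # per cent of duration
space_seen = FIXATIONS_PER_SHOT * FOVEAL_SHARE         # per cent of screen
overall = space_seen * time_seen / 100                 # per cent of the film

print(f"{shots:.0f} shots; {minutes} min {seconds} s without visual intake")
print(f"time seen: {time_seen:.1f}%, space seen: {space_seen:.1f}%")
print(f"overall share of the film seen: roughly {overall:.3f}%")
```

Changing the assumed saccade duration to either end of the 20-50 millisecond range shifts the overall figure only slightly, which suggests that the spatial estimate, rather than the temporal one, does most of the work in the calculation.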

The ‘temporal politics’ that I mentioned above, then, is to do with the management and engagement of attention rhythms/patterns not just in cinema, and not just for cinema (we are prepped to be movie viewers), but also by cinema for the world (people pay attention to paintings in galleries about as long as they would attend to a film shot/as long as a film shot would allow them to attend to it, before ‘cutting’, or turning away, likely getting out one’s phone, the screen of which one can also cut across with the swipe of a thumb). Politics rear their head again as homogeneity of attention span, perhaps even of life rhythm, jumps into bed with the political and economic concerns that govern the structures of our society. Almost certainly in an unwitting fashion (this is not a conspiracy), validating certain cutting rates and attention spans over others becomes an issue linked to social control, and the economic bottom line of both cinema and perhaps society as a whole. Eye tracking studies of film are part of this political ecology.

A final throw of the dice. Those of us engaged in education are of course part of a system that prepares our students for the real world. But I am personally also committed to encouraging my students sociably and communicatively to develop their individuality, to become ‘idiosyncratic’, to look at the world differently and various other notions that have long since been corporatized disingenuously as advertising slogans. Being a film teacher, I do this through encouraging my students to look differently at films. Hollywood films employ techniques that do not encourage us to look differently at movies; instead, our attention (and our brain activity) are synchronized. What is more, the idiosyncratic viewers that do look at films differently (the European film scholar) are discounted from eye tracking studies for not conforming to the norm (for not confirming to us what we already know, even if not through a scientific framework). Not only might we encourage our students to look at the world differently (to become the idiosyncratic, perhaps ‘educated’ viewer), but we might also encourage our students to make films differently, since films can also play a role in encouraging us to see the world differently, to become ‘idiosyncratic’ individuals (Hasson’s research involved the production of an interesting avant garde work, regardless of his own thoughts on the matter). Perhaps eye tracking (and fMRI) studies can help in this by turning their attention not to the majority, but to the minority, to the marginal people who look, both figuratively and literally, at the margins of the screen, and at marginal films. And this perhaps involves slowing attention down, and making it (willfully?) deeper rather than rapid and superficial. I know that the longer I look at a painting, the more the power of its creation comes to my mind, the more I marvel at it and also at the world that sustains it. In other words, it brings me joy. 
As I repeat often to those students who do not seem committed to participating in my classes: the more you put in, the more you get out.

Would to educate (to manage and engage attention) both in the classroom and through making and showing different sorts of (slower?) films not simply replace one trend with another, and itself be prey to political issues regarding what type of ‘idiosyncrasy’ is best? Of course, such questions are going to be of ongoing importance and would need constant attention. In relation to eye tracking film studies, though, the introduction of a ‘temporal’ dimension might help enrich our understanding of idiosyncrasy. The spatial information that idiosyncratic eye-tracks give to us is chaotic and without pattern – and thus of not much use to the psychologist; however, there may well be temporal patterns that emerge when we consider ‘idiosyncrasy’ as a shared process (to be encouraged?), rather than as a reified thing to be commoditized.

Paul Willemen has written about cinephilia as being the search for/paying attention to otherwise overlooked details in movies (Willemen 1994, 223-57). Meanwhile, Laura Mulvey has argued that DVD technology allows the film viewer to develop a deeper, cinephilic relationship with movies, since she can now pause and really analyse a film – by ‘delaying’ it/slowing it down (Mulvey 2006, 144-60). To look idiosyncratically at a movie is thus to look ‘cinephilically’; it is to look at cinema with love, perhaps to look with love tout court – but in this instance at cinema. My argument comes full circle, then, as we bring cognitive film theory, via eye tracking film studies, into contact with film theory/film-philosophy, exemplified here by Mulvey as a major figure from the Screen movement/moment. There is no I in eye tracking – but if we can accept that eye tracking studies of cinema are embroiled in a political discourse (and a political reality) concerning which films are validated as better than others and why, then perhaps by putting an ‘I’ into eye tracking, by looking at the idiosyncratic in addition to the statistically significant, we may be able to bring about different ways of seeing and making films.


  1. The exception is Dancer in the Dark (Lars von Trier, Spain/Argentina/ Denmark/Germany/Netherlands/Italy/USA/UK/France/Sweden/Finland/ Iceland/Norway, 2000).
  2. One of my peer reviewers took issue with the speculative nature of this suggestion. The other agreed with it.
  3. Note that I insist on the term ‘visual information’ – since film does not just engage us visually, but also aurally and via other senses (as Freeland, 2012, also reminds Smith in her response to his AToCC essay).



William Brown is a Senior Lecturer in Film at the University of Roehampton, London. He is the author of Supercinema: Film-Philosophy for the Digital Age (Berghahn, 2013) and, with Dina Iordanova and Leshu Torchin, of Moving People, Moving Images: Cinema and Trafficking in the New Europe (St Andrews Film Studies, 2010). He is the co-editor, with David Martin-Jones, of Deleuze and Film (Edinburgh University Press, 2012). He is also a filmmaker.

From Subtitles to SMS: Eye Tracking, Texting and Sherlock – Tessa Dwyer


As we progress into the digital age, text is experiencing a resurgence and reshaping as blogging, tweeting and phone messaging establish new textual forms and frameworks. At the same time, an intrusive layer of text, obviously added in post, has started to feature on mainstream screen media – from the running subtitles of TV news broadcasts to the creative portrayals of mobile phone texting on film and TV dramas. In this paper, I examine the free-floating text used in BBC series Sherlock (2010–). While commentators laud this series for the novel way it integrates text into its narrative, aesthetic and characterisation, it requires eye tracking to unpack the cognitive implications involved. Through recourse to eye tracking data on image and textual processing, I revisit distinctions between reading and viewing, attraction and distraction, while addressing a range of issues relating to eye bias, media access and multimodal redundancy effects.


Figure 1: Press conference in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.


BBC’s Sherlock (2010–) has received considerable acclaim for its creative deployment of text to convey thought processes and, most notably, to depict mobile phone messaging. Receiving high-profile write-ups in The Wall Street Journal (Dodes, 2013) and Wired UK, this innovative representational strategy has been hailed an incisive reflection of our current “transhuman” reality and “a core element of the series’ identity” (McMillan 2014).[1] In the following discussion, I deploy eye tracking data to develop an alternate perspective on this phenomenon. While Sherlock’s on-screen text directly engages with the emerging modalities of digital and online technologies, it also borrows from more conventional textual tools like subtitling and captioning or SDH (subtitling for the deaf and hard-of-hearing). Most emphatically, the presence of floating text in Sherlock challenges the presumption that screen media is made to be viewed, not read. To explore this challenge in detail, I bring Sherlock’s inventive titling into contact with eye tracking research on subtitle processing, using insights from audiovisual translation (AVT) studies to investigate the complexities involved in processing dynamic text on moving-image screens. Bridging screen and translation studies via eye tracking, I consider recent on-screen text developments in relation to issues of media access and linguistic diversity, noting the gaps or blind spots that regularly infiltrate research frameworks. Discussion focuses on ‘A Study in Pink’ – the first episode of Sherlock’s initial season – which producer Sue Vertue explains was actually “written and shot last, and so could make the best use of onscreen text as additional script and plot points” (qtd in McMillan, 2014).

Texting Sherlock


Figure 2: Watson reads a text message in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

The phenomenon under investigation in this article is by no means easy to define. Already it has inspired neologisms, word mashes and acronyms including TELOP (television optical projection), ‘impact captioning’ (Sasamoto, 2014), ‘decotitles’ (Kofoed, 2011), ‘beyond screen text messaging’ (Zhang 2014) and ‘authorial titling’ (Pérez González, 2012). While slight differences in meaning separate such terms from one another, the on-screen text in Sherlock fits all. Hence, in this discussion, I alternate between them and often default to more general terms like ‘titling’ and ‘on-screen text’ for their wide applicability across viewing devices and subject matter. This approach preserves the terminological ambiguity that attaches to this phenomenon instead of seeking to solve it, finding it symptomatic of the rapid rate of technological development with which it engages. Whatever term is decided upon today could well be obsolete tomorrow. Additionally, as Rick Altman (2004: 16) notes in his ‘crisis historiography’ of silent and early sound film, the “apparently innocuous process of naming is actually one of culture’s most powerful forms of appropriation.” He argues that in the context of new technologies and the representational codes they engender, terminological variance and confusion signals an identity crisis “reflected in every aspect of the new technology’s socially defined existence” (19).

According to the write-ups, phone messaging is the hero of BBC’s updated and rebooted Sherlock adaptation. Almost all the press garnered around Sherlock’s on-screen text links this strategy to mobile phone ‘texting’ or SMS (short messaging service). Reporting on “the storytelling challenges of a world filled with unglamorous smartphones, texting and social media”, The Wall Street Journal’s Rachel Dodes (2013) credits Sherlock with solving this dilemma and establishing a new convention for depicting texting on the big screen, creatively capturing “the real world’s digital transformation of everyday life.” For Mariel Calloway (2013), “Sherlock is honest about the role of technology and social media in daily life and daily thought… the seamless way that text messages and internet searches integrate into our lives.” Wired’s Graeme McMillan (2014) ups the ante, naming Sherlock a “new take” on “television drama as a whole” due precisely to its on-screen texting technique that sets it apart from other “tech-savvy shows out there”. McMillan continues that “as with so many aspects of Sherlock, there’s an element of misdirection going on here, with the fun, eye-catching slickness of the visualization distracting from a deeper commentary the show is making about its characters relationship with technology – and, by extension, our own relationship with it, as well.”

As this flurry of media attention makes clear, praise for Sherlock’s on-screen text or texting firmly anchors this strategy to technology and its newly evolving forms, most notably the iPhone or smartphone. Appearing consistently throughout the series’ three seasons to date, on-screen text in Sherlock occurs in a plain, uniform white sans-serif font that appears unadorned over the screen image, obviously added during post-production. This text is superimposed, pure and simple, relying on neither text bubbles nor coloured boxes nor sender IDs to formally separate it from the rest of the image area. As Michele Tepper (2011) eloquently notes, by utilising text in this way, Sherlock “is capturing the viewer’s screen as part of the narrative itself”:

It’s a remarkably elegant solution from director Paul McGuigan. And it works because we, the viewing audience, have been trained to understand it by the last several years of service-driven, multi-platform, multi-screen applications. Last week’s iCloud announcement is just the latest iteration of what can happen when your data is in the cloud and can be accessed by a wide range of smart-enough devices. Your VOIP phone can show caller ID on your TV; your iPod can talk to both your car and your sneakers; Twitter is equally accessible via SMS or a desktop application. It doesn’t matter where or what the screen is, as long as it’s connected to a network device. … In this technological environment, the visual conceit that Sherlock’s text message could migrate from John Watson’s screen to ours makes complete and utter sense.

Unlike on-screen text in Glee (Fox, 2009–), for instance (see Fig. 3), that is used only occasionally in episodes like ‘Feud’ (Season 4, Ep 16, March 14, 2013), Sherlock flaunts its on-screen text as signature. Its consistently interesting textual play helps to give the series cohesion. Yet, just as it aids in characterisation, helps to progress the narrative, and binds the series as a whole, it also, necessarily, remains at somewhat of a remove, as an overtly post-production effect.


Figure 3: Ryder chats online in ‘Feud’, Glee (2013), Episode 16, Season 4.

While Tepper (2011) explains how Sherlock’s “disembodied” (Banks, 2014) texting ‘makes sense’ in the age of cross-platform devices and online clouds, this argument falters when the on-screen text in question is less overtly technological. The extradiegetic nature of this on-screen text – so obviously a ‘post’ effect – is brought to the fore when it is used to render thoughts and emotions rather than technological interfacing. In ‘A Study in Pink’, a large proportion of the text that pops up intermittently on-screen functions to represent Sherlock’s interiority, not his Internet prowess. In concert with camera angles and “microscopic close-ups”, it elucidates Sherlock’s forensic “mind’s eye” (Redmond, Sita and Vincs, this issue), highlighting clues and literally spelling out their significance (see Figs. 4 and 5). The fact that these human-coded moments of titling have received far less attention in the press than those that more directly index new technologies is fascinating in itself, revealing the degree to which praise for Sherlock’s on-screen text is invested in ideas of newness and technological innovation – underlined by the predilection for neologisms.


Figure 4: Sherlock examines the pink lady’s ring in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.


Figure 5: Sherlock examines the pink lady’s ring in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

Of course, even when not attached to smartphones or data retrieval, Sherlock’s deployment of on-screen text remains fresh, creative and playful and still signals perceptual shifts resulting from technological transformation. Even when representing Sherlock’s thoughts, text flashes on screen manage to recall the excesses of the digital, when email, Facebook and Twitter ensconce us in streams of endlessly circulating words, and textual pop-ups are ubiquitous. Nevertheless, the blinkered way in which Sherlock’s on-screen text is repeatedly framed as, above all, a means of representing mobile phone texting functions to conceal some of its links to older, more conventional forms of titling and textual intervention, from silent-era intertitles to expository titles to subtitles. By relentlessly emphasising its newness, much discussion of Sherlock’s on-screen text overlooks links to a host of related past and present practices. Moreover, Sherlock’s textual play actually invites a rethinking of these older, ongoing text-on-screen devices.

Reading, Watching, Listening

As Szarkowska and Kruger (this issue) explain, research into subtitle processing builds upon earlier eye tracking studies on the reading of static, printed text. They proceed to detail differences between subtitle and ‘regular’ reading, in relation to factors like presentation speed, information redundancy, and sensory competition between different multimodal channels. Here, I focus on differences between saccadic or scanning movements and fixations, in order to compare data across the screen and translation fields. During ‘regular’ reading (of static texts) average saccades last 20 to 50 milliseconds (ms) while fixations range between 100 and 500ms, averaging 200 to 300ms (Rayner, 1998). Referencing pioneering studies into subtitle processing by Géry d’Ydewalle and associates, Szarkowska et al. (2013: 155) note that “when reading film subtitles, as opposed to print, viewers tend to make more regressions” and fixations tend to be shorter. Regressions occur when the eye returns to material that has already been read, and Rayner (1998: 393) finds that slower readers (of static text) make more regressions than faster readers. A study by d’Ydewalle and de Bruycker (2007: 202) found “the percentage of regressions in reading subtitles was globally, among children and adults, much higher than in normal text reading.” They also report that mean fixation durations in the subtitles were shorter, at 178 ms (for adults) and explain that subtitle regressions (where the eye travels back across words already read) can be partly explained by the “considerable information redundancy” that occurs when “[s]ubtitle, soundtrack (including the voice and additional information such as intonation, background noise, etc.), and image all provide partially overlapping information, eliciting back and forth shifts with the image and more regressive eye-movements” (202).

What happens to saccades and fixations when image processing is brought into the mix? When looking at static images, average fixations last 330 ms (Rayner, 1998). This figure is slightly longer than average fixations during regular reading and longer again than average subtitle fixations. Szarkowska and Kruger (this issue) note that “reading requires many successive fixations to extract information whereas looking at a scene requires fewer, but longer fixations” that tend to be more exploratory or ambient in nature, taking in a greater area of focus. In relation to moving-images, Smith (2013: 168) finds that viewers take in roughly 3.8% of the total screen area during an average length shot. Peripheral processing is at play but “is mostly reserved for selecting future saccade targets, tracking moving targets, and extracting gist about scene category, layout and vague object information”. In thinking about these differences in regular reading behaviour, screen viewing, and subtitle processing, it is noticeable that with subtitles, distinctions between fixations and saccades are less clear-cut. While saccades last between 20 and 50ms, Smith (2013: 169) notes that the smallest amount of time taken to perform a saccadic eye movement (taking into account saccadic reaction time) is 100-130ms. Recalling d’Ydewalle and de Bruycker’s (2007: 202) finding that fixations during subtitle processing last around 178ms, it would seem that subtitle conditions blur the boundaries somewhat between saccades and fixations, scanning and reading.

Interestingly, studies have also shown that the processing of two-line subtitles involves more regular word-by-word reading than for one-liners (D’Ydewalle and de Bruycker, 2007: 199). D’Ydewalle and de Bruycker (2007: 199) report, for instance, that more words are skipped and more regressions occur for one-line subtitles than for two-line subtitles. Two-line subtitles result in a larger proportion of time being spent in the subtitle area, and occasion more back-and-forth shifts between the subtitles and the remaining image area (201). This finding suggests that the processing of one-line subtitles differs considerably from regular reading behaviour. D’Ydewalle and de Bruycker (2007: 202) surmise that the distinct way in which one-line subtitles are processed relates to a redundancy effect caused by the multimodal nature of screen media. Noting how one-line subtitles often convey short exclamations and outcries, they suggest that a “standard one-line subtitle generally does not provide much more information than what can already be extracted from the picture and the auditory message.” They conclude that one-line subtitles occasion “less reading” than two-line subtitles (202). Extrapolating further, I posit that the routine overlapping of information that occurs in subtitled screen media blurs lines between reading and watching. One-line subtitles are ‘read’ irregularly and partly blind – that is, they are regularly skipped and processed through saccadic eye movements rather than fixations.

This suggestion is supported by data on subtitle skipping. Szarkowska and Kruger (this issue) find that longer subtitles containing frequently used words are easier and quicker to process than shorter subtitles containing low-frequency words. Hence, they conclude that cognitive load relates more to word familiarity than quantity, something that is overlooked in many professional subtitling guidelines. This finding indicates that high-frequency words are processed ‘differently’ in subtitling than in static text, in a manner more akin to visual recognition or scanning than reading. Szarkowska and Kruger find that high-frequency words in subtitles are often skipped. Hence, as with one-line subtitles, high-frequency words are, to a degree, processed blind, possibly through shape recognition and mapping more than durational focus. In relation to other types of on-screen text, such as the short, free-floating type that characterises Sherlock, it seems entirely possible that this innovative mode of titling may just challenge distinctions between text and image processing. While commentators laud this series for the way it integrates on-screen text into its narrative, style and characterisation, eye tracking is required to unpack the cognitive implications of Sherlock’s text/image morph.

The Pink Lady

Figure 6: Letters scratched into the floor in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

Sherlock producer Vertue refers to the pink lady scene in ‘A Study in Pink’ as particularly noteworthy for its “text all around the screen”, referring to it as the “best use” of on-screen text in the series (qtd in McMillan, 2014). In this scene, a dead woman dressed in pink lies face first on the floor of a derelict building into which she has painstakingly etched a word or series of letters (‘Rache’) with her fingernails. As Sherlock investigates the crime scene, forensics officer Anderson interrupts to explain that ‘Rache’ is the German word for ‘revenge’. The German-to-English translation pops up on screen (see Fig. 6), and this time Sherlock sees it too. This superimposed text, so obviously laid over the image, oversteps its surface positioning to enter Sherlock’s diegetic space, and we next view it backwards, from Sherlock’s point of view, not ours (see Fig. 7). After an exasperated eye roll that signals his disregard for Anderson, Sherlock dismisses this textual intervention and we watch it swirl into oblivion. Here, on-screen text is at once both inside and outside the narrative, diegetic and extra-diegetic, informative and affecting. In this way it self-reflexively draws attention to the show’s narrative framing, demonstrating its complexity as distinct diegetic levels merge.

Figure 7: Sherlock sees on-screen text in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

For Carol O’Sullivan (2011), when on-screen text affords this type of play between the diegetic and extra-diegetic it functions as an “extreme anti-naturalistic device” (166) that she unpacks via Gérard Genette’s notion of narrative metalepsis (164). Detailing numerous examples of humorous, formally transgressive diegetic subtitles, such as those found in Annie Hall (Woody Allen, 1977) (Fig. 8), O’Sullivan points to their metatextual function, referring to them as “metasubtitles” (166) that implicitly comment on the limits and nature of subtitling itself. When Sherlock’s on-screen titles oscillate between character and viewer point-of-view shots, they too become metatextual, demonstrating, in Genette’s terms, “the importance of the boundary they tax their ingenuity to overstep in defiance of verisimilitude – a boundary that is precisely the narrating (or the performance) itself: a shifting but sacred frontier between two worlds, the world in which one tells, the world of which one tells” (qtd in O’Sullivan 2011: 165). Moreover, for O’Sullivan, “all subtitles are metatextual” (166), necessarily foregrounding their own act of mediation and interpretation. Specifically linking such ideas to Sherlock, Luis Perez Gonzalez (2012: 18) notes how “the series creators incorporate titles that draw attention to the material apparatus of filmic production”, thereby creating a complex alienation-attraction effect “that shapes audience engagement by commenting upon the diegetic action and disrupting conventional forms of semiotic representation, making viewers consciously work as co-creators of media content.”

Figure 8: Subtitled thoughts in the balcony scene, Annie Hall (1977).

Eye Bias

One finding from subtitle eye tracking research particularly pertinent to Sherlock is the notion that on-screen text causes eye bias. This was established in various studies conducted by d’Ydewalle and associates, which found that subtitle processing is largely automatic and obligatory. D’Ydewalle and de Bruycker (2007: 196) state:

Paying attention to the subtitle at its presentation onset is more or less obligatory and is unaffected by major contextual factors such as the availability of the soundtrack, knowledge of the foreign language in the soundtrack, and important episodic characteristics of actions in the movie: Switching attention from the visual image to “reading” the subtitles happens effortlessly and almost automatically (196).

This point is confirmed by Bisson et al. (2014: 399) who report that participants read subtitles even in ‘reversed’ conditions – that is, when subtitles are rendered in an unfamiliar language and the screen audio is fully comprehensible (in the viewers’ first language) (413). Again, in intralingual or same-language subtitling – when titles replicate the language spoken on screen – hearing audiences still divert to the subtitle area (413). These findings indicate that viewers track subtitles irrespective of language or accessibility requirements. In fact, the tracking of subtitles overrides function. As Bisson et al. (413) surmise, “the dynamic nature of the subtitles, i.e., the appearance and disappearance of the subtitles on the screen, coupled with the fact that the subtitles contained words was enough to generate reading behavior”.

Szarkowska and Kruger (this issue) reach a similar conclusion, explaining eye bias towards subtitles in terms of both bottom-up and top-down impulses. When subtitles or other forms of text flash up on screen, they effect a change to the scene that automatically pulls our eyes. The appearance and disappearance of text on screen is registered in terms of motion contrast, which according to Smith (2013: 176), is the “critical component predicting gaze behavior”, attaching to small movements as well as big. Additionally, we are drawn to words on screen because we identify them as a ready source of relevant information, as found in Batty et al. (forthcoming). Analysing a dialogue-free montage sequence from animated feature Up (Pete Docter, 2009), Batty et al. found that on-screen text in the form of signage replicates in miniature how ‘classical’ montage functions as a condensed form of storytelling aiming for enhanced communication and exposition. They suggest that montage offers a rhetorical amplification of an implicit intertitle, thereby alluding to the historical roots of text on screen while underlining its narrative as well as visual salience. One frame from the montage sequence focuses in close-up on a basket containing picnic items and airline tickets (see Fig. 9). Eye tracking tests conducted on twelve participants indicate a high degree of attentional synchrony in relation to the text elements of the airline ticket on which Ellie’s name is printed. Here, text provides a highly expedient visual clue as to the narrative significance of the scene and viewers are drawn to it precisely for its intertitle-like, expository function, highlighting the top-down impulse also at play in the eye bias caused by on-screen text.

Figure 9: Heat map showing collective gaze weightings during the montage sequence in Up (2009).

In this image from Up, printed text appears in the centre of the frame and, as Smith (2013: 178) elucidates, eyes are instinctively drawn towards frame centre, a finding backed up by much subtitle research (see Szarkowska and Kruger, this issue). However, eye tracking results on Sherlock conducted by Redmond, Sita and Vincs (this issue) indicate that viewers also scan static text when it is not in the centre of the frame. In an establishing shot of 221B Baker Street from the first episode of Sherlock’s second season, ‘A Scandal in Belgravia’, viewers track static text that borders the frame across its top and right hand sides, again searching for information (see Fig. 10). Hence, the eye-pull exerted by text is noticeable even in the absence of movement, contrast and central framing. In part, viewers are attracted to text simply because it is text – identified as an efficient communication mode that facilitates speedy comprehension (see Lavaur, 2011: 457).

Figure 10: Single viewer gaze path for ‘A Scandal in Belgravia’, Sherlock (2012), Episode 1, Season 2.


What do these eye tracking results across screen and translation studies tell us about Sherlock’s innovative use of on-screen text and texting? Based on the notion that text on screen draws the eye in at least dual ways, due to both its dynamic/contrastive nature and its communicative expediency, we can surmise that for Sherlock viewers, on-screen text is highly visible and more than likely to be in that 3.8% of the screen on which they will focus at any one point in time (see Smith, 2013: 168). The marked eye bias caused by text on screen is further accentuated in Sherlock by the freshness of its textual flashes, especially for English-speaking audiences given the language hierarchies of global screen media (see Acland 2012, UNESCO 2013). The small percentage of foreign-language media imported into most English-speaking markets tends to result in a lack of familiarity with subtitling beyond niche audience segments. For those unfamiliar with subtitling or captioning, on-screen text appears particularly novel. Additionally, as explored, floating TELOPs in Sherlock attract attention due to the complex functions they fulfil, providing narrative and character clues as well as textual and stylistic cohesion. As Tepper (2011) points out, in the first episode of the series, viewers are introduced to Sherlock’s character via text, before seeing him on screen. “When he texts the word ‘Wrong!’ to DI Lestrade and all the reporters at Lestrade’s press conference,” notes Tepper, “the technological savvy and the imperiousness of tone tell you most of what you need to know about the character.”

There seems no doubt that on-screen text in Sherlock attracts eye movement, and that it therefore distracts from other parts of the image. One question then that immediately presents itself is why Sherlock’s textual distractions are tolerated – even celebrated – to a far greater extent than other, more conventional or routine forms of titling like subtitles and captions. While Sherlock’s on-screen text is praised as innovative and incisive, interlingual subtitling and SDH are criticised by detractors for the way in which they supposedly force viewers to read rather than watch, effectively transforming film into “a kind of high-class comic book with sound effects” (Canby, 1983).[2] Certainly, differences in scale affect such attitudes and the quantitative variance between post-subtitles (produced for distribution only) and authorial or diegetic titling (as seen in Sherlock) is pronounced.[3] However, eye tracking research on subtitle processing indicates that, on the whole, viewers easily accommodate the increased cognitive load it presents. Although attentional splitting occurs, leading to an increase in back-and-forth shifts between the subtitles and the rest of the image area (Szarkowska and Kruger, this issue), viewers acclimatise by making shorter fixations than in regular reading and by skipping high-frequency words and subtitles while still managing to register meaning (see d’Ydewalle and de Bruycker, 2007: 199). In this way, subtitle processing reveals many differences to reading of static text, and approximates techniques of visual scanning. Bearing these findings in mind, I propose it is more accurate to see subtitling as transforming reading into viewing and text into image, rather than vice versa.

Situating Sherlock in relation to a range of related TELOP practices across diverse TV genres (such as game shows, panel shows, news broadcasting and dramas), Ryoko Sasamoto (2014: 7) notes that the additional processing effort caused by on-screen text is offset by its editorial function.[4] TELOPs are often deployed by TV producers to guide interpretation and ensure comprehension by selecting and highlighting information deemed most relevant. This suggestion is backed up by research by Rei Matsukawa et al. (2009), which found that the information redundancy effect caused by TELOPs facilitates understanding of TV news. For Sasamoto (2014: 7), ‘impact captioning’ highlights salient information in much the same way as voice intonation or contrastive stress. It acts as a “written prop on screen” enabling “TV producers to achieve their communicative aims… in a highly economical manner” (8). Focusing on Sherlock specifically, Sasamoto suggests that its captioning provides “a route for viewers into complex narratives” (9). Moreover, as Szarkowska and Kruger (this issue) note, in static reading conditions, “longer fixations typically reflect higher cognitive load.” Consequently, the shorter fixations that characterise subtitle viewing support the contention that on-screen text processing is eased by its expedient, editorial function and by redundancy effects resulting from its multimodality.

Switched On

Another way in which Sherlock’s text and titling innovations extend beyond mobile phone usage was exemplified in July 2013 by a promotional campaign that promised viewers a ‘sneak peek’ at a yet-to-be-released episode title, requiring them to find and piece together a series of clues. In true Sherlockian style, the clues were well hidden, only visible to viewers if they switched on closed-captioning or SDH available for deaf and hard-of-hearing audiences. With this device turned on, viewers encountered intralingual captioning along the bottom of their screen and additionally, individually boxed letters that appeared top left (see Figs. 11 and 12). Viewers needed to gather all these single letter clues in order to deduce the episode title: ‘His Last Vow’. According to the ‘I Heart Subtitles’ blog (July 16, 2013), in doing so, Sherlock once again displayed its ability to “think outside the box and consider all…options”. It also cemented its commitment to on-screen text in various guises, and effectively gave voice to an audience segment typically disregarded in screen commentary and analysis. Through this highly unusual, cryptic campaign, Sherlock alerted viewers to more overtly functional forms of titling, and intimated points of connection between language, textual intervention and access.

Figure 11: Boxed letter clues (top left of frame) that appeared when closed captioning was switched on, during a re-run of ‘A Scandal in Belgravia’, Sherlock (2012), Episode 1, Season 2.

Figure 12: Boxed letter clues (top left of frame) that appeared when closed captioning was switched on, during a re-run of ‘A Scandal in Belgravia’, Sherlock (2012), Episode 1, Season 2.


On-screen text invites a rethinking of the visual, expanding its borders and blurring its definitional clarity. Eye tracking research demonstrates that moving text on screens is processed differently to static text, affected by a range of factors issuing from its multimodal complexity. Sherlock subtly signals such issues through its playful, irreverent deployment of text, which enables viewers to directly access Sherlock’s thoughts and understand his reasoning, while also distancing them, asking them to marvel at his ‘millennial’ technological prowess (Stein and Busse, 2012: 11) while remaining self-consciously aware of his complex narrative framing as it flips inside out, inviting audiences to watch themselves watching. Such diegetic transgression is yet to be mapped through eye tracking, intimating a profitable direction for future studies. To date, data on text and image processing demonstrates how on-screen text attracts eye movement and hence, it can be inferred that it distracts from other parts of the image area. Yet, despite rendering more of the image effectively ‘invisible’, text in the form of TELOPs is increasingly prevalent in news broadcasts, current affairs panel shows (when audience text messages are displayed) and, most notably, in Asian TV genres where TELOPs are now a “standard editorial prop” featured in many dramas and game shows (Sasamoto, 2014: 1). In order to take up the challenge presented by such emerging modes of screen address, research needs to move beyond surface assessments of the attraction/distraction nexus. It is the very attraction to TELOP distraction that Sherlock – via eye tracking – brings to the fore.



[1] While some commentators point out that Sherlock was by no means the first to depict text messaging in this way – as floating text on screen – it is this series more than any other that has brought this phenomenon into the limelight. Other notable uses of on-screen text to depict mobile phone messaging occur in films All About Lily Chou-Chou (Iwai, 2001), Disconnect (Rubin, 2013), The Fault in Our Stars (Boone, 2014), LOL (Azuelos, 2012), Non-Stop (Collet-Serra, 2014), Wall Street: Money Never Sleeps (Stone, 2010), and in TV series Glee (Fox, 2009–), House of Cards (Netflix, 2013–), Hollyoaks (Channel 4, 1995–), Married Single Other (ITV, 2010) and Slide (Fox8, 2011). For discussion of some ‘early adopters’, see Biendenharn 2014.



[2] Notably, in this New York Times piece, Canby (1983) actually defends subtitling against this charge, and advocates for subtitling over dubbing.

[3] On distinctions between post-subtitling and pre-subtitling (including diegetic subtitling), see O’Sullivan (2011).

[4] According to Sasamoto (2014: 1), “the use of OCT [Open Caption Telop] as an aid for enhanced viewing experience originated in Japan in 1990.”



Dr Tessa Dwyer teaches Screen Studies at the University of Melbourne, specialising in language politics and issues of screen translation. Her publications have appeared in journals such as The Velvet Light Trap, The Translator and The South Atlantic Quarterly and in a range of anthologies including B is for Bad Cinema (2014), Words, Images and Performances in Translation (2012) and the forthcoming Locating the Voice in Film (2016), Contemporary Publics (2016) and the Routledge Handbook of Audiovisual Translation (2017). In 2008, she co-edited a special issue of Refractory on split screens. She is a member of the ETMI research group and is currently writing a book on error and screen translation.

Subtitles on the Moving Image: an Overview of Eye Tracking Studies – Jan Louis Kruger, Agnieszka Szarkowska and Izabela Krejtz


This article provides an overview of eye tracking studies on subtitling (also known as captioning), and makes recommendations for future cognitive research in the field of audiovisual translation (AVT). We find that most studies in the field that have been conducted to date fail to address the actual processing of verbal information contained in subtitles, and rather focus on the impact of subtitles on viewing behaviour. We also show how eye tracking can be utilised to measure not only the reading of subtitles, but also the impact of stylistic elements such as language usage and technical issues such as the presence of subtitles during shot changes on the cognitive processing of the audiovisual text as a whole. We support our overview with empirical evidence from various eye tracking studies conducted on a number of languages, language combinations, viewing contexts as well as different types of viewers/readers, such as hearing, hard of hearing and Deaf people.


The reading of printed text has received substantial attention from scholars since the 1970s (for an overview of the first two decades see Rayner et al. 1998). Many of these studies, conducted from a psycholinguistic angle, made use of eye tracking. As a result, a large body of knowledge exists on the eye movements during reading of people with varying levels of reading skills and language proficiency, with a range of ages, different first languages and cultural backgrounds, and in different contexts. Studies on subtitle reading, however, have not achieved the same level of scientific rigour largely for practical reasons: subtitles are not static for more than a few seconds at a time; they compete for visual attention with a moving image; and they compete for overall cognitive resources with verbal and non-verbal sounds. This article will identify some of the gaps in current research in the field, and also illustrate how some of these gaps can be bridged.

Studying the reading of subtitles is significantly different from studying the reading of static text. In the first place, as far as eye tracking software is concerned, the subtitles appear on a moving image as image rather than text, which renders traditional text-based reading statistics and software all but useless. This also makes the collection of data for reading research on subtitles a painstakingly slow process involving substantial manual inspections and coding. Secondly, the fact that subtitles appear against the background of the moving image means that they are always in competition with this image, which renders the reading process fundamentally different from the reading process of static texts. On the one hand, the reading of subtitles competes with the processing of the image, sometimes resulting in interrupted reading; on the other hand, the limited time the subtitles are on screen means that readers have less time to reread or regress to study difficult words or to check information. Either way, studying this reading process, and the cognitive processing that takes place during the reading, is much more complicated than in the case of static texts where we know that the reader is mainly focussing on the words before her/him without additional auditory and visual information to process.
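
One way to picture the coding problem described above is to treat each subtitle as a time-bounded area of interest (AOI): a fixation counts as falling on a subtitle only if it lands inside the subtitle region while that subtitle is on screen. The Python sketch below is purely illustrative. The data layout, the field names and the pixel coordinates are assumptions made for the example; they are not drawn from any of the studies discussed here or from any particular eye tracking package.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Subtitle:
    index: int
    onset_ms: int    # time the subtitle appears
    offset_ms: int   # time the subtitle disappears
    # Bounding box of the subtitle area in screen pixels (hypothetical values).
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class Fixation:
    start_ms: int
    duration_ms: int
    x: int
    y: int


def subtitle_for_fixation(fix: Fixation, subtitles: List[Subtitle]) -> Optional[int]:
    """Return the index of the subtitle a fixation falls on, or None."""
    for sub in subtitles:
        on_screen = sub.onset_ms <= fix.start_ms < sub.offset_ms
        inside = sub.left <= fix.x <= sub.right and sub.top <= fix.y <= sub.bottom
        if on_screen and inside:
            return sub.index
    return None


def dwell_time_per_subtitle(fixations: List[Fixation],
                            subtitles: List[Subtitle]) -> Dict[int, int]:
    """Sum fixation durations (ms) inside each subtitle's time-bounded AOI."""
    totals = {sub.index: 0 for sub in subtitles}
    for fix in fixations:
        hit = subtitle_for_fixation(fix, subtitles)
        if hit is not None:
            totals[hit] += fix.duration_ms
    return totals


subtitles = [
    Subtitle(1, 1000, 4000, 200, 900, 1720, 1000),
    Subtitle(2, 4500, 7000, 200, 900, 1720, 1000),
]
fixations = [
    Fixation(500, 300, 960, 540),    # image area, before any subtitle
    Fixation(1100, 200, 300, 950),   # on subtitle 1
    Fixation(1350, 250, 600, 950),   # on subtitle 1
    Fixation(4200, 200, 300, 950),   # subtitle area, but no subtitle on screen
    Fixation(4600, 180, 960, 400),   # image area while subtitle 2 is shown
]
print(dwell_time_per_subtitle(fixations, subtitles))  # {1: 450, 2: 0}
```

In this invented example the second subtitle receives no dwell time at all, which is the kind of case that would be counted as a skipped subtitle.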

While the viewing of subtitles has been the object of many eye tracking studies in recent years, with increasing frequency (see, for example Bisson et al. 2012; d’Ydewalle and Gielen 1992; d’Ydewalle and De Bruycker 2007; Ghia 2012; Krejtz et al. 2013; Kruger 2013; Kruger et al. 2013; Kruger and Steyn 2014; Perego et al. 2010; Rajendran et al. 2013; Specker 2008; Szarkowska et al. 2011; Winke et al. 2013), the study of the reading of subtitles remains a largely uncharted territory with many research avenues still to be explored. Those studies that do venture to measure more than just attention to the subtitle area seldom do so for extended texts.

In this article we provide an overview of studies on how subtitles change the way viewers process audiovisual material, and also of studies on the unique characteristics of the subtitle reading process. Taking an analysis of the differences between reading printed (static) text and subtitles as point of departure, we examine a number of aspects typical of the way subtitle text is processed in reading. We also look at the impact of the dynamic nature of the text and the competition with other sources of information on the reading process (including scene perception, changes in the viewing process, shifts between subtitles and image, visual saliency of text, faces, and movement, and cognitive load), as well as discussing studies on the impact of graphic elements on subtitle reading (e.g. number of lines, and text chunking), and studies that attempt to measure the subtitle reading process in more detail.

We start off with a discussion of the way in which watching an audiovisual text with subtitles alters viewing behaviour as well as of the complexities of studying subtitles due to the dynamic nature of the image it has as a backdrop. Here we focus on the fleeting nature of the subtitle text, the competition between reading the subtitles and scanning the image, and the interaction between different sources of information. We further discuss internal factors that impact on subtitle processing, like the language and culture of the audience, the language of the subtitles, the degree of access the audience has to sound, and other internal factors, before turning to external factors related to the nature of the audiovisual text and the presentation of the subtitles. Finally, we provide an overview of studies attempting to measure the processing of subtitles as well as findings from two studies that approach the processing of subtitles

The dynamic nature of the subtitle reading process

Reading subtitles differs substantially from reading printed text in a number of respects. As opposed to “static text on a stable background”, the viewer of subtitled audiovisual material is confronted with “fleeting text on a dynamic background” (Kruger and Steyn 2014, 105). In consequence, viewers not only need to process and integrate information from different communication channels (verbal visual, non-verbal visual, verbal auditory, non-verbal auditory, see Gottlieb 1998), but they also have no control over the presentation speed (see Kruger and Steyn 2014; Szarkowska et al. forthcoming). As a consequence, unlike in the reading of static texts, the pace of reading is in part dictated by the text rather than the reader – by the time the text is available to be read – and there is much less time for the reader to regress to an earlier part of a sentence or phrase, and no opportunity to return to previous sentences. Reading traditionally takes place in a limited window which the reader is acutely aware will disappear in a few seconds. Even though there are exceptions to the level of control a viewer has, for example in the case of DVD and PVR as well as other electronic media where the viewer can rewind and forward at will, the typical viewing of subtitles for most audiovisual products happens continuously and without pauses just as when watching live television.

Regressions, which form an important consideration in the reading of static text, take on a different aspect in the context of the knowledge (the viewer has) that dwelling too much on any part of a subtitle may make it difficult to finish reading the subtitle before it disappears. Any subtitle is on screen for between one and six seconds, and the viewer also has to simultaneously process all the other auditory (in the case of hearing audiences) and visual cues. In other words, unlike when reading printed text, reading becomes only one of the cognitive processes the viewer has to juggle in order to understand the audiovisual text as a whole. Some regressions are in fact triggered by the change of the image in shot changes (and to a much lesser extent scene changes) when the text stays on across these boundaries, which means that the viewer sometimes returns to the beginning of the subtitle to check whether it is a new subtitle, and sometimes even re-reads the subtitle. For example, in a recent study, Krejtz et al. (2013) established that participants tend not to re-read subtitles after a shot change or cut. But their data also revealed that a proportion of the participants did return their gaze to the beginning of the subtitle after such a change (see also De Linde and Kay, 1999). What this means for the study of subtitle reading is that these momentary returns (even if only for checking) result in a class of regressions that is not in fact a regression to re-read a word or section, but rather a false initiation of reading for what some viewers initially perceive to be a new sentence.

On the positive side, the fact that subtitles are embedded on a moving image and are accompanied by a soundtrack (in the case of hearing audiences) facilitates the processing of language in context. Unfortunately, this context also introduces competition for attention and cognitive resources. For the Deaf and hard of hearing audience, attention has to be divided between reading the subtitles and processing the scene, extracting information from facial expressions, lip movements and gestures, and matching or checking this against the information obtained in the subtitles. For the hearing audience who makes use of subtitles for support or to provide access to foreign language dialogue, attention is likewise divided between subtitles and the visual scene, and just as the Deaf and hard of hearing audiences have the added demand on their cognitive resources of having to match what they read with what they get from non-verbal signs and lip movements, the hearing audience matches what they read with what they hear, checking for correspondence of information and interpreting intonation, tenor and other non-verbal elements of speech.

What stands beyond doubt is that the appearance of subtitles changes the viewing process. In 2000, Jensema et al. famously stated that “the addition of captions to a video resulted in major changes in eye movement patterns, with the viewing process becoming primarily a reading process” (2000a, 275). Having examined the eye movements of six subjects watching video clips with and without subtitles, they found that the onset of a subtitle triggers a change in the eye movement pattern: when a subtitle appears, viewers move their gaze from whatever they were watching in order to follow the subtitle. In a more wide-scale study it was concluded by d’Ydewalle and de Bruycker (2007, 196) that “paying attention to the subtitle at its presentation onset is more or less obligatory and is unaffected by major contextual factors such as the availability of the soundtrack, knowledge of the foreign language in the soundtrack, and important episodic characteristics of actions in the movie: Switching attention from the visual image to ‘reading’ the subtitles happens effortlessly and almost automatically”.

Subtitles therefore appear to be the cause of eye movement bias similar to faces (see Hershler & Hochstein, 2005; Langton, Law, Burton, & Schweinberger, 2008; Yarbus, 1967), the centre of the screen, contrast and movement. In other words, subtitles attract the gaze at least in part because of the fact that the eye is drawn to the words on screen just as the eye is drawn to movement and other elements. Eyes are drawn to subtitles not only because the text is identified as a source of meaningful information (in other words a top-down impulse as the viewer consciously consults the subtitles to obtain relevant information), but also because of the change to the scene that the appearance of a subtitle causes (in other words a bottom-up impulse, automatically drawing the eyes to what has changed on the screen).

As in most other contexts, the degree to which viewers will process the subtitles (i.e. read them rather than merely look at them when they appear and then look away) will be determined by the extent to which they need the subtitles to follow the dialogue or to obtain information on relevant sounds. In studying visual attention to subtitles it therefore remains a priority to measure the degree of processing, something that has not been done in more than a handful of studies, and something to which we will return later in the article.

Viewers usually attend to the image on the screen, but when subtitles appear, it only takes a few frames for most viewers to move their gaze to read the subtitles. The fact that people tend to move their gaze to subtitles the moment they appear on the screen is illustrated in Figures 1 and 2.

Figure 1. Heat maps of three consecutive film stills – Polish news programme Fakty (TVN) with intralingual subtitles.

Figure 2. Heat maps of two consecutive film stills – Polish news programme Wiadomości (TVP1) with intralingual subtitles

Likewise, when the gaze of a group of viewers watching an audiovisual text without subtitles is compared to that of a similar group watching the same text with subtitles, the split in attention is immediately visible as the second group reads the subtitles and attends less to the image, as can be seen in Figure 3.

Figure 3. Heat maps of the same scene seen without subtitles and with subtitles – recording of an academic lecture.

Viewer-internal factors that impact on subtitle processing

The degree to which the subtitles are processed is far from straightforward. In a study performed at a South African university in the context of Sesotho students looking at a recorded lecture with subtitles in their first language and audio in English (their language of instruction), students were found to avoid looking at the subtitles (see Kruger, Hefer and Matthew, 2013b). Sesotho students in a different group who saw the same lecture with English subtitles processed the subtitles to a much larger extent. This contrast is illustrated in the focus maps in Figure 4.


Figure 4. Focus maps of Sesotho students looking at a lecture in intralingual English subtitles (left) and another group looking at the same lecture with interlingual Sesotho subtitles (right) – recording of an academic lecture.

The difference in eye movement behaviour between the conditions is also evident when considering the number of subtitles skipped. Participants in the above study who saw the video with Sesotho subtitles skipped an average of around 50% of the Sesotho subtitles (median at around 58%), whereas participants who saw the video with English subtitles only skipped an average of around 20% of the English subtitles (with a median of around 8%) (see Kruger, Hefer & Matthew, 2014).
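The gap between mean and median reported above is easier to interpret with the computation in view. The following is a minimal sketch, not the procedure used in the cited study: it assumes that a subtitle counts as skipped when a participant makes no fixation on it while it is displayed, and the function names and input layout are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List


def skip_rate(fixation_counts: List[int]) -> float:
    """Percentage of subtitles on which a participant made no fixation.

    Assumption: fixation_counts holds one entry per subtitle, giving the
    number of fixations that landed on the subtitle while it was on screen.
    """
    if not fixation_counts:
        raise ValueError("at least one subtitle is required")
    skipped = sum(1 for n in fixation_counts if n == 0)
    return 100.0 * skipped / len(fixation_counts)


def summarise_skipping(per_participant: Dict[str, List[int]]) -> Dict[str, float]:
    """Mean and median skip rate across participants."""
    rates = [skip_rate(counts) for counts in per_participant.values()]
    return {"mean": mean(rates), "median": median(rates)}
```

Reporting both values is useful because a few participants who skip almost everything, or almost nothing, pull the mean away from the median.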

This example does not, however, represent the conventional use of subtitles where viewers would rely on the subtitles to gain access to a text from which they would have been excluded without the subtitles. It does serve to illustrate that subtitle reading is not unproblematic and that more research is needed on the nature of processing in different contexts by different audiences. For example, in a study in Poland, interlingual subtitles (English to Polish) were skipped slightly less often by hearing viewers than intralingual subtitles (Polish to Polish), possibly because hearing viewers did not need the intralingual subtitles to follow the plot (see Szarkowska et al., forthcoming).

Another important finding from eye tracking studies on the subtitle process relates to how viewers typically go about reading a subtitle. Jensema et al. (2000) found that in subtitled videos, “there appears to be a general tendency to start by looking at the middle of the screen and then moving the gaze to the beginning of a caption within a fraction of a second. Viewers read the caption and then glance at the video action after they finish reading” (2000, 284). This pattern is indeed often found, as illustrated in the sequence of frames from a short video from our study in Figure 5.

Figure 5. Sequence of typical subtitle reading – a recording of Polish news programme Fakty (TVN) with intralingual subtitles.

Some viewers, however, do not read so smoothly and tend to shift their gaze between the image and the subtitles, as demonstrated in Figure 6. The gaze shifts between the image and the subtitle, also referred to in the literature as ‘deflections’ (de Linde and Kay 1999) or ‘back-and-forth shifts’ (d’Ydewalle and De Bruycker 2007), can be regarded as an indication of the smoothness of the subtitle reading process: the fewer the gaze shifts, the more fluent the reading and vice versa.

Figure 6. Scanpath of frequent gaze shifting between text and image – a recording of Polish news programme Fakty (TVN) with intralingual subtitles.
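Counting such gaze shifts amounts to counting transitions between two regions in a sequence of fixations. The sketch below illustrates the idea under simplifying assumptions: the subtitle area is a fixed rectangle whose pixel coordinates are hypothetical, and each fixation is reduced to a single (x, y) point.

```python
from typing import List, Tuple

# Hypothetical subtitle area of interest in pixels: (left, top, right, bottom).
SUBTITLE_AOI = (0, 600, 1280, 720)


def in_subtitle_area(x: float, y: float,
                     aoi: Tuple[int, int, int, int] = SUBTITLE_AOI) -> bool:
    """True if the point lies inside the subtitle rectangle."""
    left, top, right, bottom = aoi
    return left <= x < right and top <= y < bottom


def count_gaze_shifts(fixations: List[Tuple[float, float]]) -> int:
    """Count moves between subtitle area and image across consecutive fixations."""
    regions = [in_subtitle_area(x, y) for x, y in fixations]
    return sum(1 for prev, cur in zip(regions, regions[1:]) if prev != cur)
```

A viewer who reads a subtitle in one pass and then returns to the image produces two shifts per subtitle; higher counts point to the back-and-forth pattern described above.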

An important factor that influences subtitle reading patterns is the nature of the audience. In Figure 7 an interesting difference is shown between the way a Deaf and a hard of hearing viewer watched a subtitled video. The Deaf viewer moved her gaze from the centre of the screen to read the subtitle and then, after having read the subtitle, returned the gaze to the centre of the screen. In contrast, the hard of hearing viewer made constant comparisons between the subtitles and the image, possibly relying on residual hearing and trying to support the subtitle reading process with lip-reading. Such a result was reported by Szarkowska et al. (2011), who found differences in the number of gaze shifts between the subtitles and the image in the verbatim subtitles condition, particularly discernible (and statistically significant) in the hard of hearing group (when compared to the hearing and Deaf groups).

Figure 7. Scanpaths of Deaf and hard of hearing viewers. Left: Gaze plot illustrating the viewing pattern of a Deaf participant watching a clip with verbatim subtitles.  Right: Gaze plot illustrating the viewing pattern of a hard of hearing participant watching a clip with verbatim subtitles.

These provisional qualitative indications of differences between eye movements of users with different profiles require more in-depth quantitative investigation and the subsequent section will provide a few steps in this direction.

As mentioned above, subtitle reading patterns largely depend on the type of viewers. Fluent readers have been found to have no difficulty following subtitles. Diao et al. (2007), for example, found a direct correlation between the impact of subtitles on learning and the academic and literacy levels of participants. Similarly, given that “hearing status and literacy tend to covary” (Burnham et al. 2008, 392), some previous studies found important differences in the way hearing and hearing-impaired people watch subtitled programmes. Robson (2004, 21) notes that “regardless of their intelligence, if English is their second language (after sign language), they [i.e. Deaf people] cannot be expected to have the same comprehension levels as hearing people who grew up exposed to English”. This is indeed confirmed by Szarkowska et al. (forthcoming) who report that Deaf and hard of hearing viewers in their study made more fixations on subtitles and that their dwell time on the subtitles was longer compared to hearing viewers. This result may indicate a larger effort needed to process subtitled content and more difficulty in extracting information (see Holmqvist et al. 2011, 387-388). This, in turn, may stem from the fact that for some Deaf people the language in the subtitles is not their mother tongue (their L1 being sign language). At the same time, for hearing-impaired viewers, subtitles provide an important source of information on the words spoken in the audiovisual text as well as other information contained in the audio track, which in itself explains the fact that they would spend more time looking at the subtitles.

Viewer-external factors that impact on subtitle processing

The ‘smoothness’ of the subtitle reading process depends on a number of factors, including the nature of the audiovisual material as well as technical and graphical aspects of subtitles themselves. At a general level, genre has an impact on both the role of subtitles in the total viewing experience, and on the way viewers process the subtitles. For example, d’Ydewalle and Van Rensbergen (1989) found that children in Grade 2 paid less attention to subtitles if a film involved a lot of action (see d’Ydewalle & De Bruycker 2007 for a discussion). One reason for this could simply be that action films tend to have less dialogue in the first place. A second and more significant reason could be that the pace of the visual editing and the use of special effects create a stronger visual element which shifts the balance of content towards the action (visual content) and away from dialogue (soundtrack and therefore subtitles). This, however, is an area that has to be investigated empirically. At a more specific level, technical characteristics of an audiovisual text such as film editing have an impact on the processing of subtitles.

1 Film editing

Film editing has a strong influence on the way people read subtitles, even beyond the difference in editing pace as a result of genre (for example, action and experimental films could typically be said to have a higher editing pace than dramas and documentaries). In terms of audience perception, viewers have been found to be unaware of standard film editing techniques (such as continuity editing) and are thus able to perceive film as a continuous whole in spite of numerous cuts – the phenomenon termed “edit blindness” (Smith & Henderson, 2008, 2). With more erratic and fast-paced editing, it stands to reason that the cognitive demands will increase as viewers have to work harder to sustain the illusion of a continuous whole.

When subtitles clash with editing such as cuts (i.e. if subtitles stay on screen over a shot or scene change), conventional wisdom as passed on by generations of subtitling guides (see Díaz Cintas & Remael 2007, ITC Guidance on Standards for Subtitling 1999) suggests that the viewer will assume that the subtitle has changed with the image and as a consequence they will re-read it (see above). However, Krejtz et al. (2013) reported that subtitles displayed over shot changes are more likely to cause perceptual confusion by making viewers shift their gaze between the subtitle and the rest of the image more frequently than subtitles which do not cross film cuts (cf. de Linde and Kay 1999). As such, the cognitive load is bound to increase.

2 Text chunking and line segmentation

Another conventional wisdom, perpetuated in subtitling guidelines and standards, is that poor line segmentation will result in less efficient processing (see Díaz Cintas & Remael 2007, Karamitroglou 1998). In other words, subtitles should be chunked per line and between subtitles in terms of self-contained semantic units. The line of dialogue: “He told me that he would meet me at the red mailbox” should therefore be segmented in something like the following ways:

He told me he would meet me
at the red mailbox.


He told me
he would meet me at the red mailbox.

Neither of the following segmentations would be optimal because the prepositional phrase ‘at the red mailbox’ and the verb phrase ‘he would meet me’, respectively, are split, which is considered an error:

He told me he would meet me at the
red mailbox

He told me he
would meet me at the red mailbox.
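The guideline illustrated by these examples can be approximated with a crude check on the word that ends each non-final line. This is only a sketch: the word list is hypothetical and deliberately small, and a real implementation would need syntactic parsing to identify phrase boundaries rather than a list of function words.

```python
from typing import List

# Hypothetical list of words that should not end a subtitle line because they
# open a phrase that continues on the next line: articles, prepositions,
# conjunctions and subject pronouns.
WEAK_LINE_ENDINGS = {
    "a", "an", "the", "at", "in", "on", "of", "to", "with",
    "and", "or", "but", "that", "i", "he", "she", "we", "they",
}


def ends_on_weak_word(line: str) -> bool:
    """True if the last word of the line is in the weak-ending list."""
    words = line.strip().rstrip(".,!?").split()
    return bool(words) and words[-1].lower() in WEAK_LINE_ENDINGS


def is_plausible_segmentation(lines: List[str]) -> bool:
    """True if no line except the last ends on a weak word."""
    return not any(ends_on_weak_word(line) for line in lines[:-1])
```

Applied to the four examples above, the check accepts the first two segmentations and rejects the last two, because their first lines end on ‘the’ and ‘he’ respectively.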

However, Perego et al. (2010) found that poor line segmentation in two-line subtitles did not affect subtitle comprehension negatively. They also investigated 28 subtitles viewed by 16 participants using a threshold line between the subtitle region and the upper part of the screen, or main film zone, but did not find a statistically significant difference between the well-segmented and ill-segmented subtitles in terms of fixation counts, total fixation time, or number of shifts between subtitle region and upper area. The only statistically significant difference they found was between the mean fixation duration within the subtitle area between the two conditions, with the mean fixation duration in the ill-segmented subtitles being on average 12ms longer than in the well-segmented subtitles. Although the authors downplay the importance of this difference on the grounds that the difference is so small, it does seem to indicate at least a slightly higher cognitive load when the subtitles are ill-segmented. The small number of subtitles and participants, however, makes it difficult to generalize from their results, again a result of the fact that it is difficult to extract reading statistics for subtitles unless the reading behaviour can be quantified over longer audiovisual texts.

In a study conducted a few years later, Rajendran et al. (2013) found that “chunking improves the viewing experience by reducing the amount of time spent on reading subtitles” (2013, 5). This study compared conditions different from those investigated in the previous study, excluding the ill-segmented condition of Perego et al. (2010), and focused mostly on live subtitling with respeaking. In the earlier study, which focused on pre-recorded subtitling, the subtitles in the two conditions were essentially still part of one sense unit that appeared as one two-line subtitle. In the later study, the conditions were chunked by phrase (similar to the well-segmented condition of the earlier study but with phrases appearing one by one on one line), no segmentation (where the subtitle area was filled with as much text as possible with no attempt at segmentation), word by word (where words appeared one by one) and chunked by sentence (where the sentences showed up one by one). Regardless of the fact that this later study therefore essentially investigated different conditions, they did find that the most disruptive condition was where the subtitle appeared word by word – eliciting more gaze points (defined less strictly than in fixation algorithms used by commercial eye trackers) and more “saccadic crossovers” or switches between image and subtitle area. However, in this study by Rajendran et al. (2013), the videos were extremely short (under a minute), and the sound was muted, hampering the ecological validity of the material, and once again making the findings less suitable to generalization.

Although both these studies have limitations in terms of generalizability, they both provide some indication that segmentation has an impact on subtitle processing. Future studies will nonetheless have to investigate this aspect over longer videos to determine whether the graphical appearance, and particularly the segmentation of subtitles, has a detrimental effect on subtitle processing in terms of cognitive load and effectiveness.

3 Language

The language of subtitles has received considerable attention from psycholinguists in the context of subtitle reading. D’Ydewalle and de Bruycker (2007) examined eye movement behaviour of people reading standard interlingual subtitles (with the audio track in a foreign language and subtitles in their native language) and reversed subtitles (with the audio in their mother tongue and subtitles in a foreign language). They found more regular reading patterns in the standard interlingual subtitling condition, with the reversed subtitling condition having more subtitles skipped, fewer fixations per subtitle, etc. (see also d’Ydewalle and de Bruycker 2003 and Pavakanun 1993). This is an interesting finding in itself, as it is the reversed subtitling that has been found to be particularly conducive to foreign language learning (see Díaz Cintas and Fernández Cruz 2008, and Vanderplank 1988).

Szarkowska et al. (forthcoming) examined differences in reading patterns of intralingual (Polish to Polish) and interlingual (English to Polish) subtitles among a group of Deaf, hard of hearing and hearing viewers. They found no differences in reading for the Deaf and hard of hearing audiences, but hearing people made significantly more fixations to subtitles when watching English clips with interlingual Polish subtitles than Polish clips with intralingual Polish subtitles. This confirms that the hearing viewers processed the subtitles to a significantly lower degree when they were redundant, as in the case of intralingual transcriptions of the soundtrack. What would be interesting to investigate in this context is those instances when the hearing audience did in fact read the subtitles, to determine to what extent and under what circumstances the redundant written information is used by viewers to support their auditory intake of information.

In a study on the influence of translation strategies on subtitle reading, Ghia (2012) investigated the differences in the processing of literal vs. non-literal translations into Italian of an English film clip (6 minutes) when watched by Italian EFL learners. According to Ghia, just as subtitle format, layout, and segmentation have the potential to affect visual and perceptual dynamics, the relationship translation establishes with the original text means that “subtitle translation is also likely to influence the perception of the audiovisual product and viewers’ general reading patterns” (2012, 175). Ghia particularly wanted to investigate the processing of different translation strategies in the presence of sound and image with the subtitles. In her study she found that the non-literal translations (where the target text diverged from the source text) resulted in more deflections between text and image. This is similar to the findings of Rajendran et al. (2013) in terms of less fluent graphics in word-by-word subtitles.

As can be seen from the above, the aspect of language processing in the context of subtitled audiovisual texts has received some attention, but has not to date been approached in any comprehensive manner. In particular, there is a need for more psycholinguistic studies to determine how subtitle reading differs from the reading of static text, and how this knowledge can be applied to the practice of subtitling.

Measuring subtitle processing

1 Attention distribution and presentation speed

In the study by Jensema et al. (2000), subjects spent on average 84% of the time looking at subtitles, 14% at the video picture and 2% outside of the frame. The study represents an important early attempt to identify reading patterns in subtitle reading, but it has considerable limitations. The study had only six participants, three deaf and three hearing, and the video clips were extremely short (around 11 seconds each), presented with English subtitles (in upper case) without sound. The absence of a soundtrack therefore affected the time spent on the subtitles. In Perego et al.’s study (2010), the ratio is reported as 67% on the subtitle area and 33% on the image. In this study there were 41 Italian participants who watched a 15-minute clip with Hungarian soundtrack and subtitles in Italian. As in the previous study, the audience therefore had to rely heavily on the subtitles in order to follow the dialogue. Kruger et al. (2014), in the context of intralingual subtitles in a Psychology lecture in English, found a ratio of 43% on subtitles, 43% on the speaker and slides and 14% on the rest of the screen. When the same lecture was subtitled into Sesotho, the ratio changed to 20% on the subtitles, 66% on the speaker and slides, and 14% on the rest of the screen. This wide range is an indication of the difference in the distribution of visual attention in different contexts with different language combinations, different levels of redundancy of information, and differences in audiences.
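Ratios of this kind are typically obtained by summing fixation time per area of interest and dividing by the total. The following minimal sketch shows that calculation. It is not the method of any of the studies cited, and the area labels and input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def attention_distribution(
    fixations: Iterable[Tuple[str, float]]
) -> Dict[str, float]:
    """Share of total fixation time per area of interest, in percent.

    Assumption: each fixation has already been assigned to an area label
    and is given as (label, duration_ms).
    """
    totals: Dict[str, float] = defaultdict(float)
    for label, duration_ms in fixations:
        totals[label] += duration_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * t / grand_total for label, t in totals.items()}
```

Note that the result depends on how the areas are drawn, which is one reason why percentages from different studies are hard to compare directly.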

In order to account for “the audiovisual nature of subtitled programmes”, Romero-Fresco (in press) puts forward the notion of ‘viewing speed’ – as opposed to reading speed and subtitling speed – which he defines as “the speed at which a given viewer watches a piece of audiovisual material, which in the case of subtitling includes accessing the subtitle, the accompanying images and the sound, if available”. The perception of subtitled programmes is therefore a result of not only the subtitle reading patterns, but also the visual elements of the film. Based on the analysis of over seventy-one thousand subtitles created in the course of the Digital Television for All project, Romero-Fresco provides the following data on the viewing speed, reflecting the proportion of time spent by viewers looking at subtitles and at the images, proportional to the subtitle presentation rates (see Table 1).

Viewing speed | Time on subtitles | Time on images
120 wpm | ±40% | ±60%
150 wpm | ±50% | ±50%
180 wpm | ±60%-70% | ±40%-30%
200 wpm | ±80% | ±20%

Table 1. Viewing speed and distribution of gaze between subtitles and images (Romero-Fresco) 

Jensema et al. also suggested that the subtitle presentation rate may have an influence on the time spent reading subtitles vs. watching the rest of the image: “higher captioning speed results in more time spent reading captions on a video segment” (2000, 275). This was later confirmed by Szarkowska et al. (2011), who found that viewers spent more time on verbatim subtitles displayed at higher presentation rates compared to edited subtitles displayed with low reading speed, as illustrated by Figure 8.

Figure 8. Fixation-count based heatmaps illustrating changes in attention allocation of hearing and Deaf viewers watching videos subtitled at different rates.

2 Mean fixation duration

Irwin (2004, 94) states that “fixation location corresponds to the spatial locus of cognitive processing and that fixation or gaze duration corresponds to the duration of cognitive processing of the material located at fixation”. Within the same activity (e.g. reading), longer mean fixation durations could therefore be said to reflect more cognitive processing and higher cognitive load. One would therefore expect viewers to have longer fixations when the subject matter is more difficult, or when the language is more specialized. Across activities, however, comparisons of fixation duration are less meaningful, as reading elicits more, and shorter, fixations than scene perception or visual scanning simply because of the nature of the activities. It is therefore essential in eye tracking studies of subtitle reading to distinguish between the actual subtitles when they are on screen, the rest of the screen, and the subtitle area when there is no text (between successive subtitles).
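The three-way distinction drawn here can be made explicit when fixations are aggregated. The sketch below is one possible way of doing so, under the assumption that subtitle display times are available as on/off intervals. The category names and data layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

Fixation = Tuple[float, float, bool]  # (start_ms, duration_ms, inside_subtitle_area)
Interval = Tuple[float, float]        # (subtitle_on_ms, subtitle_off_ms)


def classify(fix: Fixation, subtitle_intervals: List[Interval]) -> str:
    """Assign a fixation to one of three categories."""
    start, _, inside = fix
    if not inside:
        return "image"
    # Inside the subtitle area: check whether a subtitle was actually displayed.
    visible = any(on <= start < off for on, off in subtitle_intervals)
    return "subtitle_text" if visible else "empty_subtitle_area"


def mean_fixation_durations(
    fixations: List[Fixation], subtitle_intervals: List[Interval]
) -> Dict[str, float]:
    """Mean fixation duration in ms, computed separately per category."""
    groups: Dict[str, List[float]] = {}
    for fix in fixations:
        groups.setdefault(classify(fix, subtitle_intervals), []).append(fix[1])
    return {category: mean(durations) for category, durations in groups.items()}
```

Keeping the empty subtitle area separate prevents fixations made between subtitles from being counted as reading.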

The difference between reading and scene perception is illustrated in Figure 9, demonstrating that fixations on the image tend to be longer (indicated here by a bigger circle) than those on subtitles (which indicates more focused viewing), and more exploratory in nature (see the distinction between focal and ambient fixations in Velichkovsky et al. 2005).

Figure 9. Differences in fixation durations between the image and subtitle text – from Polish TV series Londyńczycy.

Rayner (1984) indicated the impact of different tasks on mean fixation durations, as reflected in Table 2 below:

Task | Mean fixation duration (ms) | Mean saccade size (degrees)
Silent reading | 225 | 2 (about 8 letters)
Oral reading | 275 | 1.5 (about 6 letters)
Visual search | 275 | 3
Scene perception | 330 | 4
Music reading | 375 | 1
Typing | 400 | 1 (about 4 letters)

Table 2. Approximate Mean Fixation Duration and Saccade Length in Reading, Visual Search, Scene Perception, Music Reading, and Typing[1]

In subtitling, silent reading is accompanied by simultaneous processing of the same information in the soundtrack (in the same or another language) as well as of other sounds and visual signs (for a hearing audience, that is – for a Deaf audience, it would be text and visual signs). The difference in mean fixation duration in these different tasks therefore reflects the difference in cognitive load. In silent reading of static text, there is no external competition for cognitive resources. When reading out loud, the speaker/reader inevitably monitors his/her own reading, introducing additional cognitive load. As the nature of the sign becomes more abstract, the load and the fixation duration increase, and in the case of typing, different processing, production and checking activities are performed simultaneously, resulting in even higher cognitive load. This is inevitably an oversimplification of cognitive load, and indeed the nature of information acquisition between reading successive groups of letters (words) in a linear fashion is significantly different from that of scanning a visual scene for cues.

Undoubtedly, subtitle reading imposes different cognitive demands, and these demands are also very much dependent on the audience. In an extensive study on the differences in subtitle reading between Deaf, hard of hearing and hearing participants, we found a high degree of variation in mean fixation duration between the groups, and also a difference between the mean fixation duration in the Deaf and the hard of hearing groups between subtitles presented at 12 characters per second and 15 characters per second (see Szarkowska et al. forthcoming).

Group | 12 characters per second | 15 characters per second
Deaf | 241.93 ms | 232.82 ms
Hard of hearing | 218.51 ms | 214.78 ms
Hearing | 186.66 ms | 186.58 ms

Table 3. Differences in reading subtitles presented at different rates

Statistical analyses performed on the three groups with mean fixation duration as a dependent variable and groups and speed as categorical factors produced a statistically significant main effect, further confirmed by subsequent t-tests that yielded statistically significant differences in mean fixation duration for both subtitling speeds between all three groups. The difference within the Deaf and hard of hearing groups was also significant between 12cps and 15cps. What this suggests is that reading speed has a more pronounced effect on Deaf and hard of hearing viewers than on hearing ones.
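The size of the within-group differences can be read directly off the values in Table 3. The short sketch below only performs that subtraction. It is descriptive arithmetic on the reported means and does not reproduce the significance tests, which require the underlying per-participant data.

```python
from typing import Dict, Tuple

# Mean fixation durations in ms from Table 3: (at 12 cps, at 15 cps).
MEAN_FIXATION_MS: Dict[str, Tuple[float, float]] = {
    "Deaf": (241.93, 232.82),
    "Hard of hearing": (218.51, 214.78),
    "Hearing": (186.66, 186.58),
}


def difference_12_vs_15(table: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Mean fixation duration at 12 cps minus that at 15 cps, per group."""
    return {group: round(at12 - at15, 2) for group, (at12, at15) in table.items()}
```

For the reported means this gives 9.11 ms for the Deaf group, 3.73 ms for the hard of hearing group and 0.08 ms for the hearing group, which is in line with the observation that presentation rate matters more for the first two groups.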

3 Subtitle reading

As indicated at the outset, one of the biggest hurdles in studying the processing of subtitles is the fact that the subtitles appear as image on image rather than text on image as far as eye tracking analysis software is concerned. Whereas reading statistics software can therefore automatically mark words as areas of interest in static texts, and then calculate number of regressions, refixations, saccade length, fixation duration and count as related to the specific words, this process has to be done manually for subtitles. The fact that it is virtually impossible to create similar areas of interest on the subtitle words that are embedded in the image over large numbers of subtitles makes it very difficult to obtain reliable eye tracking results on subtitles as text. This explains the predominance of measures such as fixation count and fixation duration as well as shifts between subtitle area and image in eye tracking studies on subtitle processing. As a result, many of these studies do not distinguish directly between looking at the subtitle area and reading the subtitles, and “they tend to define crude areas of interest (AOIs), such as the entire subtitle area, which means that eye movement data are also collected for the subtitle area when there are no subtitles on screen, which further skews the data” (Kruger and Steyn, 2014, 109).

Although a handful of studies come closer to studying subtitle reading by going beyond the study of fixation counts, mean fixation duration, and shifts between subtitle area and image area, most studies tend to focus on amount of attention rather than nature of attention. Briefly, the exceptions can be identified in the following studies: Specker (2008) looks at consecutive fixations; Perego et al. (2010) add the path length (sum of saccade lengths in pixels) to the more conventional measures; Rajendran et al. (2013) add the proportion of gaze points; Ghia (2012) looks at fixations on specific words as well as regressions; Bisson et al. (2012) look at the number of subtitles skipped, and proportion of successive fixations (number of successive fixations divided by total number of fixations); and in one of the most comprehensive studies on the subject of subtitle processing, d’Ydewalle and De Bruycker (2007) look at attention allocation (percentage of skipped subtitles, latency time, and percentage of time spent in the subtitle area), fixations (number, duration, and word-fixation probability), and saccades (saccade amplitude, percentage of regressive eye movements, and number of back-and-forth shifts between visual image and subtitle).

In a recent study, Kruger and Steyn (2014) provide a reading index for dynamic texts (RIDT) designed specifically to measure the degree of reading that takes place when subtitled material is viewed. This index is explained as “a product of the number of unique fixations per standard word in any given subtitle by each individual viewer and the average forward saccade length of the viewer on this subtitle per length of the standard word in the text as a whole” (2014, 110). Taking the location and start time of successive fixations within the subtitle area when a subtitle is present as the point of departure, the number of unique fixations (i.e. excluding refixations, and fixations following a regression) is determined, as well as the average length of forward saccades in the subtitle. This information gives an indication of the meaningful processing of the words in the subtitle when the number of fixations per word, as well as the length of saccades as ratio of the length of the average word in the audiovisual text are calculated. Essentially, the formula quantifies the reading of a particular subtitle by a particular participant by measuring the eye movement during subtitle reading against what is known about eye movements during reading and perceptual span.

In a little more detail, the formula can be written as follows for video v, with participant p viewing subtitle s:


(Kruger and Steyn, 2014, 110).
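The verbal definition quoted above can be turned into a small calculation. The sketch below follows that wording only and should not be read as the authors' published formula or implementation. It assumes that the number of standard words in the subtitle and the standard word length have been determined beforehand, and that saccade length and standard word length are expressed in the same unit. All parameter names are hypothetical.

```python
def reading_index(
    unique_fixations: int,
    standard_words_in_subtitle: float,
    mean_forward_saccade_length: float,
    standard_word_length: float,
) -> float:
    """Product of two ratios, following the verbal definition of the RIDT.

    First ratio: unique fixations per standard word in the subtitle.
    Second ratio: average forward saccade length per standard word length.
    Both lengths must use the same unit (for example characters or pixels).
    """
    if standard_words_in_subtitle <= 0 or standard_word_length <= 0:
        raise ValueError("word count and word length must be positive")
    fixations_per_word = unique_fixations / standard_words_in_subtitle
    saccade_per_word_length = mean_forward_saccade_length / standard_word_length
    return fixations_per_word * saccade_per_word_length
```

For instance, six unique fixations on a subtitle of eight standard words, with forward saccades that on average span exactly one standard word, give a value of 0.75, while a subtitle that receives no fixations gives 0.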

This index was validated by comparing it with a manual inspection of the reading of 145 subtitles by 17 participants, and makes it possible to study the reading of subtitles over extended texts. In their study, Kruger and Steyn (2014) use the index to determine the relationship between subtitle reading and performance in an academic context, finding a significant positive correlation between the degree to which participants read the subtitles and their performance in a test written after watching subtitled lectures. The RIDT therefore presents a robust index of the degree to which subtitles are processed over extended texts, and could add significant value to psycholinguistic studies on subtitles. Using the index, previous claims that subtitles have a positive or negative impact on comprehension, vocabulary acquisition, language learning or other dependent variables can be correlated with whether or not viewers actually read the subtitles, and to what extent the subtitles were read.


From this overview of studies investigating the processing of subtitles on the moving image it should be clear that much still needs to be done to gain a better understanding of the impact of various independent variables on subtitle processing. The complexity of the multimodal text, and in particular the competition between different sources of information, means that a subtitled audiovisual text is a substantially altered product from a cognitive perspective. Much progress has been made in coming to grips with the way different viewers behave when looking at subtitled audiovisual texts, but there are still more questions than answers – relating, for instance, to differences in how people process subtitled content on various devices (cf. the HBBTV4ALL project). The use of physiological measures like eye tracking and EEG (see Kruger et al. 2014) in combination with subjective measures like post-report questionnaires is, however, continually bringing us closer to understanding the impact of audiovisual translation like subtitling on the experience and processing of audiovisual texts.



This study was partially supported by research grant No. IP2011 053471 “Subtitling for the deaf and hard of hearing on digital television” from the Polish Ministry of Science and Higher Education for the years 2011–2014.



[1] Values are taken from a number of sources and vary depending on a number of factors (see Rayner, 1984)



Jan-Louis Kruger is director of translation and interpreting in the Department of Linguistics at Macquarie University in Sydney, Australia.  He holds a PhD in English on the translation of narrative point of view. His main research interests include studies on the reception and cognitive processing of audiovisual translation products including aspects such as cognitive load, comprehension, attention allocation, and psychological immersion.

Agnieszka Szarkowska, PhD, is Assistant Professor in the Institute of Applied Linguistics at the University of Warsaw, Poland. She is the founder and head of the Audiovisual Translation Lab, a research group working on media accessibility. Her main research interests lie in audiovisual translation, especially subtitling for the deaf and the hard of hearing and audio description.

Izabela Krejtz, PhD, is Assistant Professor at the University of Social Sciences and Humanities, Warsaw. She is a co-founder of the Eyetracking Research Center at USSH. Her research interests include neurocognitive and educational psychology. Her applied work focuses on pro-positive training of attention control, eye tracking studies in perception of audiovisual material and emotion regulation.