Volume 26


  1. “Children should play with dead things”: transforming Frankenstein in Tim Burton’s Frankenweenie –  Erin Hawley
  2. “You gave me no choice”: A queer reading of Mordred’s journey to villainy and struggle for identity in BBC’s Merlin  –  Joseph Brennan
  3. Days of YouTube-ing Days of Heaven: Participatory Culture and the Fan Trailer  –  Kyle R. McDaniel
  4. When a Good Girl Goes to War: Claire Adams Mackinnon and Her Service During World War I  –  Heather L. Robinson
  5. ‘Rock’n’roll’s evil doll’: the Female Popular Music Genre of Barbie Rock  –  Rock Chugg
  6. Morality, Mortality and Materialism: an Art Historian Watches Mad Men – Catherine Wilkins
  7. Playing At Work  –  Samuel Tobin
  8. 1970s Disaster Films: The Star In Jeopardy  –  Nathan Smith



How We Came To Eye Tracking Animation: A Cross-Disciplinary Approach to Researching the Moving Image – Craig Batty, Claire Perkins, & Jodi Sita


In this article, three researchers from a large cross-disciplinary team reflect on their individual experiences of a pilot study in the field of eye tracking and the moving image. The study – now concluded – employed a montage sequence from the Pixar film Up (2009) to determine the impact of narrative cues on gaze behaviour. In the study, the researchers’ interest in narrative was underpinned by a broader concern with the interaction of top-down (cognitive) and bottom-up (salient) factors in directing viewers’ eye movements. This article provides three distinct but interconnected reflections on what the aims, process and results of the pilot study demonstrate about how eye tracking the moving image can expand methods and knowledge across the three disciplines of screenwriting, screen theory and eye tracking. It is in this way both an article about eye tracking, animation and narrative, and also a broader consideration of cross-disciplinary research methodologies.



Over the past 18 months, a team of cross-disciplinary researchers has undertaken a pilot study in eye tracking and the moving image that has sought to understand where spectators look when viewing animation.[i] The original study employed eye tracking methods to record the gaze of 12 subjects. It used a Tobii X120 (Tobii Technology, 2005) remote eye tracking device, which allowed viewers to watch the animation sequence on a widescreen PC monitor at 25 frames per second, with sound. The eye tracker pairs the movements of the eye over the screen with the stimuli being viewed by the participant. For each scene viewed, the researchers selected areas of interest; for these areas, all of the gaze data, including the number and duration of each fixation, was collected and analysed.
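The area-of-interest analysis described above can be sketched in a few lines of code. The sketch below is purely illustrative: the fixation records, coordinates and AOI names are hypothetical, not the Tobii export format, but the logic (count fixations landing inside each AOI and sum their durations as dwell time) is the standard approach.

```python
# Illustrative sketch of an area-of-interest (AOI) analysis.
# Fixation records and AOI names are hypothetical examples, not Tobii data.

def aoi_stats(fixations, aois):
    """For each rectangular AOI, count the fixations landing inside it
    and sum their durations (dwell time, in milliseconds)."""
    stats = {name: {"count": 0, "dwell_ms": 0} for name in aois}
    for fx in fixations:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= fx["x"] <= x1 and y0 <= fx["y"] <= y1:
                stats[name]["count"] += 1
                stats[name]["dwell_ms"] += fx["duration_ms"]
    return stats

# Hypothetical fixations: two on a 'savings jar' AOI, one elsewhere.
fixations = [
    {"x": 512, "y": 300, "duration_ms": 220},
    {"x": 530, "y": 310, "duration_ms": 180},
    {"x": 100, "y": 80,  "duration_ms": 250},
]
aois = {"savings_jar": (480, 260, 600, 360)}

print(aoi_stats(fixations, aois))
# {'savings_jar': {'count': 2, 'dwell_ms': 400}}
```

In practice the eye tracking software computes these statistics itself; the sketch simply makes explicit what 'number and duration of each fixation per area of interest' means as a computation.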

Using a well-known montage sequence from the Pixar film Up (2009), this pilot study focussed on narrative, with the aim of discerning whether story cues were instrumental in directing spectator gaze. Focussing on narrative seemed useful: as well as being an original line of enquiry in the eye tracking context, it offered a natural connection between each of our disciplines and research experiences. The study did not take into account emotional and physiological responses from its participants as a way of discerning their narrative comprehension. Nevertheless, what we found from our data was that characters (especially their faces), key (narrative) objects and visual/scenic repetition seemed to be core factors in determining where viewers looked.[ii]

In the context of a montage sequence that spans around 60 years of story time, in which the death of the protagonist’s wife sets up the physical and emotional stakes of the rest of the film, it was clear that narrative meaning relating to a character’s journey/arc is important to viewers, more so (in this study) than peripheral action or visual style, for example. With regards to animation specifically, a form ‘particularly equipped to play out narratives that solicit […] emotions because of its capacity to illustrate and enhance interior states, and to express feeling that is beyond the realms of words to properly capture’ (Wells, 2007: 127), the highly controlled nature of the sequence from which the data was drawn suggests that animation fully embraces narrative techniques in order to control viewer attention.

In this article, three researchers from the team – A, a screenwriter, B, a screen scholar and C, an eye tracking neuroscientist – discuss the approaches they took to conducting this study. Each of us came to the project armed with different expertise, different priorities and a different set of expectations for what we might find out, which we could then take back to our individual disciplines. In this article, then, we purposely use three voices as a way of teasing out our understandings before, during and after the study, with the aim of better understanding the potential for cross-disciplinary research in this area. Although other studies in eye tracking and the moving image have been undertaken and reported on, we suggest that using animation with a strongly directed narrative as a test study provides new information. Furthermore, few other studies to date have brought together traditional and creative practice researchers in this way.

What we present, then, is a series of interconnected discussions that draw together ideas from each researcher’s community of thought and practice, guided by the overriding question: how did this study embrace methodological originality and yield innovative findings that might be important to the disciplines of eye tracking and moving image studies? We present these discussions in the format of individual reflections, as a way of highlighting each researcher’s contributions to the study, and in the hope that others will see the potential of disciplinary knowledge in a study such as this one.

How ‘looking’ features in our disciplines, and what we might expect to ‘see’

Researcher A: ‘Looking’ in screenwriting means two things: seeing and reflecting on. By this I mean that a viewer looks at the screen to see what is happening, whilst at the same time reflecting on what they are looking at on a personal, cultural and/or political level. Some screenwriters focus on theme from the outset: on what they want their work to ‘say’ (see Batty, 2013); some screenwriters focus on plot: on what viewers will see (action) (see Vogler, 2007). What connects these is character. In Aristotelian terms, a character does and therefore is (Aristotle, 1996); for Egri, a character is and therefore does (Egri, 2004). The link here is that what we see on the screen (action) is always performed by a character, meaning that through a process of agency, actions are given meaning, feeding into the controlling theme(s) of the text. In this way, looking at – or seeing – is tied closely to understanding and the feelings that we bring to a text. As Hockley (2007) says, viewers are sutured into the text on an emotional level, connecting them and the text through the psychology of story space.

What we ‘see’, then, is meaning. In other words, we do not just see but we also feel. We look for visual cues that help us to understand the narrative unfolding before our eyes. With sound used to point to particular visual aspects and heighten our emotional states, we invest energy and emotion in the visuality of the screen, in the hope that we will arrive at an understanding. As this study has revealed, examples include symbolic objects in the frame (the adventure book; the savings jar; the picture of Paradise Falls) that have narrative value in screenwriting because of the meaning they possess (Batty and Waldeback, 2008: 52-3). By seeing these objects repeated throughout the montage, we understand what they mean (to the characters and to the story) and glean a sense of how they will re-appear throughout the rest of the film as a way of representing the emotional space of the story.

Landscape is also something we see, though this is always in the context of the story world (see Harper and Rayner, 2010; Stadler, 2010). In other words, where is this place? What happens here? What cannot happen here? Characters belong to a story world, and therefore landscape also helps us to understand the situations in which we find them. This, again, draws us back to action, agency and theme: when we see landscape, we are in fact understanding why the screenwriter chose to put their characters – and us, the audience – there in the first place.

Researcher B: In screen theory, looking is never just looking – never innocent and immediate. The act of looking is the gateway to the experience and knowledge of what is seen on screen, but also of how that encounter reflects the world beyond the screen and our place within it. Looking is overdetermined as gazing, knowing and being, endlessly charged by the coincidence of eye and I and of real and reel. Psychoanalytic theory imagines the screen as mirror and our identity as a spectatorial effect of recognizing ourselves in the characters and situations that unfold upon it, however refracted. Reception studies, conversely, seeks out how real individuals encounter content on screen, and how meaning sparks in that meeting, invented anew with every pair of eyes. Television studies emerges from an understanding of a fundamental schism in looking: where the cinematic apparatus enables a gaze, the televisual counterpart can (traditionally) only produce a broken and distracted glance.

All of these theories begin with the act of looking, and are enabled by it in their metaphors, methods and practices. But in no instance is looking attended to as anatomical vision – the process of the “meat and bones” body and brain rather than the metaphysical consciousness. As a scholar of screen theory, my base interest in eye tracking comes down to this “problem”. Is it a problem? Should the biology and theory of looking align? What effects and contradictions arise when they are brought together?

Phenomenological screen theory is a key and complex pathway into this debate, as an approach that values embodied experience, but discredits the ocular—seeking to bring the whole body to spectatorship rather than privilege the centred and distant subject of optical visuality (Marks, 2002: xvi). Vivian Sobchack names film ‘an expression of experience by experience … an act of seeing that makes itself seen, an act of hearing that makes itself heard’ (Sobchack, 1992: 3). Eye tracking shows us the act of seeing – the raw fixations and movements with which screen content is taken in. In the study under discussion here it is this data that is of central interest, with our key questions deriving from what such material can verify about how narrative shapes gaze behaviour. A central question and challenge for me moving forward in this field, though, is to consider this process without ceding to ocularcentrism: that is, without automatically equating seeing to knowing. This ultimately means being cautious about reading gaze behaviour as ‘proof’ of what viewing subjects are thinking, feeling and understanding. This approach will be supported by the inclusion of further physiological measurements.

Researcher C: Interest in vision and how we see the world is age-old; it has commonly been held that the eyes are the windows to the mind. Where we look is therefore of great importance, as learning this offers us opportunities to understand more about where the brain wants to spend its time. Human eyes move independently from our heads, and so our eyes have developed a specialised operating system that both allows them to move around our visual environment and counteracts any movements the head may be making. This has led to a distinct set of eye movements we can study: saccades (the very fast bursts of movement that pivot our eye from focus point to focus point) and fixations (brief moments of relative stillness where our gaze stops to allow the receptors in our eye to collect visual information). In addition, only a tiny area at the back of our eyeball, the fovea on the retina, is sensitive enough to gather highly ‘acuitive’ information; thus the brain must drive the eye around precisely in order to get light to fall onto this tiny area. As such, our eye movements are an integral and essential part of our vision system.
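The distinction between saccades and fixations is what eye tracking software operationalises when it turns raw gaze samples into the fixation data analysed in this study. A common method is a dispersion-threshold (I-DT) classifier: a run of consecutive samples whose spatial spread stays within a small window counts as a fixation, and everything between fixations is treated as saccadic movement. The sketch below is a minimal, illustrative version; the thresholds and the synthetic gaze trace are assumptions, not the parameters of the Tobii system used in the study.

```python
# A minimal dispersion-threshold (I-DT) fixation detector, for illustration.
# Thresholds and sample data are hypothetical, not the values used in the study.

def detect_fixations(samples, max_dispersion=25, min_duration=3):
    """samples: list of (x, y) gaze points at a fixed sampling rate.
    A fixation is a run of >= min_duration samples whose bounding box
    stays within max_dispersion pixels (width + height). Returns a list
    of (centroid_x, centroid_y, length_in_samples) tuples."""
    fixations, i = [], 0
    while i <= len(samples) - min_duration:
        xs, ys = zip(*samples[i:i + min_duration])
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= max_dispersion:
            # Grow the window until dispersion exceeds the threshold.
            j = i + min_duration
            while j < len(samples):
                xs2, ys2 = zip(*samples[i:j + 1])
                if (max(xs2) - min(xs2)) + (max(ys2) - min(ys2)) > max_dispersion:
                    break
                j += 1
            xs3, ys3 = zip(*samples[i:j])
            fixations.append((sum(xs3) / len(xs3), sum(ys3) / len(ys3), j - i))
            i = j
        else:
            i += 1  # no fixation starts here; skip one sample (saccade)
    return fixations

# A synthetic trace: a cluster of samples, a large jump, a second cluster.
print(detect_fixations([(100, 100), (102, 101), (101, 99),
                        (300, 300), (302, 301), (301, 299)]))
# [(101.0, 100.0, 3), (301.0, 300.0, 3)]
```

The jump between the two clusters exceeds the dispersion threshold, so it is classified as a saccade and no fixation is recorded there; this mirrors the way the eye collects visual information only during the relatively still fixation periods.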

Eye movement research has seen great advances during the last 50 years, with many early questions examined in the classic work of Buswell (1935) and Yarbus (1967). One question visual scientists and neuroscientists have been, and still are, keen to explore is why we look where we do: what is it about the objects or scene that draws our visual attention? Research over the decades has found that several different aspects are involved, relating to object salience, recognition, movement and contextual value (see Schütz et al., 2011). For animations used for learning purposes, Schnotz and Lowe (2008) discussed two major factors that influence the attention-grabbing properties of the features that make up this form: visuospatial contrast and dynamic contrast. Features that are relatively large, brightly coloured or centrally placed are more likely to be fixated on than their less distinctive neighbours, and features that move or change over time draw more attention.

Eye tracking research, which is now easier than ever to conduct, allows us to delve into examining how these and other features influence us, and is a unique way to gain access to the windows of the mind. Directing this focus to learning more about how we watch films, and animation in particular, is what drove me to want to use eye tracking to better see how people experience these forms, and to delve into questions such as: what are people drawn to look at, and how might things like narrative affect the way we direct our gaze?

When looking around a visual world, our view is often full of different objects, and we tend to drive our gaze to them so we can recognize, inspect or use them. Not surprisingly, what we are doing (our task at hand) strongly affects how we direct our gaze: as we perform a task, our salience-based mechanisms seem to go offline and we fixate almost exclusively on task-relevant objects (Hayhoe, 2000; Land et al., 1999). From this, one expectation we had when considering how viewers watch animation was that aspects relating to the viewer’s understanding of the story would be a stronger drive than salient features. Faces are another well-known drawcard for visual attention, tending to draw the eye very strongly (Cerf et al., 2009; Crouzet et al., 2010); for animated films, we were interested to see if similar effects would be observed.

Finally, another strong and interesting effect is the central viewing bias: people tend to fixate in the centre of a display, and this has been shown to exert a large effect on viewing behaviour (Tatler and Vincent, 2009). As this study was based on the moving image on a screen, we were keen to compare different scenes and see how the narrative affected this tendency.
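The central viewing bias can be quantified very simply: for each scene, take the mean distance of fixations from the screen centre and normalise it by the centre-to-corner distance, so that 0.0 means all gaze at the centre and 1.0 means gaze at the corners. The function below is an illustrative sketch of such a measure (the screen resolution and fixation coordinates are assumed values), which would allow scenes to be compared for how strongly narrative pulls gaze away from the centre.

```python
import math

# Illustrative measure of central viewing bias: mean fixation distance from
# the screen centre, normalised to [0, 1]. Resolution values are assumptions.

def central_bias(fixations, width=1920, height=1080):
    """fixations: list of (x, y) fixation coordinates in pixels.
    Returns 0.0 when all fixations sit at the screen centre and
    1.0 when they sit at the corners."""
    cx, cy = width / 2, height / 2
    half_diag = math.hypot(cx, cy)  # centre-to-corner distance
    dists = [math.hypot(x - cx, y - cy) for x, y in fixations]
    return sum(dists) / len(dists) / half_diag

# A tightly centred scene scores near 0; gaze spread to the edges scores higher.
print(central_bias([(960, 540), (980, 560)]))  # close to 0.0
print(central_bias([(0, 0), (1920, 1080)]))    # 1.0
```

Comparing this score across scenes would show, for example, whether a scene rich in peripheral narrative cues draws gaze off-centre more than one whose story information sits in the middle of the frame.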

How we came to the project, and what we thought it might reveal

Researcher A: From a screenwriting perspective, I was excited to think that at last, we might have data that not only privileges the story (i.e., the screenwriter’s input), but that also highlights the minutiae of a scene that the screenwriter is likely to have influenced. This can work differently in animation from live action, where a team of story designers and animators actively shapes the narrative as the ‘script’ emerges (see Wells, 2010). Nevertheless, if we follow that what we see on screen has been imagined or at least intended by a ‘writer’ of sorts – someone who knows about the composition of screen narratives – then it was rousing to think that this study might provide ‘evidence’ to support long-standing questions (for myself at least) of writing for the screen and authorship. Screenwriters work in layers, building a screenplay from broad aspects such as plot, character and theme, to micro aspects such as scene rhythm, dialogue and visual cues. Being able to ‘prove’ what viewers are looking at, and hoping that this might correlate with a screenwriting perspective of scene composition, was very appealing to me.

I was also interested in what other aspects of the screen viewers might look at, either as glances or as gazes. In some genres of screenwriting, such as comedy, much of the clever work comes around the edges: background characters; ironic landscapes; peripheral visual gags, etc. From a screenwriting perspective, then, it was exciting to think that we might find ways to trace who looks at what, and if indeed the texture of a screenplay is acknowledged by the viewer. The study would be limited and not all aspects could be explored, but as a general method for screen analysis, simply having ideas about what might be revealed led to some very interesting discussions within the team.

Researcher B: All screen theories rest upon a fundamental assumption that different types of content, and different viewing situations, produce different viewing behaviours and effects. Laura Mulvey’s famous theory of the gaze stipulates that classical Hollywood cinema and the traditional exhibition environment (dark cinema, large screen, audience silence) position men as bearers of the look and women as objects of the look, and that avant-garde cinemas avoid this configuration (Mulvey, 1975). New theories of digital cinema speculate upon whether a spectator’s identification with an image is altered when it bears no indexical connection to reality; that is, when the image is a simulated collection of pixels rather than the trace of an event that once took place before a camera (Rodowick, 2007). The phenomenological film theory of Laura Marks suggests that certain kinds of video and multimedia work can engender haptic visuality, where the eyes function like ‘organs of touch’ and the viewer’s body is more obviously involved in the process of seeing than is the case with optical visuality (Marks, 2002: 2-3). It made sense to begin our study into eye tracking by thinking about these different assumptions regarding content and context and formulating methods to analyse them empirically.

For our first project we chose to focus on an assumption regarding spectatorship that is more straightforward and essential than any listed above: namely that viewers can follow a story told only in images. This is an assumption that underpins the ubiquitous presence of the montage sequence in narrative filmmaking, where a large amount of story information is presented in a short, dialogue-free sequence. We hypothesized that by tracking a montage sequence we would be able to ascertain if and how viewers looked at narrative cues, even when these are not the most salient (i.e., large, colourful, moving) features in the scene. The study was in this way designed to start investigating how much film directors and designers can control subjects’ gaze behaviour and top-down (cognitively driven) processes.

The sequence from Up was chosen in part to act as a ‘control’ against which we could later assess different types of content. The story told in the 4-minute sequence is complex but unambiguous, with its events and emotive power linked by clear relationships of cause and effect. It is in this way a prime example of a classical narrative style of filmmaking, where the emphasis is on communicating story information as transparently as possible (Bordwell, 1985: 160). Our hypothesis was that subjects’ gaze behaviour would be controlled by the tightly directed sequence with its strong narrative cues, and that this study could thereby function as a benchmark against which different types of less story-driven material could be compared later.

Researcher C: A colleague and I set up the Eye Tracking and the Moving Image (ETMI) research group in 2012, following discussions around how evidence was collected to support and investigate current film theory. These conversations grew into a determination to begin a cross-disciplinary research group, initially in Melbourne, to begin working together on these ideas. I had previously been involved in research using eye tracking to study other dynamic stimuli such as decision making processes in sport and the dynamics of signature forgery and detection, and my experience led to a belief that the eye tracker could have enormous potential as a research tool in the analysis and understanding of the moving image. Work on this particular study was inspired by the early aims of a subgroup (of which the other authors are a part), whose members were interested to investigate, in a more objective manner, the effect that narrative cues had on viewer gaze behaviour.

Existing research in our disciplines, and how that influenced our approaches to the study

Researcher A: While research had already been conducted on eye tracking and the moving image, none of it had focussed on the creational aspects of screen texts: what goes into making a moving image text before it becomes a finished product to be analysed. Much like screen scholarship, which studies texts in a ‘post-event’ way, what was lacking – usefully for us – was input from those who are practitioners themselves. The wider Melbourne-based Eye Tracking and the Moving Image research group within which this study sits includes other practitioners, among them a sound designer and a filmmaker. Combined, this suggested that our approach might offer something different; that it might ‘do more’ and hopefully speak to the industry as well as to other researchers. As a screenwriter, the opportunity to co-research with scholars, scientists and other creative practitioners was therefore not only appealing but also methodologically important.

As already highlighted, it was both an academic and a practical interest in the intersection of plot, character and theme that underpinned my approach. As Smith (1995) has argued, valuing character in screen studies has not always been possible; more recently, valuing character, and in particular the character’s journey, has become more salient (see Batty, 2011; Marks, 2009), adding weight to a creative practice approach to screen scholarship. In this way, understanding the viewer’s experience of the screen seemed to lend itself well to some of the core concerns of the screenwriter; or to put it another way, it had the ability to test what we ‘know’ about creative practice, and the role of the practitioner. Feeding, then, into wider debates about the place of screenwriting in the academy (see Baker, 2013; Price, 2013; 2010), it was important to value the work of the screenwriter in a scholarly, rigorous – and hopefully innovative – way.

Researcher B: The majority of research on eye tracking and the moving image to date has been designed and undertaken as an extension to cognitive theories of film comprehension. Deriving from the constructivist school of cognitive psychology, and led by film theorist David Bordwell, this approach argues that viewers do not simply absorb but construct the meaning of a film from the data that is presented on screen. This data does not constitute a complete narrative but a series of cues that viewers process by generating inferences and hypotheses (Elsaesser and Buckland, 2002: 170). Bordwell’s approach explicitly opposes psychoanalytic film theory by attending to perceptual and cognitive aspects of film viewing rather than unconscious processes. Psychologist Tim Smith has mobilized eye tracking in connection with Bordwell’s work to demonstrate how this empirical method can “prove” cognitive theories of comprehension – showing that subjects’ eyes do fixate on those cues in a film’s mise-en-scène that the director has controlled through strategies of staging and movement (Smith, 2011; 2013).

The Up study was designed to follow in the wake of Smith’s work, with a particular interest in examining the premise of Bordwell’s theory – which is that narration is the central process that influences the way spectators understand a narrative film (Elsaesser and Buckland, 2002: 170). With this in mind, we deliberately chose a segment from an animated film, where the tightly directed narrative of the montage sequence is competing with a variety of other stimuli that subjects’ eyes could plausibly be attracted to: salient colourful and visibly designed details in the background and landscape of each shot.

We were also interested in this montage sequence for the highly affecting nature of its mini storyline, which establishes the protagonist Carl’s deep love for his wife Ellie as the motivation for his journey in Up itself. The sequence carries a great deal of emotive power by contrasting the couple’s happiness in their long marriage with Carl’s ultimate sadness and regret at not being able to fulfill their life-long dream of moving to South America before Ellie falls sick and dies. Would it be possible to ‘see’ this emotional impact in viewers’ gaze behaviour?

How we reacted to the initial data, and what it was telling us

Researcher A: When looking at data for the first time, I certainly saw a correlation between what we know about screenwriting and seeing, and what we could now turn to as evidence. For example, key objects such as the adventure book, the savings jar (see Fig. 1) and the picture of Paradise Falls – all of which recurred throughout the montage sequence – were looked at by viewers intensely, suggesting that narrative meaning was ‘achieved’.

Fig. 1. A heat map showing the collective intensity of viewers’ responses to the savings jar.


As another example, when characters were purposely (from a screenwriting perspective) separated within the frame of the action, viewers oscillated between the two, eventually settling on the one they believed to possess the most narrative meaning (see Fig. 2). This further implied the importance of the character journey and its associated sense of theme, which for screenwriting verifies the careful work that has gone into a screenplay to set up narrative expectations.

Fig. 2. A gaze plot showing the fixations and saccades of one viewer in a scene with the prominent faces of Carl and Ellie.


Researcher B: We chose to analyse the data on Up by examining how viewer attention fluctuated in focus between Carl and Ellie across the course of the montage sequence. The two are equal agents in the narrative at the beginning, but the montage’s story unfolds through the action and behaviour of each as it continues – that is, each character carries the story at different points. Overwhelmingly, the data supported this narrative pattern by showing that the majority of viewers fixated on the character who, moment by moment, functions as the agent of the story, even when that figure is not the most salient aspect of the image. Aligning with Bordwell’s cognitive theory of comprehension, this data confirms that viewers do rely principally on narrative cues to understand a film. As a top-down process of cognition, narrative exerts control over viewer attention to keep focus on the story rather than let the gaze wander to other bottom-up (salient) details in the mise-en-scène. It is this process that allowed Smith to show that viewers overwhelmingly will not notice glaring continuity errors on screen (Smith, 2005). As in the famous ‘Gorillas in our Midst’ experiment (Simons and Chabris, 1999), viewer attention is focused so closely on employing narrative schema to spatially, temporally and causally linked events that salient stimuli on screen appear to be completely missed.

Researcher C: Initially I was quite interested to see the attention paid to faces, and in particular, characters’ eyes and mouths. As this was animation, I had been keen to see if similar elements of faces would draw viewers’ eyes in the same ways that we look at human faces, where eyes and mouths are most viewed (Crouzet et al., 2010). Here, even though the characters were not engaging in dialogue, their mouths as well as their eyes were still searched. Looking at eyes has been linked to looking for contextual emotional information (Guastella et al., 2007), and so with this montage sequence being non-verbal, it was not surprising to see much of the focus on characters’ eyes as viewers attempted to read the emotion through them (see Fig. 3).

Fig. 3. Two viewers’ gaze plots depicting the sequence of fixations made between Carl and Ellie.


Other areas I was interested to observe were instances when other well-known features drew strong viewer attention, such as written text and bright (salient) objects. Two of the scenes we examined contained examples of these. In one, in which the savings jar sits at the back of a dark bookshelf, viewers were drawn both to the bright candle in the foreground and to the savings jar. The jar was in the dark; however, with narrative cues drawing attention to it, as well as the fact that it contained text, viewers were drawn to look at it (see Fig. 1). Surprisingly, although other interesting objects are easily discernible in this scene – a colourful wooden bird figure; a guitar; a compass – it was the savings jar and the bright candles that were viewed. The contextual information, the text and the salience appear to be working together here to drive the eye, all within a few seconds.

Fig. 4. Gaze plots of fixations made by all viewers over the scene in which Carl purchases airline tickets.


The second scene in which text worked as a cue for the eye was the travel shop scene (Fig. 4). Here, viewers were drawn to two text-based posters placed on the back wall of the shop. Again, this scene was only shown momentarily, yet glances towards the text and images, as well as the exchange between the characters, gave viewers the elements of the story they needed to glean what was going on, and where the story would go next (Carl’s surprise for Ellie).

How we came to understand the data better over time, and what more we began to know

Researcher A: I was interested to see that some viewers spent time looking at the periphery. The Up montage sequence did not necessarily offer ‘alternative’ layers in the margins of the screen, though given its created and controlled animated nature, it perhaps should not be a surprise that away from the centre of the screen there were visual delights, such as the sun setting over the city and a blanket of clouds that changed shape, from clouds to animals to babies. This suggested to me that in animation, because viewers know that images have been created from scratch, there is an expectation that the screen will offer a plethora of experiences, from narrative agency to visual amplification. This, in turn, suggested that in further studies, it might be useful to contrast texts that use the potential of the full screen to engage viewers with those that go in close and privilege the centre. Genre would most likely play a key role in this future endeavour.

Researcher B: As hoped, this pilot study has been instructive as a base from which we can now expand. It has raised many questions. One issue is that this data cannot ‘prove’ that subjects were not seeing those elements on screen that were not fixated upon – were they perhaps seeing them peripherally? This could only be confirmed by conducting interviews after the eye tracking takes place, and could instructively inform an understanding of how story information that is layered in the mise-en-scène (for instance in setting, lighting and costume) contributes to overall narrative comprehension. We are also very interested to determine how the context of viewing affects gaze behaviour. For instance, would subjects still fixate overwhelmingly on narrative cues when watching this sequence in a cinema environment on a large – even an IMAX – screen? In this environment the image on screen is larger and the texture more palpable. Would viewers here perhaps be more focused on these salient pleasures of the image and engage in a different, less cognitive experience of the film; letting their eyes roam across the grain of the shot in its colours, shapes and surfaces? Would results alter between an animated and live action film? Psychoanalytic film theory tells us that the cinematic apparatus promotes identification with characters and, by extension, the ideologies of the social system from which they are produced (Mulvey, 1975). Eye tracking can potentially intervene in this powerful theory of spectatorship by showing if and how viewers do fixate on the cues that give rise to this interpellation.

Researcher C: After looking at some of the early scene analyses, I was somewhat surprised by how many eye movements could be made in fleetingly fast scenes, and at how many items in these scenes one could fixate on, if only briefly. I had expected viewers to be taking in some of the surrounding items in a scene using their peripheral vision, and to see more of the centralisation bias (Tatler and Vincent, 2009). Yet for some scenes, in particular the two scenes in which Carl purchases the surprise airline tickets (see Figs 4 and 5), we see how viewers were drawn to search for narrative clues by looking around the scene.

Fig. 4. Gaze plots of fixations made by all viewers over the scene in which Carl purchases airline tickets.

Fig. 5. Gaze plot showing the fixations made by all viewers as they briefly see the contents of the picnic basket.

In the first scene (see Fig. 4), Carl is seen in a shop, facing the shop assistant. Viewers had previously seen him in the midst of coming up with a bright idea, so this scene gives them a chance to work out what that idea was. What can be seen is that most viewers scanned the surrounds for clues. A similar pattern appears in the next scene, a brief close-up glimpse of the contents of a picnic basket being carried by Carl (see Fig. 5). The basket contains picnic items and the surprise airline ticket, and even though some glances went to the other items, it was the ticket, the item holding the most narrative information, that captured most of the attention. This item was also the most salient, being the clearest and brightest item in the basket and, importantly, the only one to contain written text. In a very short glimpse of a scene, these features all but ensured that viewers’ eyes were directed to look at and acknowledge the ticket.

What excites us about the future of work in this area, and where we think it might take our own disciplines

Researcher A: If we are to fully embrace the creative practice potential of studies such as this, then we might look to creating new texts that can then be studied. If, in 1971, Norton and Stark created simple drawings to test how their subjects recognised and learned patterns, then over 40 years later, our approach might be to develop a short moving image narrative through which we can test our viewers’ gaze. For example, if we were to develop a short film and play it out of sequence (i.e., narrative meaning altered), might we affect where viewers look? Might they look differently: in different places and for different lengths of time? Similarly, what if we were to musically score a text in different ways, diegetically and non-diegetically? Might we affect the focus of viewer gaze? If so, what might this tell us about narrative attention and filmmaking techniques that sit ‘beyond the screenplay’?

For screenwriting as a discipline, studies such as these would serve two purposes, I feel. Firstly, they would help to strengthen the presence of screenwriting in the academy, especially in regard to innovative research that privileges the role of the practitioner. Accordingly, these studies could provide a variety of methodological approaches that might be of use to other screenwriting scholars, or that might be applied to other creative practice disciplines in which researchers wish to understand the work that has gone into the creation of a text that might otherwise only be studied once it has been completed. Secondly, and perhaps more importantly, such studies might yield results that benefit, or at least inform, future screenwriting practices. Whether industry-related or otherwise, just like all ‘good’ creative practice research, the insights and understandings gained would contribute to the discipline in question in the form of ‘better’ or ‘different’ ways of doing (Harper, 2007). For me, this would reflect both the nature and the value of creative practice research.

Researcher B: All of the potential avenues for future research in this field take an essential interest in how moving images on screen produce a play between top-down and bottom-up cognition. In this, a larger issue for me – going back to the points I raised at the beginning of my section – is how the data can be mobilized beyond a strictly cognitive framework and vocabulary of screen theory. As indicated, the cognitive approach offers a deliberately ‘common sense’ counterpart to a paradigm such as psychoanalysis, with its reliance on myth, desire and fantasy (Elsaesser and Buckland, 2002: 169). Cognitive theory understands a film as a data set that a viewer’s brain processes and completes in an active construction of meaning – an understanding that eye tracking and neurocinematics are very well placed to support and expand. But most screen scholars appreciate and theorize film and television texts as much more than mere sets of data. The moving image is an experience that only ‘works’ by generating emotional affect, by engaging the viewer’s attachments, memories, desires and fears. Film theorist Linda Williams proposes that our investment in following the twists and turns of a narrative is fundamentally reliant upon the emotion of pathos: we continually, pleasurably invest in the expectation that a character will act or be acted upon in such a way that they achieve their goal, and continually, pleasurably have that expectation obscured and dashed by the story (Williams, 1998). So viewer attention is driven not just by a drive to know but also by a desire to feel: to be swept up in waves of hope and disappointment.

The mini storyline of the Up montage sequence relies entirely on this dialectic of action and pathos. Carl and Ellie’s hopes are repeatedly frustrated, and Carl is finally unable to redeem this pattern before Ellie dies – producing a profound sense of pathos and regret as the defining theme of the sequence. We can see that our subjects’ fixations fell in line with this pattern as the sequence unfolded, consistently focusing on the character who was triggering or carrying the emotional power. But how do we distinguish the ‘felt’ dimension of this gaze from the viewer’s efforts to simply comprehend what is happening by following characters’ movements, facial expressions or body language? How, that is, can we ‘see’ emotional engagement, and start to appreciate how this crucial dimension of spectatorship – based on feeling not thinking – governs the play between top-down and bottom-up cognition in moving pictures? For me, grappling with this problem – and perhaps experimenting with further measurements of pupil dilation, heart rate and brain activity – offers a fascinating pathway into understanding how eye tracking can move beyond an engagement with cognitive film theory to contribute to phenomenological thinking on genuinely embodied seeing and experience.

Researcher C: There is so much that can be done in this area, and that makes it an exciting pursuit; yet what makes it even more motivating is the way that we hope to go about it: collaboratively. One of the core aspects that members of ETMI are very passionate about is working together: bringing in different fields, different disciplines, different ways of seeing things, and building bridges between them. This work is not only about learning more about how we watch and interact with films, but also about having different perspectives on those insights. Work I would personally like to see undertaken in this way would explore how black and white viewing compares to colourised viewing, and whether and how 3D viewing affects how we gaze about a scene. Comparing the gaze and emotional responses of children and adults to the same visual content, and similarly comparing visual and emotional responses between males and females, and between genre fans and haters, are also interesting possibilities.

Finally, adding to these, I am excited about the potential collection and analysis of other physiological measures to better gauge emotional engagement. These include blood pressure, pupillometry, skin conduction, breathing rate and volumes, heart rate, sounds made (gasps, holding breath, sighs etc.) and facial expressions made.


By reflecting on each of our research backgrounds, experiences and expectations, this article has revealed that while we might all have come to the study with varied approaches and intentions, we have come out of it with a somewhat surprisingly harmonious set of observations and conclusions. Without knowing it, perhaps, we were all interested in narrative and the role that characters play in its agency. We were also similarly interested in landscape and the visual potential of the screen; not in an obvious way, but in relation to subtext, meaning and emotion. The value of a study like this, then, lies not just in its methodological originality, but also in its ability to stir up passions in cross-disciplinary researchers, whereby each can bring to the table their own skills and ways of understanding data to reach mutual and respective conclusions. Although we ‘knew’ this from undertaking the study, the opportunity to reflect fully on the process in the form of an article has given us an even greater understanding of the collaborative potential of cross-disciplinary researchers such as ourselves.



Aristotle. (1996). Poetics. Trans. Malcolm Heath. London: Penguin.

Baker, Dallas. (2013). Scriptwriting as Creative Writing Research: A Preface. In: Dallas Baker and Debra Beattie (eds.) TEXT: Journal of Writing and Writing Courses, Special Issue 19: Scriptwriting as Creative Writing Research, pp. 1-8.

Batty, Craig, Adrian G. Dyer, Claire Perkins and Jodi Sita. (Forthcoming). Seeing Animated Worlds: Eye Tracking and the Spectator’s Experience of Narrative. In: CarrieLynn D. Reinhard and Christopher J. Olson (eds.). Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship. New York: Bloomsbury.

Batty, Craig. (2013) Creative Interventions in Screenwriting: Embracing Theme to Unify and Improve the Collaborative Development Process. In: Shane Strange and Kay Rozynski. (eds.) The Creative Manoeuvres: Making, Saying, Being Papers – the Refereed Proceedings of the 18th Conference of the Australasian Association of Writing Programs, pp. 1-12.

Batty, Craig. (2011). Movies That Move Us: Screenwriting and the Power of the Protagonist’s Journey. Basingstoke: Palgrave Macmillan.

Batty, Craig and Zara Waldeback. (2008). Writing for the Screen: Creative and Critical Approaches. Basingstoke: Palgrave Macmillan.

Bordwell, David. (1985). Narration in the Fiction Film. London: Routledge.

Buswell, Guy T. (1935). How People Look at Pictures. Chicago: University of Chicago Press.

Cerf, Moran, E. Paxon Frady and Christof Koch. (2009). Faces and text attract gaze independent of the task: Experimental data and computer model. Journal of Vision, 9(12): 10, pp. 1–15.

Crouzet, Sebastien M., Holle Kirchner and Simon J. Thorpe. (2010). Fast saccades toward faces: Face detection in just 100 ms. Journal of Vision, 10(4): 16, pp. 1–17.

Egri, Lajos. (2004). The Art of Dramatic Writing. New York: Simon & Schuster.

Elsaesser, Thomas and Warren Buckland. (2002). Studying Contemporary American Film: A Guide to Movie Analysis. London: Hodder Headline.

Guastella, Adam J., Philip B. Mitchell and Mark R. Dadds. (2008). Oxytocin increases gaze to the eye region of human faces. Biological Psychiatry, 63, pp. 3-5.

Harper, Graeme and Jonathan Rayner. (2010). Cinema and Landscape. Bristol: Intellect.

Harper, Graeme. (2007). Creative Writing Research Today. Writing in Education, 43, pp. 64-66.

Hayhoe, Mary. (2000). Vision using routines: A functional account of vision. Visual Cognition, 7, pp. 43–64.

Hockley, Luke. (2007). Frames of Mind: A Post-Jungian Look at Cinema, Television and Technology. Bristol: Intellect.

Land, Michael F., Neil Mennie and Jennifer Rusted. (1999). The roles of vision and eye movements in the control of activities of daily living. Perception, 28, pp. 1311–1328.

Marks, Dara. (2009). Inside Story: The Power of the Transformational Arc. London: A&C Black.

Marks, Laura U. (2002). Touch: Sensuous Theory and Multisensory Media. Minneapolis: University of Minnesota Press.

Mulvey, Laura. (1975). Visual Pleasure and Narrative Cinema. Screen, 16(3), pp. 6-18.

Norton, David, and Lawrence Stark. (1971). Scanpaths in eye movements during pattern perception. Science, 171, pp. 308–311.

Price, Steven. (2013). A History of the Screenplay. Basingstoke: Palgrave Macmillan.

Price, Steven. (2010). The Screenplay: Authorship, Theory and Criticism. Basingstoke: Palgrave Macmillan.

Rodowick, David. (2007). The Virtual Life of Film. Cambridge, MA: Harvard University Press.

Schnotz, Wolfgang and Richard K. Lowe. (2008). A unified view of learning from animated and static graphics. In: Richard K. Lowe and Wolfgang Schnotz (eds.). Learning with animation: Research implications for design. New York: Cambridge University Press, pp. 304-356.

Schütz, Alexander C., Doris I. Braun and Karl R. Gegenfurtner. (2011). Eye movements and perception: A selective review. Journal of Vision, 11(5): 9, pp. 1–30.

Simons, Daniel J. and Christopher F. Chabris. (1999). Gorillas in our Midst: Sustained Inattentional Blindness for Dynamic Events. Perception, 28, pp. 1059-1074.

Smith, Murray (1995). Engaging Characters: Fiction, Emotion, and the Cinema. Oxford: Oxford University Press.

Smith, Tim J. (2005). An Attentional Theory of Continuity Editing. [accessed October 17, 2014].

Smith, Tim J. (2011). Watching You Watch There Will Be Blood. [accessed August 22, 2014].

Smith, Tim J. (2013). Watching you watch movies: Using eye tracking to inform cognitive film theory. In: A. P. Shimamura (ed.). Psychocinematics: Exploring Cognition at the Movies. New York: Oxford University Press, pp. 165-191.

Sobchack, Vivian (1992). The Address of the Eye: A Phenomenology of Film Experience. Princeton, N.J: Princeton University Press.

Stadler, Jane (2010). Landscape and Location in Australian Cinema. Metro, 165.

Tatler, Benjamin W., and Benjamin T. Vincent. (2009). The prominence of behavioural biases in eye guidance. Visual Cognition, 17, pp. 1029–1054.

Tobii Technology (2005). User Manual. Tobii Technology AB. Danderyd, Sweden.

Vogler, Christopher (2007). The Writer’s Journey: Mythic Structure for Writers. Studio City, CA: Michael Wiese Productions.

Wells, Paul (2010). Boards, Beats, Binaries and Bricolage – Approaches to the Animation Script. In: Jill Nelmes (ed.) Analysing the Screenplay, Abingdon: Routledge, pp. 104-120.

Wells, Paul (2007) Basics Animation 01: Scriptwriting. Worthing: AVA Publishing.

Williams, Linda (1998). Melodrama Revised. In: Nick Browne (ed.). Refiguring American Film Genres: History and Theory. Berkeley, CA: University of California Press.

Yarbus, Alfred L. (1967). Eye Movements and Vision. New York: Plenum.


List of figures

Fig. 1. A heat map showing the collective intensity of viewers’ responses to the savings jar. Source: author study.

Fig. 2. A gaze plot showing the fixations and saccades of one viewer in a scene with the prominent faces of Carl and Ellie. Source: author study.

Fig. 3. Two viewers’ gaze plots depicting the sequence of fixations made between Carl and Ellie. Source: author study.

Fig. 4. Gaze plots of fixations made by all viewers over the scene in which Carl purchases airline tickets. Source: author study.

Fig. 5. Gaze plot showing the fixations made by all viewers as they briefly see the contents of the picnic basket. Source: author study.



[i] A full analysis of this study, ‘Seeing Animated Worlds: Eye Tracking and the Spectator’s Experience of Narrative’, will appear in the forthcoming collection Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship, edited by CarrieLynn D. Reinhard and Christopher J. Olson.

[ii] See Batty, Craig, Dyer, Adrian G., Perkins, Claire and Sita, Jodi (forthcoming) for full results.



Associate Professor Craig Batty is Creative Practice Research Leader in the School of Media and Communication, RMIT University, where he also teaches screenwriting. He is author, co-author and editor of eight books, including Screenwriters and Screenwriting: Putting Practice into Context (2014), The Creative Screenwriter: Exercises to Expand Your Craft (2012) and Movies That Move Us: Screenwriting and the Power of the Protagonist’s Journey (2011). Craig is also a screenwriter and script editor, with experience across short film, feature film, television and online drama.

Dr Claire Perkins is Lecturer in Film and Screen Studies in the School of Media, Film and Journalism at Monash University. She is the author of American Smart Cinema (2012) and co-editor of collections including B is for Bad Cinema: Aesthetics, Politics and Cultural Value (2014) and US Independent Film After 1989: Possible Films (forthcoming, 2015). Her writing has also appeared in journals including Camera Obscura, Critical Studies in Television, Celebrity Studies and The Velvet Light Trap.

Dr Jodi Sita is Senior Lecturer in the School of Allied Health at the Australian Catholic University. She works within the areas of neuroscience and anatomy, with expertise in eye tracking research. She has extensive experience with multiple project types using eye tracking technologies and other biophysical data. As well as her current research into viewer gaze patterns while watching moving images, she is using eye tracking to examine expertise in Australian Rules football coaches and players, and to examine the signature forgery process.

Movement, Attention and Movies: the Possibilities and Limitations of Eye Tracking? – Adrian G. Dyer & Sarah Pink


Movies often present a rich encapsulation of the diversity of complex visual information and other sensory qualities and affordances that are part of the worlds we inhabit. Yet we still know little about either the physiological or experiential elements of the ways in which people view movies. In this article we bring together two approaches that have not commonly been employed in audience studies, to suggest ways in which to produce novel insights into viewer attention: the measurement of observer eye movements whilst watching movies, in combination with an anthropological approach to understanding vision as a situated practice. We thus discuss how eye movement studies that investigate complex media such as movies need to consider some of the important principles developed for sensory ethnography, and in turn how ethnographic and social research can gain important insights into aspects of human engagement from the emergence of new technologies that can better map how an understanding of the world is constructed through sensory perceptual input. We consider recent evidence that top-down mediated effects like narrative do promote significant changes in how people attend to different aspects of a film, and thus how film media combined with eye tracking and ethnography may reveal much about how people build understandings of the world.


Seeing in complex environments is not a trivial task. Whilst people are often under the impression that you can believe what you see (Levin et al. 2000), physiological and neural constraints on how our visual system operates mean that only a very small proportion of an overall visual scene may be reliably perceived at one point in time during the evaluation of a sequence of events. The way in which we often perceive only a portion of the vast amount of visual information present in a scene is nicely illustrated in the ‘Gorillas in our Midst’ short (25s) motion sequence, in which six participants (two teams of three, dressed in white and black respectively) are filmed passing a basketball between team members (Simons and Chabris 1999). Subjects observing the film sequence are required to count the number of passes between the three players dressed in white, and whilst many subjects do correctly count the number of passes, the majority fail to observe a large gorilla (an actor in a gorilla suit) that walks into the middle of the visual field and beats its chest before walking casually out of the scene. People typically don’t see this salient gorilla in the action sequence because their attention has been directed to the team in white by the instruction to count the passes. Why do we miss such a salient object as a gorilla, and what does this mean for our understanding of how different subjects might view complex information in real life, or in presentations that encapsulate aspects of real life, such as movies?

In this article we take an interdisciplinary approach to the question of how we might see certain things in complex dynamic environments. We draw together insights from the neurosciences and eye tracking studies, with anthropological understandings of vision and audio-visual media in order to map out an approach to audience research that accounts for the relationship between human perception, vision as a form of practical activity, and the environments through which these are co-constituted. We first build a brief outline of how the eye, visual perception and the subjectivity or selectivity of viewing are currently understood from the perspective of vision sciences. This demonstrates how physiologically there is evidence that the eye sees selectively, yet it does not fully explain why or how perceptual understanding might vary across different persons, or for the same person across different contexts. We then build on this understanding with a discussion of what we may learn from eye tracking studies with moving images. As we will show, eye tracking can offer detailed measurements of how the eye attends to specific instances, movements, and points within sequences of action. This can reveal patterns of attention across a sample of participants, towards specific types of action. Yet eye tracking is limited in that while it can tell us what participants’ eyes are attending to, it cannot easily tell us why, what they are experiencing, what their affective states are, nor how their actions are shaped by the wider social, material, sensory and atmospheric environments of which they are part. Therefore in the subsequent section we turn to phenomenological anthropology, and draw on the possibilities provided by the theoretical-ethnographic dialogue that is at the core of anthropological research, to suggest how the propositions of eye tracking studies might be situated in relation to the ongoingness and movement of complex environments.

We argue that such an interdisciplinary approach, which brings together monitoring and measurement with qualitative and experiential research, is needed to generate understandings not only of what people view but of how these viewing practices and experiences become relevant as part of the ways in which they both perceive and participate in the making of everyday worlds. However, we end the article with a note about the relative complexities of working across disciplines, and in particular between those that measure and those that use empathetic and collaborative modes of understanding and knowing, which can be theorised as part of the ways film is experienced (Bordwell and Thompson 2010; Pink 2013). For a review of how these questions relate to broader debates about film culture, eye tracking and the moving image, readers are also referred to the articles in this special issue by Redmond and Batty (2015), and Smith (2015).

Visual Resolution, Perception and the Human Eye

To enable visual perception, the human eye has cone photoreceptors distributed across the retina which enable wide-field binocular visual perception of about 180 degrees (Leigh and Zee 2006). In the central fovea region of the eye, cone photoreceptors are much more densely packed, and our resulting high-acuity vision covers only about 2-3 degrees of visual angle (Leigh and Zee 2006). Visual angle is a convenient way to understand the relationship between the actual size of an object and viewing distance: for example, our foveal acuity is approximately equivalent to the width of our thumb held at about 57 cm (at this distance 1 cm represents 1 degree of visual angle). This means that to view visual information in detail it is often necessary to direct our gaze to different parts of a scene, and this is typically done with either ballistic eye movements termed saccades, or much slower smooth pursuit eye movements, as when we follow the movement of a slow object in the distance (Martinez-Conde et al. 2004). Saccades are commonly broken down into two main types that are of high value for interpreting how viewers might perceive their environment: reflexive saccades, thought mainly to be driven by image salience (also termed exogenous control), and volitional saccades (endogenous control), where a viewer’s internal decision making directs attention through top-down mechanisms to where the gaze should be attended within a scene or movie sequence (Martinez-Conde et al. 2004; Parkhurst et al. 2002; Tatler et al. 2014; Pashler 1998; Smith 2013). Thus eye movements can, in very broad terms, be described as ‘bottom-up’ when the eye makes reflexive saccades to salient stimuli within a scene, or ‘top-down’ when a viewer uses volitional control to direct where the eye should look, and both types of saccade are important for understanding how we interact with complex scenes in everyday life.
For example, on entering a café we might casually gaze at the wonderful variety of cakes with reflexive saccades to all the highly colourful icings; but when a friend says to ‘try the chocolate cake’ we direct our eyes only to cakes of chocolate brown colour using volitional saccades. Interestingly, these different types of saccadic eye movements are likely to involve different cortical processing of information (Martinez-Conde et al. 2004), potentially allowing for complex multimodal processing that incorporates the rich and dynamic environment experienced when viewing a movie. It is likely that both these mechanisms operate whilst subjects view a film, and the extent to which each mechanism dominates during a particular film sequence may depend upon factors like visual design, narrative, audio input and cinematographic style, as well as the individual experience or demographic profile of observers.
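The thumb-at-57-cm rule of thumb quoted above follows from simple trigonometry. As a minimal sketch (ours, for illustration only; not part of the study's analysis pipeline), the visual angle subtended by an object can be computed as:

```python
import math

def visual_angle_deg(object_size_cm: float, viewing_distance_cm: float) -> float:
    """Visual angle subtended by an object at a given viewing distance, in degrees."""
    return math.degrees(2 * math.atan(object_size_cm / (2 * viewing_distance_cm)))

# At 57 cm viewing distance, 1 cm subtends approximately 1 degree of visual angle,
# so a roughly 2 cm thumb held at arm's length matches the 2-3 degree fovea.
print(visual_angle_deg(1, 57))   # ≈ 1.0
print(visual_angle_deg(2, 57))   # ≈ 2.0
```

The small-angle approximation is why the 57 cm distance is convenient: at that distance the tangent is nearly linear, so centimetres map almost directly onto degrees.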

The fact that we typically only perceive the world in low resolution at any one point in time can be easily illustrated with an eye chart in which letters in different parts of our visual field are scaled to make them equally legible when a subject fixates their gaze on a central fixation spot, or simulated by selectively Gaussian blurring a photograph such that it matches how we see detail at any one point in time (Figure 1). Human subjects typically shift their gaze about 3 times a second in many real-world scenarios in order to build up a detailed representation of the visual environment (Martinez-Conde et al. 2004; Tatler et al. 2014; Yarbus 1967). To efficiently direct the fovea to different parts of a visual scene, the human eye usually makes saccades, which also require a shift of the observer’s attention (Kustov et al. 1996; Martinez-Conde et al. 2004). One way to record subject gaze is to use a video-based eye tracking system that makes use of the different reflective properties of the eye to infrared radiation (Duchowski 2003), using a wavelength of radiation that is both invisible to the test subject and does not damage the eye. This non-invasive technique thus enables very natural behavioural responses to be collected from a wide range of subjects. When the eye is illuminated by infrared light, which is typically provided by the eye tracking equipment, the light enters the lens and is strongly reflected back by the retina, providing a high-contrast signal for an infrared camera to record; some of the carefully placed infrared lights also reflect off the cornea of the eye, which provides a constant reference signal that enables eye tracker software to disentangle minor head movements from the actual eye movements of a subject. A subject is first calibrated to a grid stimulus of known spatial dimensions (Dyer et al. 2006), and then when test images are viewed it is possible to accurately quantify the different regions of a scene to which the subject pays attention, the sequence order of this attention, and thus also what features of a scene may escape the direct visual attention of a viewer (Duchowski 2003). The use of this non-invasive technique directly enables the measurement of subject attention to the different components of a stimulus (Figure 2), and has been extensively employed with static images in many fields including medicine, forensics, face processing, advertising, sport and perceptual learning (Dyer et al. 2006; Horsely 2014; Russo et al. 2003; Tatler 2014; Vassallo et al. 2009; Yarbus 1967).
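The quantification step described here (select areas of interest, then collect the number and duration of fixations within each) can be sketched in a few lines. The fixation format, coordinates and AOI names below are hypothetical illustrations, not the actual Tobii export used in the studies discussed:

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float           # gaze position on screen, in pixels
    y: float
    duration_ms: float

# Hypothetical AOIs as named rectangles: (left, top, right, bottom) in pixels.
aois = {
    "ticket": (500, 300, 700, 420),
    "basket_items": (200, 250, 500, 500),
}

def dwell_per_aoi(fixations, aois):
    """Count fixations and sum dwell time falling inside each area of interest."""
    stats = {name: {"count": 0, "dwell_ms": 0.0} for name in aois}
    for f in fixations:
        for name, (left, top, right, bottom) in aois.items():
            if left <= f.x <= right and top <= f.y <= bottom:
                stats[name]["count"] += 1
                stats[name]["dwell_ms"] += f.duration_ms
    return stats

# Three invented fixations: two land on the "ticket" AOI, one on "basket_items".
fixes = [Fixation(600, 350, 250), Fixation(300, 400, 180), Fixation(650, 400, 300)]
print(dwell_per_aoi(fixes, aois))
```

Aggregating such per-AOI counts and dwell times across all participants is what produces the heat maps and gaze plots referred to in the figures.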

Figure 1. The way our eye samples the world means that only the central fovea region is viewed in detail. The left-hand image shows letters scaled to equal legibility when a subject fixates gaze on the central dot, and the right-hand image is a photographic reconstruction of how an eye would typically resolve detail of the Sydney Harbour Bridge at one point in time.


In recent times there has been a growing appreciation that, to understand how the human visual system and brain process complex information, the use of moving images has significant advantages, since these stimuli may more accurately represent the very complex and dynamic visual environments in which we typically operate (Tatler et al. 2011). For example, when the eyes of a subject are tracked whilst driving a car, the gaze of subjects tends to be directed ahead of the responding action that a driver will take (Land and Lee 1994), and in other real-life activities like making a cup of tea, test subjects also tend to fixate on particular objects before an action like picking up an object (Land et al. 1999). This shows visual processing is often dynamic and may be influenced by the top-down volitional goals of a subject, whilst static images may not always best represent how subjects’ actions are informed by visual input in a dynamic situation (Tatler 2014). Interestingly, subjects’ capacity to visually anticipate tasks may be linked to performance or experience at a given action: elite cricket batsmen viewing action can more efficiently predict the location where a ball will bounce in advance of the event, providing significant advantages for facing fast bowling, where decisions must be made very quickly and accurately (Land and McLeod 2000). Thus there is evidence that visual perception and eye movements for moving images may be influenced by top-down mechanisms and experience, as well as by bottom-up salience-driven mechanisms of visual processing (Tatler 2014).

Subject gaze and attention in dynamic environments can also be significantly influenced by the actions of other people within a scene. For example, when viewing a simple magic trick in which an experienced magician in a video waves a hand to make an object disappear, the gaze direction of subjects viewing the video is heavily influenced by the actual gaze direction of the magician in the video clip (Tatler and Kuhn 2007). If the magician appears to pay attention to his waving hand, then subjects follow this misdirection of viewer attention, the sleight performed with the other hand goes undetected, and the magic trick is successful. However, this pattern changes if the magician’s gaze attends to the hand performing the apparent magical act: the trick is then readily detected by observers. This simple but highly effective demonstration shows how viewer experience is driven not only by reflexive bottom-up salience signals present in complex images; several top-down and/or contextual factors may also influence visual behaviour. The effect of dynamic complex environments on subject eye movements has also been observed in demonstrations of how people might encounter each other and either divert or attend their gaze depending upon prior experience, the perception of threat and/or the chance of a collision (Jovancevic-Misic and Hayhoe 2009). Other evidence of top-down influences on observer gaze behaviour comes from our understanding of how instructions or narrative may influence where a subject looks (Land and Tatler 2009; Tatler 2014; Yarbus 1967). For example, in the classic eye movement experiments by Yarbus (1967), in which static images were presented to test subjects, a variety of different instructions were provided for viewing the painting ‘The Unexpected Visitor’ by Ilya Repin.
These instructions included estimating the material circumstances of subjects within the painting, or the age of the subjects, the subject’s clothing; and a very different set of saccades and fixations was observed for different instructions or a free view situation that might be taken as a condition mainly driven by bottom up salience factors on perception; showing that top down view goals strongly influenced the way in which eye gaze is directed (Tatler 2014; Yarbus 1967).

Eye Tracking For Understanding Dynamic and Complex Visual Information

Whilst these clever, and comparatively complex, evaluations of visual perception are teaching us a great deal about human visual performance and viewer experience, rapid advances in computer technology and eye tracking are now starting to enable the testing of how subjects view the very complex dynamic environments encapsulated in movies (Mital et al. 2011; Smith and Henderson 2008; Smith and Mital 2013; Smith et al. 2012; Treuting 2006; Vig et al. 2009). This potentially allows for new insights into increasingly real-world viewer experience, into how the visual system processes very complex information, and into how viewers from different demographics may interpret the information content of films. For example, some recent work has examined viewer attention within movies and observed high levels of attention to faces (Treuting 2006), a behaviour consistent with previous work that used static images (Vassallo et al. 2009; Yarbus 1967); a wealth of opportunities is becoming available for better understanding real-world visual processing.

Figure 2. When we view an image our eyes often fixate on key areas of interest for short periods of about a third of a second, and the eyes may then make ballistic shifts (saccades) to other features. When a typical subject viewed sequential images from the film Up, fixations (green circles) mainly centred on the faces of the main characters, whilst the lines between fixations show the direction of the respective saccades [image from Craig Batty, Adrian Dyer, Claire Perkins and Jodi Sita, Seeing Animated Worlds: Eye Tracking and the Spectator's Experience of Narrative (Bloomsbury, 2015), with permission].

A current issue in interpreting eye movement data for subjects viewing a film is how such a large volume of data can be managed and statistically separated to interpret viewer experience. One initial solution is gaze plot analysis, which shows the aggregate attention of a number of subjects to a particular scene (Fig. 3). Investigations on still images using gaze plot analyses have indicated a strong central bias that is largely independent of factors like subject matter or composition (Tatler 2007). Studies on moving images appear to confirm a tendency to view restricted parts of the overall image in detail (Dorr et al. 2010; Goldstein et al. 2007; Mital et al. 2011; Smith and Henderson 2008; Tosi et al. 1997). This may hold important implications for data-compression algorithms, where large amounts of image data are streamed to a variety of mobile viewing devices: given the resolution of the human eye (Fig. 1), certain information does not have to be displayed at high resolution, and certain parts of a movie might even be modified to enhance the viewing experience for visually impaired viewers (Goldstein et al. 2007). Despite the qualitative value of gaze plot displays, quantitative analyses are better facilitated by allocating Areas of Interest (Fig. 4) to components of a scene that are hypothesised to be of high value for testing different theories about the processing of moving images. For example, one of the current issues in understanding how eye tracking can inform film culture, and how movies can be a useful stimulus for understanding visual behaviour, is finding a method that can explore the potential effects of narrative, a hypothesised top-down or endogenous control on gaze behaviour, while subjects freely view a movie so as to enable natural behaviour (Smith 2013).
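To make the Area-of-Interest approach concrete, the following is a minimal sketch, in Python, of how fixation counts and total dwell time can be aggregated per AOI. The fixation records and AOI rectangles are invented for illustration; this is not the study's actual analysis pipeline, which used proprietary Tobii software.

```python
# Minimal sketch of Area-of-Interest (AOI) analysis for eye tracking data.
# All fixation records and AOI rectangles below are hypothetical examples,
# not data from the study discussed in the text.

# Each fixation: (x, y) screen position in pixels and duration in ms.
fixations = [
    (320, 240, 310),  # a fixation of roughly a third of a second
    (600, 210, 280),
    (315, 250, 350),
    (100, 420, 220),  # falls outside both AOIs below
]

# AOIs as named rectangles: (x_min, y_min, x_max, y_max).
aois = {
    "face_carl": (250, 180, 400, 320),
    "face_ellie": (550, 150, 700, 300),
}

def aoi_stats(fixations, aois):
    """Count fixations and sum dwell time (ms) falling inside each AOI."""
    stats = {name: {"count": 0, "dwell_ms": 0} for name in aois}
    for x, y, dur in fixations:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                stats[name]["count"] += 1
                stats[name]["dwell_ms"] += dur
    return stats

print(aoi_stats(fixations, aois))
```

Fixation number and total dwell time per AOI are exactly the measures the pilot study collected for its selected areas of interest; commercial eye tracking software computes them in essentially this way.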

FIGURE 3: Gaze plot showing the aggregate attention of a number of viewers (n=12) to a particular scene. In this case faces capture most attention, consistent with previous reports (Yarbus 1967; Vassallo et al. 2009). [image from Craig Batty, Adrian Dyer, Claire Perkins and Jodi Sita, Seeing Animated Worlds: Eye Tracking and the Spectator's Experience of Narrative (Bloomsbury, 2015), with permission].

Recently one study has tackled this question by using a montage sequence from the animated film Up (Pete Docter, 2009) to explore whether it is possible to collect empirical evidence that supports modulation between bottom-up and top-down mechanisms. The montage is a high-value study case: it encapsulates a lifetime of narrative within a 262-second film sequence that contains no dialogue (Batty 2011), and the overall salience of the two principal characters, Carl and Ellie, is fairly consistently matched due to the control exhibited in animation production. In the initial opening scene in which these two characters are first encountered, viewers devoted an almost identical percentage of time to Carl and Ellie respectively; however, as the montage unfolds with a life-story narrative of marriage, dreams of children, miscarriage, dreams of travel, illness and death, there is a significant difference in the amount of attention paid to the respective characters at different stages of the montage (Batty et al. 2014). This suggests that the influence of top-down processing on the overall salience of complex images, as has been observed in some studies using short motion displays under laboratory conditions (Jovancevic-Misic and Hayhoe 2009; Tatler and Kuhn 2007; Tatler 2014), is a promising avenue of investigation for movie studies, provided that protocols can be designed to control the many factors that influence image salience (Parkhurst et al. 2002; Martinez-Conde et al. 2004; Tatler et al. 2014).

FIGURE 4. Areas of interest can be programmed to quantify the number and respective duration of fixations on key components within a scene of a movie, which may allow for the dissection of how factors like narrative influence viewer behaviour [from Craig Batty, Adrian Dyer, Claire Perkins and Jodi Sita, Seeing Animated Worlds: Eye Tracking and the Spectator's Experience of Narrative (Bloomsbury, 2015), with permission].
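The kind of character-by-character comparison described above, i.e. the proportion of viewing time allocated to each character at different stages of the montage, can be sketched as follows. The dwell-time numbers here are invented purely for illustration and are not the study's results.

```python
# Hypothetical total dwell times (ms) for two character AOIs at two
# montage stages. Values are invented for illustration only; they are
# not the figures reported by the study discussed in the text.
dwell = {
    "opening": {"carl": 4100, "ellie": 4000},   # near-identical attention
    "illness": {"carl": 2100, "ellie": 5900},   # attention shifts markedly
}

def attention_share(stage_dwell):
    """Convert raw dwell times into each character's share of attention."""
    total = sum(stage_dwell.values())
    return {name: ms / total for name, ms in stage_dwell.items()}

for stage, times in dwell.items():
    shares = attention_share(times)
    print(stage, {name: round(share, 2) for name, share in shares.items()})
```

Normalising to shares rather than comparing raw milliseconds matters because scenes differ in length; a shift in the ratio between characters across stages is what would indicate a narrative (top-down) influence on gaze.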

Yet because eye tracking can only tell us part of the story (what people look at, not how and why these ways of looking emerge and are enacted), other qualitative research approaches, such as those used in visual and sensory ethnography (Pink 2013; Pink 2015), are needed to put eye tracking data into context. This involves approaching viewing, and the practices of vision it entails, as situated activities, and as part of a broader experiential repertoire beyond the eye. The subjectivity and selectivity of viewing that the studies outlined above have evidenced, once documented and measured, can only be properly understood as emergent from particular (and always complex) environmental conditions and embodied experiences. In the next section we therefore turn to anthropological approaches to vision and the environment in order to show how this might be achieved. Before proceeding, however, we note that when working across disciplines there is inevitably a certain amount of conceptual slippage. Here this means that whereas the previous paragraph ended by suggesting that eye tracking enables our understanding of how complex environmental information is processed, in the next sections we refigure this way of thinking to consider how human perception and viewer experience are constituted in relation to the affordances of complex environments of which they are also part.

Situating Viewing as Part of Complex Environments

The environment, as a concept, is slippery, and is used to different empirical and political ends in different contexts. As the anthropologist Tim Ingold has emphasised, in contemporary discourses 'the environment' is often referred to as an entity, as something we exist separately from. Indeed, this idea is present in our discussion above, in which we considered how eye tracking studies might better show us how we process complex environmental information. As Ingold expresses it, this means we are 'inclined to forget that the environment is, in the first place, a world we live in, and not a world we look at'. He argues that 'We inhabit our environment: we are part of it; and through this practice of habitation it becomes part of us too' (Ingold 2011). Following this approach, the environment can be understood as an ecology that humans are part of, and with which we, and the ways we view, see and experience, are mutually constituted. This does not simply mean that 'we' as humans are encompassed by the environment; it means that the environment is co-constituted by us and our relationships with its other constituents, which for our purposes in this article include film, images, art, technologies, other humans, the weather and the built environment (as well as much more). As advanced by Ingold (2000, 2010) and the art historian Barbara Stafford (2006), approaches that critique linguistic and semiotic studies invite an analysis which acknowledges that, as Stafford puts it, 'when you open your eyes and actively interrogate the visual scene, what you see is that aspect, or the physical fragments, of the environment that you perform' (Stafford 2006). This means that the experience of film does not simply involve us looking at something external to us; rather, it is through the affordances of film, in relation to the other constituents of our environments/worlds, that viewing becomes meaningful to us.
In this interpretation the use of 'we' derives from the development of a universal theory of human perception and our relationship to a (complex) environment. Yet, as we explain in the next section, this rendering does not dismiss the idea that different people may often perceive the same information differently; to the contrary, it invites us to study precisely how and why difference emerges.

If we take Ingold’s approach further, to focus on how meanings are generated through our engagements with and experiences of visual images, we can gain an appreciation of how the measurements and patterns that emerge from eye tracking studies are materialisations or representations of more than just how the eye (or the mind) responds to the moving image. Rather, they can be understood as standing for (but not actually explaining the meaning of) what people do with the moving image. Building on philosophical and other traditions emerging from the work of Merleau-Ponty, Gibson and Jonas, Ingold has argued that human perception, learning and knowing emerge from movement, specifically as we move through our environments and engage with the affordances of the other things and processes we encounter (Ingold 2000). With regard to art, he has used this approach to ask:

Should the drawing or painting be understood as a final image to be inspected and interpreted, as is conventional in studies of visual culture, or should we rather think of it as a node in a matrix of trails to be followed by observant eyes? Are drawings or paintings of things in the world, or are they like things in the world, in the sense that we have to find our ways through and among them, inhabiting them as we do the world itself? (Ingold 2010: 16)

If we transfer this idea to the question of how we view film, we might then ask how we, as viewers, inhabit film, and what eye tracking studies can tell us about these forms of habitation. If we consider the relationship established between the viewer's eyes and the film by eye tracking visualisations such as those demonstrated in the earlier sections of this article, we can begin to think of how the movement of the eye and the movement of the film become entangled. Indeed, while the film and the eye will both inevitably continue to move, the question becomes not simply how the composition and action on the screen influence the movement of the eye, but rather how the eye selects the aspects of the composition and action of the screen with which to move. By taking this perspective, we are able to remove something of the technological determinism that underpins assumptions that eye tracking studies might enable film and advertising organisations to better influence viewing behaviours. Instead it directs us towards considering what eye tracking studies might tell us about what people do when they view, and how this can inform us about how they inhabit a world in which film, and more generally the moving image, is a ubiquitous presence.

The work and arguments discussed thus far in this section have focused on the question of how, at a general level, people see when they are viewing moving images. The theories advanced so far, however, neither explain nor discuss the usefulness of attending to the patterning of eye tracking studies. Moreover, the eye tracking examples and visualisations shown in the earlier sections of this article were undertaken with a sample of people who were likely to have similar viewing perspectives, and who therefore, as might be expected, showed distinct patterns in the ways they viewed particular information. Indeed, the data that would be needed to tell us to what extent such viewing patterns are universal (that is, supported by studies and theories of the ways in which the human brain processes information) and to what extent they are situationally and biographically constituted for this particular group of participants does not, as far as we know, yet exist. Such work would be of high value given the increasing globalisation of both entertainment industries and forms of activism that use visual media, where films may be distributed in markets distant from the original context in which audience experience is understood. Indeed, studies of how people learn to look and know, undertaken in culturally specific contexts, reveal that where we look and what we see is contingent on processes of learning and apprenticeship, and therefore specific to complex environments.

Vision, Learning and Knowing

Eye tracking studies have shown us that there are sometimes similarities and patterns in the ways people view and remember complex images (Norton and Stark 1971), although where present such patterns are easily changed through instruction (Yarbus 1967; Tatler 2014). We have seen in the earlier sections of this article that participants in studies consistently fixate on the faces of film characters (Figs 2, 3), and that visual attention may become focused on a film character whose story line commands (or affords) particularly powerful affective and/or empathetic connections for viewers. Further eye tracking research would be needed to underpin any proposal that such ways of viewing are gendered and culturally specific; however, existing research in visual and media anthropology indicates that this is likely to be the case. Two bodies of literature are relevant here: first, the applied visual and media anthropology literature, and second, the anthropology of vision.

Applied visual and media anthropology studies (Pink 2007) focus on using anthropological understandings of media, along with audiovisual interventions (often in the form of filmmaking processes and film products), to work towards new forms of social and public awareness and societal change. This work draws on and advances a strand in film studies developed in the work of Laura Marks, who has advanced the idea of the 'embodied viewing experience' (2000: 211). Marks, whose work focuses on intercultural cinema, has argued that as 'a mimetic medium' cinema is 'capable of drawing us into sensory participation with its world' (Marks 2000: 214). The notion of empathy as a route towards creating intercultural understanding through film is also increasingly popular in the visual anthropology literature (discussed in Pink 2015). While on the whole there has been insufficient research into the ways in which people view intervention films of this kind, one example suggests that viewer attention, and importantly viewers' capacity to engage with and remember film narrative, can depend on the ways in which they are able to affectively or empathetically engage with the experiences of film characters. Susan Levine's media anthropology study of how viewers discussed a film made as part of a South African HIV/AIDS intervention campaign, which drew on local narratives to communicate its central message, is a good example (Levine 2007). Levine (unsurprisingly) found that participants engaged with the stories of film characters that followed locally relevant narratives, thus generating important lessons for filmmaking campaigns of this kind, where it is often difficult to communicate generic health messages to local audiences. The bridge between this type of anthropological understanding and a capacity to map viewer attention to faces and expressions within visual representations (Vassallo et al. 2009) may allow for more comprehensive understandings of why film is such a powerful medium for communication.

Anthropological studies of vision provide further evidence of the importance of attending to how seeing is situated. Indeed, when vision is understood as a practice, rather than as a behaviour, it is not just a situated practice but a practice that is learned through participation. The anthropologist Cristina Grasseni has developed a theory of what she calls 'skilled vision' through which to explain this (Grasseni 2004, 2007, 2011). As she puts it:

The “skilled visions” approach considers vision as a social activity, a proactive engagement with the world, a realm of expertise that depends heavily on trained perception and on a structured environment (Grasseni 2011).

Emphasising that skilled visions are 'positional, political and relational' as well as sensuous and corporeal, Grasseni points out that 'Because skilled visions combine aspects of embodiment (as an educated capacity for selective perception) and of apprenticeship, they are both ecological and ideological, in the sense that they inform worldviews and practice' (Grasseni 2011). As Pink has shown through her work on the Spanish bullfight, what one sees when viewing the performance is highly contingent on how one has learned to view, on one's own empathetic, embodied ways of sensorially and affectively 'feeling' the performance, and on how one's existing ways of knowing and understanding the world inform perception (Pink 1997; 2011). For example, consider the different ways in which Figure 5, or a film sequence of the same performance, would be interpreted by a bullfighting fan and by an animal rights activist. Each will have learned how and what to know about this performance through different trajectories. Whilst an eye tracking investigation of the respective subjects might show somewhat similar patterns (especially if bottom-up mechanisms dominate), the semantic interpretation of the visual input by the respective viewers may be completely different. How such information content might be assessable, or not, through the evaluation of bottom-up or top-down mechanisms of visual processing will be a major challenge for the interpretation of information as complex as that typically perceived in a movie.


Figure 5. How emotive content, common in many films, may influence the perception of visual images, even when the same information is presented to all viewers, remains a major topic for exploration. For example, we know that the bullfight is interpreted, and affectively experienced, very differently when viewed by bullfight fans and by animal rights activists. We also know that learning how to view the bullfight, as a bullfight fan, is a process of cultural apprenticeship (see for example Pink 1997). Consider how, for the above image, the action of a bullfight could prompt very different visual behaviour depending upon cultural context, whether a subject was a bullfighting fan or an animal rights activist, and whether the representation was depicted as animation instead of real life, or as motion compared to a still image. Copyright: Sarah Pink.

Bringing together measurement and monitoring data with anthropologically informed ethnographic ways of knowing, which are always collaboratively crafted and sensorially and tacitly known, is increasingly common. For instance, in energy research a number of projects seek to combine ethnographic and energy-consumption measurement data (Cosar et al. 2013). Such an approach has not yet been integrated into eye tracking studies of movies, yet this would be the next step if we wanted to better understand the significance and relevance, for understanding film audiences, of the types of data and knowledge that eye tracking studies can offer. This presents certain challenges, which impinge on, but are not necessarily unique to, the use of eye tracking data in audience research. The first challenge is to generate sufficient interdisciplinary understanding between the approaches involved. This article has intended to initiate that process. That is, it has explained how eye tracking and anthropological-ethnographic (that is, at once theoretical and practical) approaches offer different, and differently theorised, perspectives on the ways in which people look at and participate in the viewing of film. It has simultaneously suggested that these different approaches and disciplines offer something to each other that enables new questions to be asked, and therefore deeper understandings of how audiences view film to be developed.

Future work testing human visual behaviour with complex stimuli of the kind typically present in movies may help build our understanding of how humans sometimes process very complex information to build an understanding of the surrounding world, but also sometimes miss salient information in complex moving images, as in the 'Gorillas in our midst' study. Current theories suggest that perceptual blindness to salient and recognisable stimuli occurs when our attention is captured by other competing stimuli that impose a cognitive load to process (Simons and Chabris 1999; Levin et al. 2000; Memmert 2006), but more fully exploring the effects of narrative or instructions, character gaze and other potential top-down mechanisms will likely make fruitful contributions to our knowledge of perceptual blindness. Indeed, as discussed above in relation to anthropological and ethnographic factors, factors like experience do appear to modulate the ability of subjects to detect a gorilla in a perceptual blindness test (Memmert 2006), suggesting that future investigations of eye tracking and movies should consider the broad range of human experience that can influence our perception. This type of research is also likely to provide richer understandings in some ethnographic studies, as researchers will have, possibly for the first time, access to precise quantitative data on whether an observer actually failed to even look at certain objects in a scene, or indeed whether such information, like an unexpected gorilla in a basketball game, was viewed but not directly perceived (Memmert 2006). Many individual scenes within a film are short, of about 4 seconds' duration, and so it is often only possible for viewers to process a small percentage of the entire visual presentation in detail, especially where movies are subtitled (Smith 2013).
This means that elements of a film that might be essential to the complete comprehension of a narrative story line may easily be missed by a percentage of an audience, depending upon their individual knowledge base, linguistic skills, attention and motivation. Eye tracking thus potentially offers film makers a useful vehicle for testing different demographic groups, to better understand how different components of scenes might be constructed to enhance viewer experience, and to build our understanding of how we process very complex environmental information.



Acknowledgements. We are very grateful to Dr Craig Batty, Dr Claire Perkins and Dr Jodi Sita for discussions and permission to use images from their collaborative work with one of us (AGD), and for broader discussions with members of the Eye Tracking of the Moving Image research group. AGD acknowledges funding support from the Australian Research Council (LE130100112) for eye tracking equipment. We are grateful to Dr Lalina Muir for her careful proofreading of the manuscript.



Batty, Craig. 2011. Movies That Move Us: Screenwriting and the Power of the Protagonist’s Journey. Basingstoke: Palgrave Macmillan.

Batty, Craig, Dyer, Adrian, G., Perkins, Claire, and Sita, Jodi. 2015. Seeing Animated Worlds: Eye Tracking and the Spectator’s Experience of Narrative (Palgrave, forthcoming).

Cosar Jorda, P, Buswell, RA, Webb, LH, Leder Mackley, K, Morosanu, R, and Pink, Sarah. 2013. ‘Energy in the home: Everyday life and the effect on time of use.’ In The Proceedings of the 13th International Conference on Building Simulation 2013. Chambery, France. 25-28/8/2013.

Docter, P. 2009. Up. Disney-Pixar Motion Film.

Dorr, M, Martinetz, T, Gegenfurtner, KR, and Barth, E. 2010. ‘Variability of eye movements when viewing dynamic natural scenes.’ Journal of Vision 10 (28): 1-17.

Duchowski, Andrew. 2003. Eye tracking methodology: theory and practice. London: Springer-Verlag.

Dyer, Adrian, G., Found, Brian, and Rogers, Doug. 2006. ‘Visual attention and expertise for forensic signature analysis.’ Journal of Forensic Science 51: 1397–1404.

Goldstein, Robert, B., Woods, Russell,L., and Peli, Eli. 2007. ‘Where people look when watching movies: Do all viewers look at the same place?’ Computers in Biology and Medicine 37 (7): 957-964.

Grasseni, Cristina. 2004. ‘Video and ethnographic knowledge: skilled vision in the practice of breeding.’ In Working Images, edited by S Pink, L Kürti, and AI Afonso, 259-288. London: Routledge.

Grasseni, Cristina. 2007. Skilled Visions. Oxford: Berghahn.

Grasseni, Cristina. 2011. ‘Skilled Visions: Toward an Ecology of Visual Inscriptions.’ In Made to be Seen: Perspectives on the History of Visual Anthropology, edited by M. Banks and J. Ruby. Chicago: University of Chicago Press.

Horsley, Mike. 2014. ‘Eye Tracking as a Research Method in Social and Marketing Applications.’ In Current Trends in Eye Tracking Research, edited by M Horsley et al., 179-182. London: Springer.

Ingold, Tim. 2000. The Perception of the Environment. London: Routledge.

Ingold, Tim. 2010. ‘Ways of mind-walking: reading, writing, painting.’ Visual Studies, 25 (1): 15–23

Ingold, Tim. 2011. Being Alive. Oxford: Routledge. p 95.

Jovancevic-Misic, Jelena, and Hayhoe, Mary. 2009. ‘Adaptive Gaze Control in Natural Environments.’ Journal of Neuroscience 29 (19): 6234–6238. DOI:10.1523/JNEUROSCI.5570-08.2009.

Kustov, Alexander, A., and Robinson, David Lee. 1996. ‘Shared neural control of attentional shifts and eye movements.’ Nature 384: 74–77.

Levine, Susan. 2007. ‘Steps for the Future: HIV/AIDS, Media Activism and Applied Visual Anthropology in Southern Africa.’ In Visual Interventions, edited by S. Pink, 71-89. Oxford: Berghahn.

Marks, Laura. 2000. The Skin of the Film: Intercultural Cinema, Embodiment, and the Senses. Durham and London: Duke University Press

Martinez-Conde, Susana, Macknik, Stephen, L., and Hubel, David, H. 2004. ‘The role of fixational eye movements in visual perception.’ Nature Neuroscience 5: 229–240.

Memmert, Daniel. 2006. ‘The effects of eye movements, age, and expertise on inattentional blindness.’ Consciousness and Cognition 15 (3): 620–627.

Mital, Parag, K., Smith, Tim,J., Hill, Robin, L., and Henderson, John, M. 2011. ‘Clustering of gaze during dynamic scene viewing is predicted by motion.’ Cognitive Computation 3, 5–24.

Nodine, Calvin, F., Mello-Thoms, Claudia, Kundel, Harold, L., and Weinstein, Susan, P. 2002. ‘Time course of perception and decision making during mammographic interpretation.’ American Journal of Roentgenology 179: 917–923.

Norton, David, and Stark, Lawrence. 1971. ‘Scanpaths in eye movements during pattern perception.’ Science 171: 308–311.

Parkhurst, Derrick, Law, Klinton, and Niebur, Ernst. 2002. ‘Modeling the role of salience in the allocation of overt visual attention.’ Vision Research 42: 107–123.

Pashler, Harold. 1998. Attention. Hove, UK: Psychology Press Ltd.

Di Russo, Francesco, Pitzalis, Sabrina, and Spinelli, Donatella. 2003. ‘Fixation stability and saccadic latency in elite shooters.’ Vision Research 43: 1837–1845.

Pink, Sarah. 1997. Women and Bullfighting. Oxford: Berghahn.

Pink, Sarah. 2007. (ed) Visual Interventions. Oxford: Berghahn.

Pink, Sarah. 2011. ‘From Embodiment to Emplacement: re-thinking bodies, senses and spatialities.’ In Sport, Education and Society (SES), special issue on New Directions, New Questions. Social Theory, Education and Embodiment 16(34): 343-355.

Pink, Sarah. 2013. Doing Visual Ethnography, 3rd edition. London: Sage.

Pink, Sarah. 2015 Doing Sensory Ethnography, 2nd edition London: Sage.

Simons, Daniel J., and Chabris, Christopher F. 1999. ‘Gorillas in our midst: sustained inattentional blindness for dynamic events.’ Perception 28 (9): 1059–1074.

Smith, Tim J., and Henderson, John M. 2008. ‘Edit blindness: The relationship between attention and global change blindness in dynamic scenes.’ Journal of Eye Movement Research 2: 1–17.

Smith, Tim J. 2013. ‘Watching you watch movies: Using eye tracking to inform cognitive film theory.’ In Psychocinematics: Exploring Cognition at the Movies, edited by A. P. Shimamura, 165–191. New York: Oxford University Press.

Smith, Tim J., Levin, Daniel, and Cutting, James. 2012. ‘A Window on Reality: Perceiving Edited Moving Images.’ Current Directions in Psychological Science 21 (2): 107–113. doi:10.1177/0963721412437407.

Smith, Tim J., and Mital, Parag K. 2013. ‘Attentional synchrony and the influence of viewing task on gaze behaviour in static and dynamic scenes.’ Journal of Vision 13 (8): 16.

Stafford, Barbara Maria. 2006. Echo Objects: the Cognitive Work of Images. Chicago: University of Chicago Press.

Tatler, Ben W. 2007. ‘The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions.’ Journal of Vision 7 (14): 4, 1–17. http://www.journalofvision.org/content/7/14/4, doi:10.1167/7.14.4.

Tatler, Ben W. 2014. ‘Eye Movements from Laboratory to Life.’ In Current Trends in Eye Tracking Research, edited by Horsley et al., 17–35.

Tatler, Ben W., and Kuhn, Gustav. 2007. ‘Don’t look now: The magic of misdirection.’ In Eye Movements: A Window on Mind and Brain, edited by R. van Gompel, M. Fischer, W. Murray and R. Hill, 697–714. Amsterdam: Elsevier.

Tatler, Ben W., Hayhoe, Mary M., Land, Michael F., and Ballard, Dana H. 2011. ‘Eye guidance in natural vision: Reinterpreting salience.’ Journal of Vision 11 (5): 1–23. http://www.journalofvision.org/content/11/5/5, doi:10.1167/11.5.5.

Tatler, Ben W., Kirtley, Claire, Macdonald, Ross G., Mitchell, Katy M. A., and Savage, Steven W. 2014. ‘The Active Eye: Perspectives on Eye Movement Research.’ In Current Trends in Eye Tracking Research, 3–16. doi:10.1007/978-3-319-02868-2_16.

Tosi, Virgilio, Mecacci, Luciano, and Pasquali, Elio. 1997. ‘Scanning eye movements made when viewing film: Preliminary observations.’ International Journal of Neuroscience 92 (1/2): 47–52.

Treuting, Jennifer. 2006. ‘Eye tracking and cinema: A study of film theory and visual perception.’ SMPTE Motion Imaging Journal 115 (1): 31–40.

Vassallo, Suzanne, Cooper, Sian L. C., and Douglas, Jacinta M. 2009. ‘Visual scanning in the recognition of facial affect: Is there an observer sex difference?’ Journal of Vision 9: 1–10.

Vig, Eleonora, Dorr, Michael, and Barth, Erhardt. 2009. ‘Efficient visual coding and the predictability of eye movements on natural movies.’ Spatial Vision 22 (2): 397–408.

Yarbus, Alfred L. 1967. Eye Movements and Vision. New York: Plenum.

Adrian Dyer is an Associate Professor in Media and Communication at RMIT University (Australia) investigating vision in complex environments. He is an Alexander von Humboldt Fellow (Germany) and a Queen Elizabeth II Fellow (Australia), and has completed postdoctoral positions at La Trobe University and Monash University (Australia), Cambridge University (UK), and Wuerzburg and Mainz Universities (Germany).

Sarah Pink is Professor of Design and Media Ethnography at RMIT University (Australia). She is visiting/guest Professor at Halmstad University (Sweden), Loughborough University (UK), and Free University Berlin (Germany). Her most recent books include Situating Everyday Life (2012), Doing Visual Ethnography 3rd edition (2013) and Doing Sensory Ethnography 2nd edition (2015).