Read, Watch, Listen: A commentary on eye tracking and moving images – Tim J. Smith

Abstract

Eye tracking is a research tool that has great potential for advancing our understanding of how we watch movies. Questions such as how differences in the movie influence where we look and how individual differences between viewers alter what we see can be operationalised and empirically tested using a variety of eye tracking measures. This special issue collects together an inspiring interdisciplinary range of opinions on what eye tracking can (and cannot) bring to film and television studies and practice. In this article I will reflect on each of these contributions with specific focus on three aspects: how subtitling and digital effects can reinvigorate visual attention, how audio can guide and alter our visual experience of film, and how methodological, theoretical and statistical considerations are paramount when trying to derive conclusions from eye tracking data.

 

Introduction

I have been obsessed with how people watch movies since I was a child. All you have to do is turn and look at an audience member’s face at the movies or at home in front of the TV to see the power the medium holds over them. We sit enraptured, transfixed and immersed in the sensory patterns of light and sound projected back at us from the screen. As our physical activity diminishes our mental activity takes over. We piece together minimal audiovisual cues to perceive rich otherworldly spaces, believable characters and complex narratives that engage us mentally and move us emotionally. As I progressed through my education in Cognitive Science and Psychology I was struck by how little science understood about cinema and the mechanisms filmmakers used to create this powerful experience.[i] Reading the film literature, listening to filmmakers discuss their craft and excavating gems of their craft knowledge I started to realise that film was a medium ripe for psychological investigation. The empirical study of film would further our understanding of how films work and how we experience them but it would also serve as a test bed for investigating complex aspects of real-world cognition that were often considered beyond the realms of experimentation. As I (Smith, Levin & Cutting, 2010) and others (Anderson, 2006) have argued elsewhere, film evolved to “piggy back” normal cognitive development and use basic cognitive tendencies such as attentional preferences, theory of mind, empathy and narrative structuring of memory to make the perception of film as enjoyable and effortless as possible. By investigating film cognition we can, in turn advance our understanding of general cognition. But to do so we need to step outside of traditional disciplinary boundaries concerning the study of film and approach the topic from an interdisciplinary perspective. This special issue represents a highly commendable attempt to do just that.

By bringing together psychologists, film theorists, philosophers, vision scientists, neuroscientists and screenwriters this special issue (and the Melbourne research group that most contributors belong to) provides a unique perspective on film viewing. The authors included in this special issue share my passion for understanding the relationship between viewers and film but this interest manifests in very different ways depending on their perspectives (see Redmond, Sita and Vincs, this issue, for a personal journey into eye tracking similar to that presented above). By focussing on viewer eye movements the articles in this special issue provide readers from a range of disciplines a way into the eye tracking investigation of film viewing. Eye tracking (as comprehensively introduced and discussed by Dyer and Pink, this issue) is a powerful tool for quantifying a viewer’s experience of a film, comparing viewing behaviour across different viewing conditions and groups as well as testing hypotheses about how certain cinematic techniques impact where we look. But, as is rightly highlighted by several of the authors in this special issue, eye tracking is not a panacea for all questions about film spectatorship.

Like all experimental techniques it can only measure a limited range of psychological states and behaviours and the data it produces does not say anything in and of itself. Data requires interpretation. Interpretation can take many forms[ii] but if conclusions are to be drawn about how the data relates to psychological states of the viewer this interpretation must be based on theories of psychology and ideally confirmed using secondary/supporting measures. For example, the affective experience of a movie is a critical aspect which cognitive approaches to film are often wrongly accused of ignoring. Although cognitive approaches to film often focus on how we comprehend narratives (Magliano and Zacks, 2011), attend to the image (Smith, 2013) or follow formal patterns within a film (Cutting, DeLong and Nothelfer, 2010), several cognitivists have focussed in depth on emotional aspects (see the work of Carl Plantinga, Torben Grodal or Murray Smith). Eye tracking is the perfect tool for investigating the impact of immediate audiovisual information on visual attention but it is less suitable for measuring viewer affect. Psychophysiological measures such as heart rate and skin conductance, neuroimaging methods such as fMRI or EEG, or even self-report ratings may be better for capturing a viewer’s emotional responses to a film, as has been demonstrated by several research teams (Suckfull, 2000; Raz et al, 2014). Unless the emotional state of the viewer changes where they look or how quickly they move their eyes, the eye tracker may not detect any differences between two viewers with different emotional states.[iii]

As such, a researcher interested in studying the emotional impact of a film should either choose a different measurement technique or combine eye tracking with another more suitable technique (Dyer and Pink, this issue). This does not mean that eye tracking is unsuitable for studying the cinematic experience. It simply means that you should always choose the right tool for the job and often this means combining multiple tools that are strong in different ways. As Murray Smith (the current President of the Society for Cognitive Studies of the Moving Image; SCSMI) has argued, a fully rounded investigation of the cinematic experience requires “triangulation” through the combination of multiple perspectives including psychological, neuroscientific and phenomenological/philosophical theory and methods (Smith, 2011) – an approach taken proudly across this special issue.

For the remainder of my commentary I would like to focus on certain themes that struck me as most personally relevant and interesting when reading the other articles in this special issue. This is by no means an exhaustive list of the themes raised by the other articles or even an assessment of the importance of the particular themes I chose to select. There are many other interesting observations made in the articles that I do not focus on below but, given my perspective as a cognitive scientist and my current interests, I decided to focus my commentary on these specific themes rather than make a comprehensive review of the special issue or tackle topics I am unqualified to comment on. Also, I wanted to take the opportunity to dispel some common misconceptions about eye tracking (see the section ‘Listening to the data’) and empirical methods in general.

Reading an image

One area of film cognition that has received considerable empirical investigation is subtitling. As Kruger, Szarkowska and Krejtz (this issue) so comprehensively review, eye tracking is the perfect tool for investigating how we watch subtitled films, a view I share. The presentation of subtitles divides the film viewing experience into a dual task: reading and watching. Given that the medium was originally designed to communicate critical information through two channels, the image and the soundtrack, introducing text as a third channel of communication places extra demands on the viewer’s visual system. However, for most competent readers serially shifting attention between these two tasks does not lead to difficulties in comprehension (Kruger, Szarkowska and Krejtz, this issue). Immediately following the presentation of the subtitles gaze will shift to the beginning of the text, saccade across the text and return to the centre of interest within a couple of seconds. Gaze heatmaps comparing the same scenes with and without subtitles (Kruger, Szarkowska and Krejtz, this issue; Fig. 3) show that the areas of the image fixated are very similar (ignoring the area of the screen occupied by the subtitles themselves) and, rather than distracting from the visual content, the presence of subtitles seems to actually condense the gaze behaviour on the areas of central interest in an image, e.g. faces and the centre of the image. This illustrates the redundancy of a lot of the visual information presented in films and the fact that under non-subtitle conditions viewers rarely explore the periphery of the image (Smith, 2013).

My colleague Anna Vilaró and I recently demonstrated this similarity in an eye tracking study in which the gaze behaviour of viewers was compared across versions of an animated film, Disney’s Bolt (Howard & Williams, 2008), shown in one of four conditions: the original English audio, a Spanish language version with English subtitles, an English language version with Spanish subtitles and a Spanish language version without subtitles (Vilaró & Smith, 2011). Given that our participants were English speakers who did not know Spanish, these conditions allowed us to investigate not only where they looked under the different audio and subtitle conditions but also what they comprehended. Using cued recall tests of memory for verbal and visual content we found no significant differences in recall for either type of content across the viewing conditions except for verbal recall in the Spanish-only condition (not surprisingly given that our English participants couldn’t understand the Spanish dialogue). Analysis of the gaze behaviour showed clear evidence of subtitle reading, even in the Spanish subtitle condition (see Figure 1) but no differences in the degree to which peripheral objects were explored. This indicates that even when participants are watching film sequences without subtitles and know that their memory will be tested for the visual content their gaze still remains focussed on central features of a traditionally composed film. This supports arguments for subtitling movies over dubbing as, whilst subtitling places greater demands on viewer gaze and imposes a heightened cognitive load, there is no evidence that it leads to poorer comprehension.

Figure 1: Figure from Vilaró & Smith (2011) showing the gaze behaviour of multiple viewers directed to own language subtitles (A) and foreign language/uninterpretable subtitles (B).


The high degree of attentional synchrony (Smith and Mital, 2013) observed in the above experiment and during most film sequences indicates that all visual features in the image and areas of semantic significance (e.g. social information and objects relevant to the narrative) tend to point to the same part of the image (Mital, Smith, Hill and Henderson, 2011). Only when areas of the image are placed in conflict through image composition (e.g. depth of field, lighting, colour or motion contrast) or staging (e.g. multiple actors) does attentional synchrony break down and viewer gaze divide between multiple locations. Such shots are relatively rare in mainstream Hollywood cinema or TV (Salt, 2009; Smith, 2013) and when used the depicted action tends to be highly choreographed so attention shifts between the multiple centres of image in a predictable fashion (Smith, 2012). If such choreographing of action is not used the viewer can quickly exhaust the information in the image and start craving either new action or a cut to a new shot.

Hochberg and Brooks (1978) referred to this as the visual momentum of the image: the pace at which visual information is acquired. This momentum is directly observable in the saccadic behaviour during an image’s presentation, with frequent short duration fixations at the beginning of a scene’s presentation interspersed by large amplitude saccades (known as the ambient phase of viewing; Velichkovsky, Dornhoefer, Pannasch and Unema, 2000) and less frequent, longer duration fixations separated by smaller amplitude saccades as the presentation duration increases (known as the focal phase of viewing; Velichkovsky et al., 2000). I have recently demonstrated the same pattern of fixations during viewing of dynamic scenes (Smith and Mital, 2013) and shown how this pattern gives rise to more central fixations at shot onset and greater exploration of the image and decreased attentional synchrony as the shot duration increases (Mital, Smith, Hill and Henderson, 2011). Interestingly, the introduction of subtitles to a movie may have the unintended consequence of sustaining visual momentum throughout a shot. The viewer is less likely to exhaust the information in the image because their eyes are busy saccading across the text to acquire the information that would otherwise be presented in parallel to the image via the soundtrack. This increased saccadic activity may increase the cognitive load experienced by viewers of subtitled films and change their affective experience, producing greater arousal and an increased sense of pace.

For some filmmakers and producers of dynamic visual media, increasing the visual momentum of an image sequence may be desirable as it maintains interest and attention on the screen (e.g. Michael Bay’s use of rapidly edited extreme close-ups and intense camera movements in the Transformers movies). In this modern age of multiple screens fighting for our attention when we are consuming moving images (e.g. mobile phones and computer screens in our living rooms and even, sadly, increasingly at the cinema), designers of this media who want our visual attention focussed on their screen rather than on competing screens need to design the visual display in a way that makes comprehension impossible without visual attention. Feature films and television dramas often rely heavily on dialogue for narrative communication, and the information communicated through the image may be of secondary narrative importance to the dialogue, so viewers can generally follow the story just by listening to the film rather than watching it. If producers of dynamic visual media are to draw visual attention back to the screen and away from secondary devices they need to increase the ratio of visual to verbal information. A simple way of accomplishing this is to present the critical audio information through subtitling. The more visually attentive mode of viewing afforded by watching subtitled film and TV may partly explain the growing interest in foreign TV series (at least in the UK), such as the Nordic Noir series The Bridge (2011) and The Killing (2007).

Another way of drawing attention back to the screen is to constantly “refresh” the visual content of the image by either increasing the editing rate or creatively using digital composition.[iv] The latter technique is wonderfully exploited by Sherlock (2010) as discussed brilliantly by Dwyer (this issue). Sherlock contemporised the detective techniques of Sherlock Holmes and John Watson by incorporating modern technologies such as the Internet and mobile phones and simultaneously updated the visual narrative techniques used to portray this information by using digital composition to playfully superimpose this information onto the photographic image. In a similar way to how the sudden appearance of traditional subtitles involuntarily captures visual attention and draws our eyes down to the start of the text, the digital inserts used in Sherlock overtly capture our eyes and encourage reading within the viewing of the image.

If Dwyer (this issue) had eye tracked viewers watching these excerpts she would likely have observed this interesting shifting between phases of reading and dynamic scene perception. Given that the appearance of the digital inserts produces sudden visual transients and that the inserts are highly incongruous with the visual features of the background scene, they are likely to involuntarily attract attention (Mital, Smith, Hill & Henderson, 2012). As such, they can be creatively used to reinvigorate the pace of viewing and strategically direct visual attention to parts of the image away from the screen centre. Traditionally, the same content may have been presented either verbally as narration, heavy-handed dialogue exposition (e.g. “Oh my! I have just received a text message stating….”) or as a slow and laboured cut to close-up of the actual mobile phone so we can read it from the perspective of the character. Neither takes full advantage of the communicative potential of the whole screen space or our ability to rapidly attend to and comprehend visual information and audio information in parallel.

Such intermixing of text, digital inserts and filmed footage is common in advertisements, music videos, and documentaries (see Figure 2) but is still surprisingly rare in mainstream Western film and TV. Short-form audiovisual messages have recently experienced a massive increase in popularity due to the internet and direct streaming to smartphones and mobile devices. To maximise their communicative potential and increase their likelihood of being “shared” these videos use all audiovisual tricks available to them. Text, animations, digital effects, audio and classic filmed footage all mix together on the screen, packing every frame with as much info as possible (Figure 2), essentially maximising the visual momentum of each video and maintaining interest for as long as possible.[v] Such videos are so effective at grabbing attention and delivering satisfying/entertaining/informative experiences in a short period of time that they often compete directly with TV and film for our attention. Once we click play, the audiovisual bombardment ensures that our attention remains latched on to the second screen (i.e., the tablet or smartphone) for its duration and away from the primary screen, i.e., the TV set. Whilst distressing for producers of TV and Film who wish our experience of their material to be undistracted, the ease with which we pick up a handheld device and seek other stimulation in parallel to the primary experience may indicate that the primary material does not require our full attention for us to follow what is going on. As attention has a natural ebb-and-flow (Cutting, DeLong and Nothelfer, 2010) and “There is no such thing as voluntary attention sustained for more than a few seconds at a time” (p. 421; James, 1890) if modern producers of Film and TV want to maintain a high level of audience attention and ensure it is directed to the screen they must either rely on viewer self-discipline to inhibit distraction, reward attention to the screen with rich and nuanced visual information (as fans of “slow cinema” would argue of films like those of Bela Tarr) or utilise the full range of postproduction effects to keep visual interest high and maintained on the image, as Sherlock so masterfully demonstrates.

Figure 2: Gaze Heatmaps of participants’ free-viewing a trailer for Lego Indiana Jones computer game (left column) and the Video Republic documentary (right column). Notice how both make copious use of text within the image, as intertitles and as extra sources of information in the image (such as the head-up display in A3). Data and images were taken from the Dynamic Images and Eye Movement project (DIEM; Mital, Smith, Hill & Henderson, 2010). Videos can be found here (http://vimeo.com/6628451) and here (http://vimeo.com/2883321).


A number of modern filmmakers are beginning to experiment with the language of visual storytelling by questioning our assumptions of how we perceive moving images. At the forefront of this movement are Ang Lee and Andy and Lana Wachowski. In Ang Lee’s Hulk (2003), Lee worked very closely with editor Tim Squyres to use non-linear digital editing and after effects to break apart the traditional frame and shot boundaries and create an approximation of a comic book style within film. This chaotic unpredictable style polarised viewers and was partly blamed for the film’s poor reception. However, it cannot be argued that this experiment was wholly unsuccessful. Several sequences within the film used multiple frames, split screens, and digital transformation of images to increase the number of centres of interest on the screen and, as a consequence, increase the pace of viewing and the arousal experienced by viewers. In the sequence depicted below (Figure 3) two parallel scenes depicting Hulk’s escape from a containment chamber (A1) and this action being watched from a control room by General Ross (B1) were shown simultaneously by presenting elements of both scenes on the screen at the same time. Instead of using a point of view (POV) shot to show Ross looking off screen (known as the glance shot; Branigan, 1984) followed by a cut to what he was looking at (the object shot) both shots were combined into one image (F1 and F2) with the latter shot sliding in from behind Ross’ head (E2). These digital inserts float within the frame, often gliding behind objects or suddenly enlarging to fill the screen (A2-B2). Such visual activity and use of shots-within-shots makes viewer gaze highly active (notice how the gaze heatmap is rarely clustered in one place; Figure 3). Note that this method of embedding a POV object shot within a glance shot is similar to Sherlock’s method of displaying text messages as both the glance, i.e., Watson looking at his phone, and the object, i.e., the message, are shown in one image. Both uses take full advantage of our ability to rapidly switch from watching action to reading text without having to wait for a cut to give us the information.

Figure 3: Gaze heatmap of eight participants watching a series of shots and digital inserts from Hulk (Ang Lee, 2003). Full heatmap video is available at http://youtu.be/tErdurgN8Yg.


Similar techniques have been used in Andy and Lana Wachowski’s films, most audaciously in Speed Racer (2008). Interestingly, both sets of filmmakers seem to intuitively understand that packing an image with as much visual and textual information as possible can lead to viewer fatigue and so they limit such intense periods to only a few minutes and separate them with more traditionally composed sequences (typically shot/reverse-shot dialogue sequences). These filmmakers have also demonstrated similar respect for viewer attention and the difficulty in actively locating and encoding visual information in a complex visual composition in their more recent 3D movies. Ang Lee’s Life of Pi (2012) uses the visual volume created by stereoscopic presentation to its full potential. Characters inhabit layers within the volume as foreground and background objects fluidly slide around each other within this space. The lessons Lee and his editor Tim Squyres learned on Hulk (2003) clearly informed the decisions they made when tackling their first 3D film and allowed them to avoid some of the issues most 3D films experience such as eye strain, sudden unexpected shifts in depth and an inability to ensure viewers are attending to the part of the image easiest to fuse across the two eye images (Banks, Read, Allison & Watt, 2012).

Watching Audio

I now turn to another topic featured in this special issue, the influence of audio on gaze (Robinson, Stadler and Rassell, this issue). Film and TV are inherently multimodal. Both media have always existed as a combination of visual and audio information. Even early silent film was almost always presented with either live musical accompaniment or a narrator. As such, the relative lack of empirical investigation into how the combination of audio and visual input influences how we perceive movies and, specifically how we attend to them is surprising. Robinson, Stadler and Rassell (this issue) have attempted to address this omission by comparing eye movements for participants either watching the original version of the Omaha beach sequence from Steven Spielberg’s Saving Private Ryan (1998) or the same sequence with the sound removed. This film sequence is a great choice for investigating AV influences on viewer experience as the intensity of the action, the hand-held cinematography and the immersive soundscape all work together to create a disorientating embodied experience for the viewer. The authors could have approached this question by simply showing a set of participants the sequence with audio and qualitatively describing the gaze behaviour at interesting AV moments during the sequence. Such description of the data would have served as inspiration for further investigation but in itself can’t say anything about the causal contribution of audio to this behaviour as there would be nothing to compare the behaviour to. Thankfully, the authors avoided this problem by choosing to manipulate the audio.

In order to identify the causal contribution of any factor you need to design an experiment in which that factor (known as the Independent Variable) is either removed or manipulated and the impact of this manipulation on the behaviour of interest (known as the Dependent Variable) is tested for significance using appropriate inferential statistics. I commend Robinson, Stadler and Rassell’s experimental design as they present such a manipulation and are therefore able to produce data that will allow them to test their hypotheses about the causal impact of audio on viewer gaze behaviour. Several other papers in this special issue (Redmond, Sita and Vincs; Batty, Perkins and Sita) discuss gaze data (typically in the form of scanpaths or heatmaps) from one viewing condition without quantifying its difference to another viewing condition. As such, they are only able to describe the gaze data, not use it to test hypotheses. There is always a temptation to attribute too much meaning to a gaze heatmap (I too am guilty of this; Smith, 2013) due to their seeming intuitive nature (i.e., they looked here and not there) but, as with all psychological measures, they are only as good as the experimental design within which they are employed.[vi]
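
To make the logic of manipulating an Independent Variable and testing its effect on a Dependent Variable concrete, the following is a minimal Python sketch of an exact permutation test on a difference in group means. The function name, the between-participants layout and the example dispersal scores are all hypothetical and are not taken from any study in this special issue; it is only one of many inferential tests that might be appropriate.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means.

    group_a / group_b: one score per participant (e.g. a gaze dispersal
    score) from two viewing conditions. Hypothetical helper for illustration.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))

    extreme = 0
    total = 0
    # Enumerate every way of relabelling participants into the two conditions.
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        relabelled_a = [pooled[i] for i in chosen]
        relabelled_b = [pooled[i] for i in indices if i not in chosen_set]
        difference = abs(mean(relabelled_a) - mean(relabelled_b))
        # Small tolerance guards against floating point rounding.
        if difference >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


if __name__ == "__main__":
    audio = [1.0, 2.0]  # hypothetical dispersal scores, audio condition
    mute = [3.0, 4.0]   # hypothetical dispersal scores, mute condition
    print(permutation_p_value(audio, mute))
```

With only two viewers per condition the smallest p-value this test can return is 1/3, however large the observed difference, which illustrates why very small samples rarely yield significant results.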

Qualitative interpretation of individual fixation locations, scanpaths or group heatmaps is useful for informing initial interpretation of which visual details are most likely to make it into later visual processing (e.g. perception, encoding and long term memory representations) but care has to be taken not to falsely assume that fixation equals awareness (Smith, Lamont and Henderson, 2012). Also, the visual form of gaze heatmaps varies widely depending on how many participants contribute to the heatmap, which parameters you choose to generate the heatmaps and which oculomotor measures the heatmap represents (Holmqvist, et al., 2011). For example, I have demonstrated that, unlike during reading, visual encoding during scene perception requires over 150ms during each fixation (Rayner, Smith, Malcolm and Henderson, 2009). This means that if fixations with durations less than 150ms are included in a heatmap it may suggest parts of the image have been processed which in actual fact were fixated too briefly to be processed adequately. Similarly, heatmaps representing fixation duration instead of just fixation location have been shown to be a better representation of visual processing (Henderson, 2003). Heatmaps have an immediate allure but care has to be taken about imposing too much meaning on them, especially when the gaze and the image are changing over time (see Smith and Mital, 2013; and Sawahata et al, 2008 for further discussion). As eye tracking hardware becomes more available to researchers from across a range of disciplines we need to work harder to ensure that it is not used inappropriately and that the conclusions that are drawn from eye tracking data are theoretically and statistically motivated (see Rayner, 1998; and Holmqvist et al, 2013 for clear guidance on how to conduct sound eye tracking studies).
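
The two heatmap choices discussed above, excluding very brief fixations and weighting by fixation duration rather than by location alone, can be sketched in a few lines of Python. The data structure, function names and the coarse grid binning are assumptions made for illustration (real heatmaps are normally smoothed, for example with a Gaussian kernel); only the 150ms threshold comes from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # horizontal position in pixels
    y: float            # vertical position in pixels
    duration_ms: float  # fixation duration in milliseconds


def filter_fixations(fixations, min_duration_ms=150.0):
    """Drop fixations shorter than the assumed minimum encoding time."""
    return [f for f in fixations if f.duration_ms >= min_duration_ms]


def duration_heatmap(fixations, width, height, cell_px=100):
    """Sum fixation durations in each cell of a coarse grid over the frame.

    Returns a list of rows; grid[row][col] holds total milliseconds fixated.
    Fixations falling outside the frame are ignored.
    """
    cols = math.ceil(width / cell_px)
    rows = math.ceil(height / cell_px)
    grid = [[0.0] * cols for _ in range(rows)]
    for f in fixations:
        if 0 <= f.x < width and 0 <= f.y < height:
            grid[int(f.y // cell_px)][int(f.x // cell_px)] += f.duration_ms
    return grid


if __name__ == "__main__":
    raw = [Fixation(50, 50, 100), Fixation(50, 50, 200), Fixation(250, 150, 300)]
    kept = filter_fixations(raw)
    for row in duration_heatmap(kept, width=300, height=200):
        print(row)
```

Comparing the grid built from the raw fixations with the grid built from the filtered ones shows how much of an apparent hotspot is made up of fixations that were probably too brief for adequate encoding.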

Given that Robinson, Stadler and Rassell (this issue) manipulated the critical factor, i.e., the presence of audio, the question now is whether their study tells us anything new about the AV influences on gaze during film viewing. To examine the influence of audio they chose two traditional methods for expressing the gaze data: area of interest (AOI) analysis and dispersal. By using nine static (relative to the screen) AOIs they were able to quantify how much time the gaze spent in each AOI and utilise this measure to work out how distributed gaze was across all AOIs. Using these measures they reported a trend towards greater dispersal in the mute condition compared to the audio condition and a small number of significant differences in the amount of time spent in some regions across the audio conditions.

However, the conclusions we can draw from these findings are seriously hindered by the low sample size (only four participants were tested, meaning that any statistical test is unlikely to reveal significant differences) and the static AOIs that did not move with the image content. By locking the AOIs to static screen coordinates their AOI measures express the deviation of gaze relative to these coordinates, not to the image content. This approach can be informative for quantifying gaze exploration away from the screen centre (Mital, Smith, Hill and Henderson, 2011) but in order to draw conclusions about what was being fixated the gaze needs to be quantified relative to dynamic AOIs that track objects of interest on the screen (see Smith and Mital, 2013). For example, their question about whether we fixate a speaker’s mouth more in scenes where the speech is difficult to make out due to background noise (i.e., their “Indistinct Dialogue” scene) has previously been investigated in studies that have manipulated the presence of audio (Võ, Smith, Mital and Henderson, 2012) or the level of background noise (Buchan, Paré and Munhall, 2007) and measured gaze to dynamic mouth regions. As Robinson, Stadler and Rassell correctly predicted, lip reading increases as speech becomes less distinct or the listener’s linguistic competence in the spoken language decreases (see Võ et al, 2012 for review).
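
The difference between static and dynamic AOIs can be illustrated with a short Python sketch. It assumes one gaze sample and one rectangular AOI per video frame; the function name and the toy coordinates are hypothetical, and the per-frame bounding boxes would in practice come from manual annotation or object tracking.

```python
def dwell_proportion(gaze_per_frame, aoi_per_frame):
    """Proportion of valid gaze samples that fall inside the AOI.

    gaze_per_frame: list of (x, y) tuples, or None where tracking was lost.
    aoi_per_frame:  list of (left, top, right, bottom) boxes, one per frame.
    A static AOI is simply the same box repeated for every frame.
    """
    valid = 0
    hits = 0
    for sample, box in zip(gaze_per_frame, aoi_per_frame):
        if sample is None:
            continue  # skip frames with missing gaze data
        valid += 1
        x, y = sample
        left, top, right, bottom = box
        if left <= x <= right and top <= y <= bottom:
            hits += 1
    return hits / valid if valid else 0.0


if __name__ == "__main__":
    # Toy example: gaze follows an object that moves rightwards across frames.
    gaze = [(10, 10), (60, 10), (110, 10), None]
    moving_box = [(0, 0, 20, 20), (50, 0, 70, 20), (100, 0, 120, 20), (150, 0, 170, 20)]
    static_box = [(0, 0, 20, 20)] * len(gaze)
    print("dynamic AOI:", dwell_proportion(gaze, moving_box))
    print("static AOI: ", dwell_proportion(gaze, static_box))
```

In this toy case the viewer tracks the moving object on every valid frame, yet the static AOI credits the object with only a third of the gaze samples, which is the kind of mismatch between screen coordinates and image content described above.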

Similarly, by measuring gaze dispersal using a limited number of static AOIs they are losing considerable nuance in the gaze data and have to resort to qualitative description of unintuitive bar charts (figure 4). There exist several methods for quantifying gaze dispersal (see Smith and Mital, 2013, for review) and even open-source tools for calculating this measure and comparing dispersal across groups (Le Meur and Baccino, 2013). Some methods are as easy, if not easier to calculate than the static AOIs used in the present study. For example, the Euclidean distance between the screen centre and the x/y gaze coordinates at each frame of the movie provides a rough measure of how spread out the gaze is from the screen centre (typically the default viewing location; Mital et al, 2011) and a similar calculation can be performed between the gaze position of all participants within a viewing condition to get a measure of group dispersal.
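
As a rough illustration of the two calculations just described, the following Python sketch computes the Euclidean distance of a gaze point from the screen centre and a simple group dispersal score for one frame. Taking group dispersal as the mean pairwise distance between viewers is an assumption made here for illustration (several other definitions exist), and the function names are hypothetical.

```python
import math
from itertools import combinations


def distance_to_centre(point, screen_w, screen_h):
    """Euclidean distance (pixels) between a gaze point and the screen centre."""
    centre = (screen_w / 2, screen_h / 2)
    return math.dist(point, centre)


def group_dispersal(points):
    """Mean pairwise Euclidean distance between all viewers' gaze points
    on a single frame. Returns 0.0 when fewer than two points are given."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    frame_gaze = [(320, 240), (330, 250), (600, 100)]  # one (x, y) per viewer
    print([distance_to_centre(p, 640, 480) for p in frame_gaze])
    print(group_dispersal(frame_gaze))
```

Running either function on every frame of a sequence gives a time series that can be averaged within each viewing condition and then compared across conditions.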

Using such measures, Coutrot and colleagues (2012) showed that gaze dispersal is greater when you remove audio from dialogue film sequences and they have also observed shorter amplitude saccades and marginally shorter fixation durations. However, I have recently shown that a non-dialogue sequence from Sergei Eisenstein’s Alexander Nevsky (1938) does not show significant differences in eye movement metrics when the accompanying music is removed (Smith, 2014). This difference in findings points towards interesting differences in the impact that diegetic (within the depicted scene, e.g. dialogue) and non-diegetic (outside of the depicted scene, e.g. the musical score) sound may have on gaze guidance. It also highlights how some cinematic features may have a greater impact on aspects of a viewer’s experience other than those measurable by eye tracking, such as physiological markers of arousal and emotional states. This is also the conclusion that Robinson, Stadler and Rassell come to.

Listening to the Data (aka, What is Eye Tracking Good For?)

The methodological concerns I have raised in the previous section lead nicely to the article by William Brown, entitled There’s no I in Eye Tracking: How useful is Eye Tracking to Film Studies (this issue). I have known William Brown for several years through our attendance at the Society for Cognitive Studies of the Moving Image (SCSMI) annual conference and I have a deep respect for his philosophical approach to film and his ability to incorporate empirical findings from the cognitive neurosciences, including some references to my own work, into his theories. Therefore, it comes somewhat as a surprise that his article openly attacks the application of eye tracking to film studies. However, I welcome Brown’s criticisms as they provide me with an opportunity to address some general assumptions about the scientific investigation of film and hopefully suggest future directions in which eye tracking research can avoid falling into some of the pitfalls Brown identifies.

Brown’s main criticisms of current eye tracking research are: 1) eye tracking studies neglect “marginal” viewers or marginal ways of watching movies; 2) studies so far have neglected “marginal” films; 3) they only provide “truisms”, i.e., already known facts; and 4) they have an implicit political agenda to argue that the only “true” way to study film is a scientific approach and the “best” way to make a film is to ensure homogeneity of viewer experience. I will address these criticisms in turn but before I do so I would like to state that many of Brown’s arguments could be recast as arguments against science in general and are built upon a misunderstanding of how scientific studies should be conducted and what they mean.

To respond to Brown’s first criticism that eye tracking “has up until now been limited somewhat by its emphasis on statistical significance – or, put simply, by its emphasis on telling us what most viewers look at when they watch films” (Brown, this issue; 1), I first have to subdivide the criticism into ‘the search for significance’ and ‘attentional synchrony’, i.e., how similar gaze is across viewers (Smith and Mital, 2013). Brown tells an anecdote about a Dutch film scholar whose data had to be excluded from an eye tracking study because they did not look where the experimenter wanted them to look. I wholeheartedly agree with Brown that this sounds like a bad study as data should never be excluded for subjective reasons such as not supporting the hypothesis, i.e., looking as predicted. However, exclusion due to statistical reasons is valid if the research question being tested relates to how representative the behaviour of a small set of participants (known as the sample) is of the overall population. To explain when such a decision is valid and to respond to Brown’s criticism about only ‘searching for significance’ I will first need to provide a brief overview of how empirical eye tracking studies are designed and why significance testing is important.

For example, if we were interested in the impact sound had on the probability of fixating an actor’s mouth (e.g., Robinson, Stadler and Rassell, this issue) we would need to compare the gaze behaviour of a sample of participants who watched a sequence with the sound turned on to a sample who watched it with the sound turned off. By comparing the behaviour between these two groups using inferential statistics we are testing the likelihood that these two viewing conditions would differ in a population of all viewers given the variation within and between these two groups. In actual fact we do this by performing the opposite test: we assume that the two groups belong to a single statistically indistinguishable group. This assumption is known as the null hypothesis. By showing that there is less than a 5% chance of observing a difference at least this large if the null hypothesis were true, we can reject the null hypothesis and conclude that the difference is statistically significant, which suggests that another sample of participants presented with the same two viewing conditions would be likely to show similar differences in viewing behaviour.
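The logic of testing against the null hypothesis can be sketched with a simple permutation test in Python. This is only one of several possible inferential tests and is chosen here because it needs nothing beyond the standard library; it is not the test used in any of the studies discussed. The function name and the per-participant scores are invented for illustration and make no empirical claim.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference between two group means.

    Returns the estimated probability of a difference at least as large as
    the observed one if group labels were arbitrary (the null hypothesis).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign participants to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up proportions of time spent fixating a mouth AOI, one value per participant.
sound_on = [0.10, 0.12, 0.08, 0.11, 0.09, 0.13, 0.10, 0.07]
sound_off = [0.30, 0.28, 0.33, 0.27, 0.31, 0.29, 0.35, 0.26]
p_value = permutation_test(sound_on, sound_off)
significant = p_value < 0.05
```

With only four participants in total, as in the study criticised above, very few distinct relabellings of the data exist, which is one way of seeing why such a small sample is unlikely to yield a significant result.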

In order to test whether our two viewing conditions belong to one or two distributions we need to be able to express this distribution. This is typically done by identifying the mean score for each participant on the dependent variable of interest, in this case the probability of fixating a dynamic mouth AOI, then calculating the mean for this measure across all participants within a group and their variation in scores (known as the standard deviation). Most natural measures produce a distribution of scores looking somewhat like a bell curve (known as the normal distribution) with most observations near the centre of the distribution and an ever decreasing number of observations as you move away from this central score. Each observation (in our case, participants) can be expressed relative to this distribution by subtracting the mean of the distribution from its score and dividing by the standard deviation. This converts a raw score into a normalized or z-score. Roughly ninety-five percent of all observations will fall within two standard deviations of the mean for normally distributed data. This means that observations with an absolute z-score greater than two are highly unrepresentative of that distribution and may be considered outliers.
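The conversion from raw scores to z-scores can be written in a few lines of Python. The sketch below uses the sample standard deviation and a threshold of two, and it only flags candidate outliers rather than removing them; the function names and the scores are hypothetical.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Express each score as a number of standard deviations from the group mean."""
    group_mean = mean(scores)
    group_sd = stdev(scores)  # sample standard deviation
    return [(score - group_mean) / group_sd for score in scores]

def flag_outliers(scores, threshold=2.0):
    """Return the indices of participants whose absolute z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(scores)) if abs(z) > threshold]

# Made-up per-participant probabilities of fixating a mouth AOI.
mouth_fixation = [0.20, 0.22, 0.19, 0.21, 0.20, 0.23, 0.18, 0.21, 0.20, 0.60]
candidates = flag_outliers(mouth_fixation)  # flags only the last participant
```

Note that the extreme score itself inflates the group mean and standard deviation, so in small samples a genuinely unusual participant can pull the distribution towards themselves and appear less extreme than they are.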

However, being unrepresentative of the group mean is insufficient motivation to exclude a participant. The outlier still belongs to the group distribution and should be included unless there is a supporting reason for exclusion such as measurement error, e.g. poor calibration of the eye tracker. If an extreme outlier is not excluded it can often have a disproportionate impact on the group mean and make statistical comparison of groups difficult. However, if this is the case it suggests that the sample size is too small and not representative of the overall population. Correct choice of sample size given an estimate of the predicted effect size combined with minimising measurement error should mean that subjective decisions do not have to be made about whose data is “right” and who should be included or excluded.
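One common way of choosing a sample size from a predicted effect size is a power calculation. The Python sketch below uses the standard normal approximation for comparing two independent group means, with the effect size expressed as Cohen's d. The function name, the default significance level of 0.05 and the default power of 0.80 are assumptions made for this example, and an exact calculation based on the t distribution gives slightly larger numbers.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)

# Conventional "medium" and "large" effect sizes (Cohen's d of 0.5 and 0.8).
n_medium = sample_size_per_group(0.5)  # 63 per group
n_large = sample_size_per_group(0.8)   # 25 per group
```

Under these assumptions even a large effect calls for about two dozen participants per group.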

Brown also believes that eye tracking research has so far marginalised viewers who have atypical ways of watching film, such as film scholars, either by not studying them or by treating them as statistical outliers and excluding them from analyses. However, I would argue that the only way to know if their way of watching a film is atypical is to first map out the distribution of how viewers typically watch films. If a viewer attended more to the screen edge than the majority of other viewers in a random sample of the population (as was the case with Brown’s film scholar colleague) this should show up as a large z-score when their gaze data is expressed relative to the group on a suitable measure such as Euclidean distance from the screen centre. Similarly, a non-native speaker of English may have appeared as an outlier in terms of how much time they spent looking at the speaker’s mouth in Robinson, Stadler and Rassell’s (this issue) study. Such idiosyncrasies may be of interest to researchers and there are statistical methods for expressing emergent groupings within the data (e.g. cluster analysis) or seeing whether group membership predicts behaviour (e.g. regression). These approaches may not have previously been applied to questions of film viewing but this is simply due to the immaturity of the field and the limited availability of the equipment or expertise to conduct such studies.

In my own recent work I have shown how viewing task influences how we watch unedited video clips (Smith and Mital, 2013), how infants watch TV (Wass and Smith, in press), how infant gaze differs to adult gaze (Smith, Dekker, Mital, Saez De Urabain and Karmiloff-Smith, in prep) and even how film scholars attend to and remember a short film compared to non-expert film viewers (Smith and Smith, in prep). Such group viewing differences are of great interest to me and I hope these studies illustrate how eye tracking has a lot to offer to such research questions if the right statistics and experimental designs are employed.

Brown’s second main criticism is that the field of eye tracking neglects “marginal” films. I agree that the majority of films that have so far been used in eye tracking studies could be considered mainstream. For example, the film/TV clips used in this special issue include Sherlock (2010), Up (2009) and Saving Private Ryan (1998). However, this limit is simply a sign of how few eye tracking studies of moving images there have been. All research areas take time to fully explore the range of possible research questions within that area.

I have always employed a range of films from diverse film traditions, cultures, and languages. My first published eye tracking study (Smith and Henderson, 2008) used film clips from Citizen Kane (1941), Dogville (2003), October (1928), Requiem for a Dream (2000), Dancer in the Dark (2000), Koyaanisqatsi (1982) and Blade Runner (1982). Several of these films may be considered “marginal” relative to the mainstream. If I have chosen to focus most of my analyses on mainstream Hollywood cinema this is only because they were the most suitable exemplars of the phenomena I was investigating such as continuity editing and its creation of a universal pattern of viewing (Smith, 2006; 2012). This interest is not because, as Brown argues, I have a hidden political agenda or an implicit belief that this style of filmmaking is the “right” way to make films. I am interested in this style because it is the dominant style and, as a cognitive scientist I wish to use film as a way of understanding how most people process audiovisual dynamic scenes.

Hollywood film stands as a wonderfully rich example of what filmmakers think “fits” human cognition. By testing filmmaker intuitions and seeing what impact particular compositional decisions have on viewer eye movements and behavioural responses I hope to gain greater insight into how audiovisual perception operates in non-mediated situations (Smith, Levin and Cutting, 2012). But, just as a neuropsychologist can learn about typical brain function by studying patients with pathologies such as lesions and strokes, I can also learn about how we perceive a “typical” film by studying how we watch experimental or innovative films. My previous work is testament to this interest (Smith, 2006; 2012a; 2012b; 2014; Smith & Henderson, 2008) and I hope to continue finding intriguing films to study and further my understanding of film cognition.

One practical reason why eye tracking studies rarely use foreign language films is the presence of subtitles. As has been comprehensively demonstrated by other authors in this special issue (Kruger, Szarkowska and Krejtz, this issue) and earlier in this article, the sudden appearance of text on the screen, even if it is incomprehensible, leads to differences in eye movement behaviour. This invalidates the use of eye tracking as a way to measure how the filmmaker intended to shape viewer attention and perception. The alternatives would be to either use silent film (an approach I employed with October; Smith and Henderson, 2008), remove the audio (which changes gaze behaviour and awareness of editing; Smith & Martin-Portugues Santacreu, under review) or use dubbing (which can bias the gaze down to the poorly synched lips; Smith, Batten, and Bedford, 2014). None of these options are ideal for investigating foreign language sound film and until there is a suitable methodological solution this will restrict eye tracking studies to experimental films in a participant’s native language.

Finally, I would like to counter Brown’s assertion that eye tracking investigations of film have so far only generated “truisms”. I admit that there is often a temptation to reduce empirical findings to simplified take-home messages that only seem to confirm previous intuitions such as a bias of gaze towards the screen centre, towards speaking faces, moving objects or subtitles. However, I would argue that such messages fail to appreciate the nuance in the data. Empirical data correctly measured and analysed can provide subtle insights into a phenomenon that subjective introspection could never supply.

For example, film editors believe that an impression of continuous action can be created across a cut by overlapping somewhere between two (Anderson, 1996) and four frames (Dmytryk, 1986) of the action. However, psychological investigations of time perception revealed that our judgements of duration depend on how attention is allocated during the estimated period (Zakay and Block, 1996) and will vary depending on whether our eyes remain still or saccade during the period (Yarrow et al, 2001). In my thesis (Smith, 2006) I used simplified film stimuli to investigate the role that visual attention played in estimation of temporal continuity across a cut and found that participants experienced an overlap of 58.44ms as continuous when an unexpected cut occurred during fixation and an omission of 43.63ms as continuous when they performed a saccade in response to the cut. As different cuts may result in different degrees of overt (i.e., eye movements) and covert attentional shifts these empirical findings both support editor intuitions that temporal continuity varies between cuts (Dmytryk, 1986) whilst also explaining the factors that are important in influencing time perception at a level of precision not possible through introspection.
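To set the editors' frame counts alongside the millisecond estimates, the two units can be converted into one another. The short Python sketch below assumes a projection rate of 24 frames per second, which is a conventional figure for film and not a value stated in the studies cited; the function names are likewise hypothetical.

```python
def frames_to_ms(frames, fps=24.0):
    """Duration in milliseconds of a number of frames at an assumed frame rate."""
    return frames * 1000.0 / fps

def ms_to_frames(ms, fps=24.0):
    """Number of frames spanned by a duration in milliseconds at an assumed frame rate."""
    return ms * fps / 1000.0

# Editors' rule of thumb, in milliseconds (assuming 24 fps).
two_frames_ms = frames_to_ms(2)
four_frames_ms = frames_to_ms(4)

# Empirical estimates from the text, in frames (assuming 24 fps).
overlap_frames = ms_to_frames(58.44)
omission_frames = ms_to_frames(43.63)
```

At a different frame rate the same millisecond values would correspond to a different number of frames, so any such comparison depends on the rate assumed.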

Reflecting on our own experience of a film suffers from the fact that it relies on our own senses and cognitive abilities to identify, interpret and express what we experience. I may feel that my experience of a dialogue sequence from Antichrist (2010) differs radically from a similar sequence from Secrets & Lies (1996) but I would be unable to attribute these differences to different aspects of the two scenes without quantifying both the cinematic features and my responses to them. Without isolating individual features I cannot know their causal contribution to my experience. Was it the rapid camera movements in Antichrist, the temporally incongruous editing, the emotionally extreme dialogue or the combination of these features that made me feel so unsettled whilst watching the scene? If one is not interested in understanding the causal contributions of each cinematic decision to an audience member’s response then one may be content with informed introspection and not find empirical hypothesis testing the right method. I make no judgement about the validity of either approach as long as each researcher understands the limits of their approach.

Introspection utilises the imprecise measurement tool that is the human brain and is therefore subject to distortion, human bias and an inability to extrapolate the subjective experience of one person to another. Empirical hypothesis testing also has its limitations: research questions have to be clearly formulated so that hypotheses can be stated in a way that allows them to be statistically tested using appropriate observable and reliable measurements. A failure at any of these stages can invalidate the conclusions that can be drawn from the data. For example, an eye tracker may be poorly calibrated resulting in an inaccurate record of where somebody was looking or it could be used to test an ill-formed hypothesis such as how a particular film sequence caused attentional synchrony without having another film sequence to compare the gaze data to. Each approach has its strengths and weaknesses and no single approach should be considered “better” than any other, just as no film should be considered “better” than any other film.

Conclusion

The articles collected here constitute the first attempt to bring together interdisciplinary perspectives on the application of eye tracking to film studies. I fully commend the intention of this special issue and hope that it encourages future researchers to conduct further studies using these methods to investigate research questions and film experiences we have not even conceived of. However, given that the recent release of low-cost eye tracking peripherals such as the EyeTribe[vii] tracker and the Tobii EyeX[viii] has moved eye tracking from a niche and highly expensive research tool to an accessible option for researchers in a range of disciplines, I need to take this opportunity to issue a word of warning. As I have outlined in this article, eye tracking is like any other research tool in that it is only useful if used correctly, its limitations are respected, its data is interpreted through the appropriate application of statistics and conclusions are only drawn that are based on the data in combination with a sound theoretical base. Eye tracking is not the “saviour” of film studies, nor is science the only “valid” way to investigate somebody’s experience of a film. Hopefully, the articles in this special issue and the ideas I have put forward here suggest how eye tracking can function within an interdisciplinary approach to film analysis that furthers our appreciation of film in previously unfathomed ways.

 

Acknowledgements

Thanks to Rachael Bedford, Sean Redmond and Craig Batty for comments on earlier drafts of this article. Thank you to John Henderson, Parag Mital and Robin Hill for help in gathering and visualising the eye movement data used in the Figures presented here. Their work was part of the DIEM Leverhulme Trust funded project (https://thediemproject.wordpress.com/). The author, Tim Smith, is funded by EPSRC (EP/K012428/1), Leverhulme Trust (PLP-2013-028) and BIAL Foundation grant (224/12).

 

References

Anderson, Joseph. 1996. The Reality of Illusion: An Ecological Approach to Cognitive Film Theory. Southern Illinois University Press.

Batty, Craig, Claire Perkins and Jodi Sita. 2015. “How We Came To Eye Tracking Animation: A Cross-Disciplinary Approach to Researching the Moving Image”, Refractory: a Journal of Entertainment Media, 25.

Banks, Martin S., Jenny R. Read, Robert S. Allison and Simon J. Watt. 2012. “Stereoscopy and the human visual system.” SMPTE Mot. Imag. J., 121 (4), 24-43

Bradley, Margaret M., Laura Miccoli, Miguel A. Escrig and Peter J. Lang. 2008. “The pupil as a measure of emotional arousal and autonomic activation.” Psychophysiology, 45(4), 602-607.

Branigan, Edward R. 1984. Point of View in the Cinema: A Theory of Narration and Subjectivity in Classical Film. Berlin: Mouton.

Brown, William. 2015. “There’s no I in Eye Tracking: How Useful is Eye Tracking to Film Studies?”, Refractory: a Journal of Entertainment Media, 25.

Buchan, Julie N., Martin Paré and Kevin G. Munhall. 2007. “Spatial statistics of gaze fixations during dynamic face processing.” Social Neuroscience, 2, 1–13.

Coutrot, Antoine, Nathalie Guyader, Gelu Ionescu and Alice Caplier. 2012. “Influence of Soundtrack on Eye Movements During Video Exploration”, Journal of Eye Movement Research 5, no. 4.2: 1-10.

Cutting, James. E., Jordan E. DeLong and Christine E. Nothelfer. 2010. “Attention and the evolution of Hollywood film.” Psychological Science, 21, 440-447.

Dwyer, Tessa. 2015. “From Subtitles to SMS: Eye Tracking, Texting and Sherlock”, Refractory: a Journal of Entertainment Media, 25.

Dyer, Adrian. G and Sarah Pink. 2015. “Movement, attention and movies: the possibilities and limitations of eye tracking?”, Refractory: a Journal of Entertainment Media, 25.

Dmytryk, Edward. 1986. On Filmmaking. London, UK: Focal Press.

Henderson, John. M., 2003. “Human gaze control during real-world scene perception.” Trends in Cognitive Sciences, 7, 498-504.

Hochberg, Julian and Virginia Brooks. 1978. “Film Cutting and Visual Momentum”. In John W. Senders, Dennis F. Fisher and Richard A. Monty (Eds.), Eye Movements and the Higher Psychological Functions (pp. 293-317). Hillsdale, NJ: Lawrence Erlbaum.

Holmqvist, Kenneth, Marcus Nyström, Richard Andersson, Richard Dewhurst, Halszka Jarodzka and Joost van de Weijer. 2011. Eye Tracking: A comprehensive guide to methods and measures. Oxford, UK: Oxford University Press.

James, William. 1890. The principles of psychology (Vol.1). New York: Holt

Kruger, Jan Louis, Agnieszka Szarkowska and Izabela Krejtz. 2015. “Subtitles on the Moving Image: An Overview of Eye Tracking Studies”, Refractory: a Journal of Entertainment Media, 25.

Le Meur, Olivier and Baccino, Thierry. 2013. “Methods for comparing scanpaths and saliency maps: strengths and weaknesses.” Behavior research methods, 45(1), 251-266.

Magliano, Joseph P. and Jeffrey M. Zacks. 2011. “The Impact of Continuity Editing in Narrative Film on Event Segmentation.” Cognitive Science, 35(8), 1-29.

Mital, Parag K., Tim J. Smith, Robin Hill and John M. Henderson. 2011. “Clustering of gaze during dynamic scene viewing is predicted by motion.” Cognitive Computation, 3(1), 5-24.

Rayner, Keith. 1998. “Eye movements in reading and information processing: 20 years of research”. Psychological Bulletin, 124(3), 372-422.

Rayner, Keith, Tim J. Smith, George Malcolm and John M. Henderson. 2009. “Eye movements and visual encoding during scene perception.” Psychological Science, 20, 6-10.

Raz, Gal, Yael Jacob, Tal Gonen, Yonatan Winetraub, Tamar Flash, Eyal Soreq and Talma Hendler. 2014. “Cry for her or cry with her: context-dependent dissociation of two modes of cinematic empathy reflected in network cohesion dynamics.” Social cognitive and affective neuroscience, 9(1), 30-38.

Redmond, Sean, Jodi Sita and Kim Vincs. 2015. “Our Sherlockian Eyes: the Surveillance of Vision”, Refractory: a Journal of Entertainment Media, 25.

Robinson, Jennifer, Jane Stadler and Andrea Rassell. 2015. “Sound and Sight: An Exploratory Look at Saving Private Ryan through the Eye-tracking Lens”, Refractory: a Journal of Entertainment Media, 25.

Salt, Barry. 2009. Film Style and Technology: History and Analysis (3rd ed.). Totton, Hampshire, UK: Starword.

Sawahata, Yasuhito, Rajiv Khosla, Kazuteru Komine, Nobuyuki Hiruma, Takayuki Itou, Seiji Watanabe, Yuji Suzuki, Yumiko Hara and Nobuo Issiki. 2008. “Determining comprehension and quality of TV programs using eye-gaze tracking.” Pattern Recognition, 41(5), 1610-1626.

Smith, Murray. 2011. “Triangulating Aesthetic Experience”, paper presented at the annual Society for Cognitive Studies of the Moving Image conference, Budapest, June 8–11, 2011.

Smith, Tim J. 2006. An Attentional Theory of Continuity Editing. Ph.D., University of Edinburgh, Edinburgh, UK.

Smith, Tim J. 2012a. “The Attentional Theory of Cinematic Continuity”, Projections: The Journal for Movies and the Mind. 6(1), 1-27.

Smith, Tim J. 2012b. “Extending AToCC: a reply,” Projections: The Journal for Movies and the Mind. 6(1), 71-78

Smith, Tim J. 2013. “Watching you watch movies: Using eye tracking to inform cognitive film theory.” In A. P. Shimamura (Ed.), Psychocinematics: Exploring Cognition at the Movies. New York: Oxford University Press. pages 165-191

Smith, Tim J. 2014. “Audiovisual correspondences in Sergei Eisenstein’s Alexander Nevsky: a case study in viewer attention”. Cognitive Media Theory (AFI Film Reader), Eds. P. Taberham & T. Nannicelli.

Smith, Tim J., Jonathan Batten and Rachael Bedford. 2014. “Implicit detection of asynchronous audiovisual speech by eye movements.” Journal of Vision, 14(10), 440-440.

Smith, Tim J., Dekker, T., Mital, Parag K., Saez De Urabain, I. R. & Karmiloff-Smith, A., In Prep. “Watch like mother: Motion and faces make infant gaze indistinguishable from adult gaze during Tot TV.”

Smith, Tim J. and John M. Henderson. 2008. “Edit Blindness: The relationship between attention and global change blindness in dynamic scenes”. Journal of Eye Movement Research, 2(2):6, 1-17.

Smith, Tim J., Peter Lamont and John M. Henderson. 2012. “The penny drops: Change blindness at fixation.” Perception, 41(4), 489-492.

Smith, Tim J., Daniel Levin and James E. Cutting. 2012. “A Window on Reality: Perceiving Edited Moving Images.” Current Directions in Psychological Science. 21: 101-106

Smith, Tim J. and Parag K. Mital. 2013. “Attentional synchrony and the influence of viewing task on gaze behaviour in static and dynamic scenes”. Journal of Vision 13(8): 16.

Smith, Tim J. and Janet Y. Martin-Portugues Santacreu. Under Review. “Match-Action: The role of motion and audio in limiting awareness of global change blindness in film.”

Smith, Tim. J. and Murray Smith. In Prep. “The impact of expertise on eye movements during film viewing.”

Suckfull, Monika. 2000. “Film Analysis and Psychophysiology Effects of Moments of Impact and Protagonists”. Media Psychology, 2(3), 269-301.

Vilaro, Anna and Tim J. Smith. 2011. “Subtitle reading effects on visual and verbal information processing in films.” Published abstract in Perception. ECVP abstract supplement, 40. (p. 153). European Conference on Visual Perception. Toulouse, France.

Velichkovsky, Boris M., Sascha M. Dornhoefer, Sebastian Pannasch and Pieter J. A. Unema. 2001. “Visual fixations and level of attentional processing”. In Andrew T. Duchowski (Ed.), Proceedings of the International Conference Eye Tracking Research & Applications, Palm Beach Gardens, FL, November 6-8, ACM Press.

Wass, Sam V. and Tim J. Smith. In Press. “Visual motherese? Signal-to-noise ratios in toddler-directed television,” Developmental Science

Yarrow, Kielan, Patrick Haggard, Ron Heal, Peter Brown and John C. Rothwell. 2001. “Illusory perceptions of space and time preserve cross-saccadic perceptual continuity”. Nature, 414.

Zakay, Dan and Richard A. Block. 1996. Role of Attention in Time Estimation Processes. Time, Internal Clocks, and Movement. Elsevier Science.

 

Notes

[ii] An alternative take on eye tracking data is to divorce the data itself from psychological interpretation. Instead of viewing a gaze point as an index of where a viewer’s overt attention is focussed and a record of the visual input most likely to be encoded into the viewer’s long-term experience of the media, researchers can instead take a qualitative, or even aesthetic approach to the data. The gaze point becomes a trace of some aspect of the viewer’s engagement with the film. The patterns of gaze, its movements across the screen and the coordination/disagreement between viewers can inform qualitative interpretation without recourse to visual cognition. Such an approach is evident in several of the articles in this special issue (including Redmond, Sita, and Vincs, this issue; Batty, Perkins, and Sita, this issue). This approach can be interesting and important for stimulating hypotheses about how such patterns of viewing have come about and may be a satisfying endpoint for some disciplinary approaches to film. However, if researchers are interested in testing these hypotheses further, empirical manipulation of the factors that are believed to be important and statistical testing would be required. During such investigation current theories about what eye movements are and how they relate to cognition must also be respected.

[iii] Although, one promising area of research is the use of pupil diameter changes as an index of arousal (Bradley, Miccoli, Escrig and Lang, 2008).

[iv] This technique has been used for decades by producers of TV advertisements and by some “pop” serials such as Hollyoaks in the UK (Thanks to Craig Batty for this observation).

[v] This trend in increasing pace and visual complexity of film is confirmed by statistical analyses of film corpora over time (Cutting, DeLong and Nothelfer, 2010) and has resulted in a backlash and increasing interest in “slow cinema”.

[vi] Other authors in this special issue may argue that taking a critical approach to gaze heatmaps without recourse to psychology allows them to embed eye tracking within their existing theoretical framework (such as hermeneutics). However, I would warn that eye tracking data is simply a record of how a relatively arbitrary piece of machinery (the eye tracking hardware) and associated software decided to represent the centre of a viewer’s gaze. There are numerous parameters that can be tweaked to massively alter how such gaze traces and heatmaps appear. Without understanding the psychology and the physiology of the human eye a researcher cannot know how to set these parameters, how much to trust the equipment they are using, or the data it is recording, and as a consequence may over-attribute interpretation to a representation that is not reliable.

[vii] https://theeyetribe.com/ (accessed 13/12/14). The EyeTribe tracker is $99 and is as spatially and temporally accurate (up to 60Hz sampling rate) as some science-grade trackers.

[viii] http://www.tobii.com/eye-experience/ (accessed 13/12/14). The Tobii EyeX tracker is $139, samples at 30Hz and is as spatially accurate as the EyeTribe although the EyeX does not give you as much access to the raw gaze data (e.g., pupil size and binocular gaze coordinates) as the EyeTribe.

 

Bio

Dr Tim J. Smith is a senior lecturer in the Department of Psychological Sciences at Birkbeck, University of London. He applies empirical Cognitive Psychology methods including eye tracking to questions of Film Cognition and has published extensively on the subject both in Psychology and Film journals.

 

From Subtitles to SMS: Eye Tracking, Texting and Sherlock – Tessa Dwyer

Abstract

As we progress into the digital age, text is experiencing a resurgence and reshaping as blogging, tweeting and phone messaging establish new textual forms and frameworks. At the same time, an intrusive layer of text, obviously added in post, has started to feature on mainstream screen media – from the running subtitles of TV news broadcasts to the creative portrayals of mobile phone texting on film and TV dramas. In this paper, I examine the free-floating text used in BBC series Sherlock (2010–). While commentators laud this series for the novel way it integrates text into its narrative, aesthetic and characterisation, it requires eye tracking to unpack the cognitive implications involved. Through recourse to eye tracking data on image and textual processing, I revisit distinctions between reading and viewing, attraction and distraction, while addressing a range of issues relating to eye bias, media access and multimodal redundancy effects.


Figure 1: Press conference in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

Introduction

BBC’s Sherlock (2010–) has received considerable acclaim for its creative deployment of text to convey thought processes and, most notably, to depict mobile phone messaging. Receiving high-profile write-ups in The Wall Street Journal (Dodes, 2013) and Wired UK, this innovative representational strategy has been hailed an incisive reflection of our current “transhuman” reality and “a core element of the series’ identity” (McMillan 2014).[1] In the following discussion, I deploy eye tracking data to develop an alternate perspective on this phenomenon. While Sherlock’s on-screen text directly engages with the emerging modalities of digital and online technologies, it also borrows from more conventional textual tools like subtitling and captioning or SDH (subtitling for the deaf and hard-of-hearing). Most emphatically, the presence of floating text in Sherlock challenges the presumption that screen media is made to be viewed, not read. To explore this challenge in detail, I bring Sherlock’s inventive titling into contact with eye tracking research on subtitle processing, using insights from audiovisual translation (AVT) studies to investigate the complexities involved in processing dynamic text on moving-image screens. Bridging screen and translation studies via eye tracking, I consider recent on-screen text developments in relation to issues of media access and linguistic diversity, noting the gaps or blind spots that regularly infiltrate research frameworks. Discussion focuses on ‘A Study in Pink’ – the first episode of Sherlock’s initial season – which producer Sue Vertue explains was actually “written and shot last, and so could make the best use of onscreen text as additional script and plot points” (qtd in McMillan, 2014).

Texting Sherlock

Figure 2: Watson reads a text message in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

The phenomenon under investigation in this article is by no means easy to define. Already it has inspired neologisms, word mashes and acronyms including TELOP (television optical projection), ‘impact captioning’ (Sasamoto, 2014), ‘decotitles’ (Kofoed, 2011), ‘beyond screen text messaging’ (Zhang, 2014) and ‘authorial titling’ (Pérez González, 2012). While slight differences in meaning separate such terms from one another, the on-screen text in Sherlock fits all. Hence, in this discussion, I alternate between them and often default to more general terms like ‘titling’ and ‘on-screen text’ for their wide applicability across viewing devices and subject matter. This approach preserves the terminological ambiguity that attaches to this phenomenon instead of seeking to solve it, finding it symptomatic of the rapid rate of technological development with which it engages. Whatever term is decided upon today could well be obsolete tomorrow. Additionally, as Rick Altman (2004: 16) notes in his ‘crisis historiography’ of silent and early sound film, the “apparently innocuous process of naming is actually one of culture’s most powerful forms of appropriation.” He argues that in the context of new technologies and the representational codes they engender, terminological variance and confusion signal an identity crisis “reflected in every aspect of the new technology’s socially defined existence” (19).

According to the write-ups, phone messaging is the hero of BBC’s updated and rebooted Sherlock adaptation. Almost all the press garnered around Sherlock’s on-screen text links this strategy to mobile phone ‘texting’ or SMS (short messaging service). Reporting on “the storytelling challenges of a world filled with unglamorous smartphones, texting and social media”, The Wall Street Journal’s Rachel Dodes (2013) credits Sherlock with solving this dilemma and establishing a new convention for depicting texting on the big screen, creatively capturing “the real world’s digital transformation of everyday life.” For Mariel Calloway (2013), “Sherlock is honest about the role of technology and social media in daily life and daily thought… the seamless way that text messages and internet searches integrate into our lives.” Wired’s Graeme McMillan (2014) ups the ante, naming Sherlock a “new take” on “television drama as a whole” due precisely to its on-screen texting technique that sets it apart from other “tech-savvy shows out there”. McMillan continues that “as with so many aspects of Sherlock, there’s an element of misdirection going on here, with the fun, eye-catching slickness of the visualization distracting from a deeper commentary the show is making about its characters relationship with technology – and, by extension, our own relationship with it, as well.”

As this flurry of media attention makes clear, praise for Sherlock’s on-screen text or texting firmly anchors this strategy to technology and its newly evolving forms, most notably the iPhone or smartphone. Appearing consistently throughout the series’ three seasons to date, on-screen text in Sherlock occurs in a plain, uniform white sans-serif font that appears unadorned over the screen image, obviously added during post-production. This text is superimposed, pure and simple, relying on neither text bubbles nor coloured boxes nor sender IDs to formally separate it from the rest of the image area. As Michele Tepper (2011) eloquently notes, by utilising text in this way, Sherlock “is capturing the viewer’s screen as part of the narrative itself”:

It’s a remarkably elegant solution from director Paul McGuigan. And it works because we, the viewing audience, have been trained to understand it by the last several years of service-driven, multi-platform, multi-screen applications. Last week’s iCloud announcement is just the latest iteration of what can happen when your data is in the cloud and can be accessed by a wide range of smart-enough devices. Your VOIP phone can show caller ID on your TV; your iPod can talk to both your car and your sneakers; Twitter is equally accessible via SMS or a desktop application. It doesn’t matter where or what the screen is, as long as it’s connected to a network device. … In this technological environment, the visual conceit that Sherlock’s text message could migrate from John Watson’s screen to ours makes complete and utter sense.

Unlike the on-screen text in Glee (Fox, 2009–), for instance (see Fig. 3), which is used only occasionally in episodes like ‘Feud’ (Season 4, Ep 16, March 14, 2013), Sherlock flaunts its on-screen text as signature. Its consistently interesting textual play helps to give the series cohesion. Yet, just as it aids in characterisation, helps to progress the narrative, and binds the series as a whole, it also, necessarily, remains at somewhat of a remove, as an overtly post-production effect.

Figure 3: Ryder chats online in ‘Feud’, Glee (2013), Episode 16, Season 4.

While Tepper (2011) explains how Sherlock’s “disembodied” (Banks, 2012) texting ‘makes sense’ in the age of cross-platform devices and online clouds, this argument falters when the on-screen text in question is less overtly technological. The extradiegetic nature of this on-screen text – so obviously a ‘post’ effect – is brought to the fore when it is used to render thoughts and emotions rather than technological interfacing. In ‘A Study in Pink’, a large proportion of the text that pops up intermittently on-screen functions to represent Sherlock’s interiority, not his Internet prowess. In concert with camera angles and “microscopic close-ups”, it elucidates Sherlock’s forensic “mind’s eye” (Redmond, Sita and Vincs, this issue), highlighting clues and literally spelling out their significance (see Figs. 4 and 5). The fact that these human-coded moments of titling have received far less attention in the press than those that more directly index new technologies is fascinating in itself, revealing the degree to which praise for Sherlock’s on-screen text is invested in ideas of newness and technological innovation – underlined by the predilection for neologisms.

Figure 4: Sherlock examines the pink lady’s ring in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

Figure 5: Sherlock examines the pink lady’s ring in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

Of course, even when not attached to smartphones or data retrieval, Sherlock’s deployment of on-screen text remains fresh, creative and playful and still signals perceptual shifts resulting from technological transformation. Even when representing Sherlock’s thoughts, text flashes on screen manage to recall the excesses of the digital, when email, Facebook and Twitter ensconce us in streams of endlessly circulating words, and textual pop-ups are ubiquitous. Nevertheless, the blinkered way in which Sherlock’s on-screen text is repeatedly framed as, above all, a means of representing mobile phone texting functions to conceal some of its links to older, more conventional forms of titling and textual intervention, from silent-era intertitles to expository titles to subtitles. By relentlessly emphasising its newness, much discussion of Sherlock’s on-screen text overlooks links to a host of related past and present practices. Moreover, Sherlock’s textual play actually invites a rethinking of these older, ongoing text-on-screen devices.

Reading, Watching, Listening

As Szarkowska and Kruger (this issue) explain, research into subtitle processing builds upon earlier eye tracking studies on the reading of static, printed text. They proceed to detail differences between subtitle and ‘regular’ reading, in relation to factors like presentation speed, information redundancy, and sensory competition between different multimodal channels. Here, I focus on differences between saccadic or scanning movements and fixations, in order to compare data across the screen and translation fields. During ‘regular’ reading (of static texts) average saccades last 20 to 50 milliseconds (ms) while fixations range between 100 and 500ms, averaging 200 to 300ms (Rayner, 1998). Referencing pioneering studies into subtitle processing by Géry d’Ydewalle and associates, Szarkowska et al. (2013: 155) note that “when reading film subtitles, as opposed to print, viewers tend to make more regressions” and fixations tend to be shorter. Regressions occur when the eye returns to material that has already been read, and Rayner (1998: 393) finds that slower readers (of static text) make more regressions than faster readers. A study by d’Ydewalle and de Bruycker (2007: 202) found “the percentage of regressions in reading subtitles was globally, among children and adults, much higher than in normal text reading.” They also report that mean fixation durations in the subtitles were shorter, at 178 ms (for adults) and explain that subtitle regressions (where the eye travels back across words already read) can be partly explained by the “considerable information redundancy” that occurs when “[s]ubtitle, soundtrack (including the voice and additional information such as intonation, background noise, etc.), and image all provide partially overlapping information, eliciting back and forth shifts with the image and more regressive eye-movements” (202).
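For readers less familiar with these measures, the following is a minimal sketch of how mean fixation duration and the proportion of regressions might be computed from a sequence of fixations. It is an illustration only: the `Fixation` record, the function names and the scanpath values are all invented for the purpose, and none of it reproduces the procedures or data of the studies cited above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fixation:
    word_index: int     # position of the fixated word within the line of text
    duration_ms: float  # how long the eye rested on that word


def mean_fixation_duration(fixations: List[Fixation]) -> float:
    """Average fixation duration in milliseconds (0.0 for an empty list)."""
    if not fixations:
        return 0.0
    return sum(f.duration_ms for f in fixations) / len(fixations)


def regression_rate(fixations: List[Fixation]) -> float:
    """Share of eye movements that travel back to an earlier word."""
    moves = list(zip(fixations, fixations[1:]))
    if not moves:
        return 0.0
    backward = sum(1 for prev, cur in moves if cur.word_index < prev.word_index)
    return backward / len(moves)


# Invented scanpath over a five-word subtitle: the reader skips ahead to
# word 2, returns to word 1 (one regression), then reads on to the end.
scanpath = [
    Fixation(0, 150), Fixation(2, 160), Fixation(1, 210),
    Fixation(3, 170), Fixation(4, 190),
]

print(mean_fixation_duration(scanpath))  # mean of the five durations
print(regression_rate(scanpath))         # one backward move out of four
```

Treating any move to a lower word index as a regression is a simplifying assumption; published studies typically apply more detailed criteria.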

What happens to saccades and fixations when image processing is brought into the mix? When looking at static images, average fixations last 330 ms (Rayner, 1998). This figure is slightly longer than average fixations during regular reading and longer again than average subtitle fixations. Szarkowska and Kruger (this issue) note that “reading requires many successive fixations to extract information whereas looking at a scene requires fewer, but longer fixations” that tend to be more exploratory or ambient in nature, taking in a greater area of focus. In relation to moving images, Smith (2013: 168) finds that viewers take in roughly 3.8% of the total screen area during an average-length shot. Peripheral processing is at play but “is mostly reserved for selecting future saccade targets, tracking moving targets, and extracting gist about scene category, layout and vague object information”. In thinking about these differences in regular reading behaviour, screen viewing, and subtitle processing, it is noticeable that with subtitles, distinctions between fixations and saccades are less clear-cut. While saccades last between 20 and 50ms, Smith (2013: 169) notes that the smallest amount of time taken to perform a saccadic eye movement (taking into account saccadic reaction time) is 100-130ms. Recalling d’Ydewalle and de Bruycker’s (2007: 202) finding that fixations during subtitle processing last around 178ms, it would seem that subtitle conditions blur the boundaries somewhat between saccades and fixations, scanning and reading.

Interestingly, studies have also shown that the processing of two-line subtitles involves more regular word-by-word reading than for one-liners (D’Ydewalle and de Bruycker, 2007: 199). D’Ydewalle and de Bruycker (2007: 199) report, for instance, that more words are skipped and more regressions occur for one-line subtitles than for two-line subtitles. Two-line subtitles result in a larger proportion of time being spent in the subtitle area, and occasion more back-and-forth shifts between the subtitles and the remaining image area (201). This finding suggests that the processing of one-line subtitles differs considerably from regular reading behaviour. D’Ydewalle and de Bruycker (2007: 202) surmise that the distinct way in which one-line subtitles are processed relates to a redundancy effect caused by the multimodal nature of screen media. Noting how one-line subtitles often convey short exclamations and outcries, they suggest that a “standard one-line subtitle generally does not provide much more information than what can already be extracted from the picture and the auditory message.” They conclude that one-line subtitles occasion “less reading” than two-line subtitles (202). Extrapolating further, I posit that the routine overlapping of information that occurs in subtitled screen media blurs lines between reading and watching. One-line subtitles are ‘read’ irregularly and partly blind – that is, they are regularly skipped and processed through saccadic eye movements rather than fixations.
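The two measures referred to here, the proportion of viewing time spent in the subtitle area and the number of back-and-forth shifts between subtitle and image, can be sketched as follows. The region labels, function names and sample values are hypothetical and serve only to show what is being counted; they do not come from d’Ydewalle and de Bruycker’s data.

```python
from typing import List, Tuple

# Each sample pairs a screen region ("subtitle" or "image") with the time
# in milliseconds that the gaze stayed there before moving on.
Sample = Tuple[str, float]


def subtitle_dwell_share(samples: List[Sample]) -> float:
    """Proportion of total viewing time spent in the subtitle area."""
    total = sum(duration for _, duration in samples)
    if total == 0:
        return 0.0
    in_subtitle = sum(d for region, d in samples if region == "subtitle")
    return in_subtitle / total


def count_shifts(samples: List[Sample]) -> int:
    """Number of times the gaze crosses between subtitle and image."""
    regions = [region for region, _ in samples]
    return sum(1 for a, b in zip(regions, regions[1:]) if a != b)


# Invented gaze record for a single subtitle presentation.
viewing = [
    ("image", 300), ("subtitle", 200), ("subtitle", 150),
    ("image", 250), ("subtitle", 100),
]

print(subtitle_dwell_share(viewing))  # 450 ms of 1000 ms in the subtitle area
print(count_shifts(viewing))          # image→subtitle, subtitle→image, image→subtitle
```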

This suggestion is supported by data on subtitle skipping. Szarkowska and Kruger (this issue) find that longer subtitles containing frequently used words are easier and quicker to process than shorter subtitles containing low-frequency words. Hence, they conclude that cognitive load relates more to word familiarity than quantity, something that is overlooked in many professional subtitling guidelines. This finding indicates that high-frequency words are processed ‘differently’ in subtitling than in static text, in a manner more akin to visual recognition or scanning than reading. Szarkowska and Kruger find that high-frequency words in subtitles are often skipped. Hence, as with one-line subtitles, high-frequency words are, to a degree, processed blind, possibly through shape recognition and mapping more than durational focus. In relation to other types of on-screen text, such as the short, free-floating type that characterises Sherlock, it seems entirely possible that this innovative mode of titling may just challenge distinctions between text and image processing. While commentators laud this series for the way it integrates on-screen text into its narrative, style and characterisation, eye tracking is required to unpack the cognitive implications of Sherlock’s text/image morph.

The Pink Lady

Figure 6: Letters scratched into the floor in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

Sherlock producer Vertue refers to the pink lady scene in ‘A Study in Pink’ as particularly noteworthy for its “text all around the screen”, referring to it as the “best use” of on-screen text in the series (qtd in McMillan, 2014). In this scene, a dead woman dressed in pink lies face first on the floor of a derelict building into which she has painstakingly etched a word or series of letters (‘Rache’) with her fingernails. As Sherlock investigates the crime scene, forensics officer Anderson interrupts to explain that ‘Rache’ is the German word for ‘revenge’. The German-to-English translation pops up on screen (see Fig. 6), and this time Sherlock sees it too. This superimposed text, so obviously laid over the image, oversteps its surface positioning to enter Sherlock’s diegetic space, and we next view it backwards, from Sherlock’s point of view, not ours (see Fig. 7). After an exasperated eye roll that signals his disregard for Anderson, Sherlock dismisses this textual intervention and we watch it swirl into oblivion. Here, on-screen text is at once both inside and outside the narrative, diegetic and extra-diegetic, informative and affecting. In this way it self-reflexively draws attention to the show’s narrative framing, demonstrating its complexity as distinct diegetic levels merge.

Figure 7: Sherlock sees on-screen text in ‘A Study in Pink’, Sherlock (2010), Episode 1, Season 1.

For Carol O’Sullivan (2011), when on-screen text affords this type of play between the diegetic and extra-diegetic it functions as an “extreme anti-naturalistic device” (166) that she unpacks via Gérard Genette’s notion of narrative metalepsis (164). Detailing numerous examples of humorous, formally transgressive diegetic subtitles, such as those found in Annie Hall (Woody Allen, 1977) (Fig. 8), O’Sullivan points to their metatextual function, referring to them as “metasubtitles” (166) that implicitly comment on the limits and nature of subtitling itself. When Sherlock’s on-screen titles oscillate between character and viewer point-of-view shots, they too become metatextual, demonstrating, in Genette’s terms, “the importance of the boundary they tax their ingenuity to overstep in defiance of verisimilitude – a boundary that is precisely the narrating (or the performance) itself: a shifting but sacred frontier between two worlds, the world in which one tells, the world of which one tells” (qtd in O’Sullivan, 2011: 165). Moreover, for O’Sullivan, “all subtitles are metatextual” (166) necessarily foregrounding their own act of mediation and interpretation. Specifically linking such ideas to Sherlock, Luis Pérez González (2012: 18) notes how “the series creators incorporate titles that draw attention to the material apparatus of filmic production”, thereby creating a complex alienation-attraction effect “that shapes audience engagement by commenting upon the diegetic action and disrupting conventional forms of semiotic representation, making viewers consciously work as co-creators of media content.”

Figure 8: Subtitled thoughts in the balcony scene, Annie Hall (1977).

Eye Bias

One finding from subtitle eye tracking research particularly pertinent to Sherlock is the notion that on-screen text causes eye bias. This was established in various studies conducted by d’Ydewalle and associates, which found that subtitle processing is largely automatic and obligatory. D’Ydewalle and de Bruycker (2007: 196) state:

Paying attention to the subtitle at its presentation onset is more or less obligatory and is unaffected by major contextual factors such as the availability of the soundtrack, knowledge of the foreign language in the soundtrack, and important episodic characteristics of actions in the movie: Switching attention from the visual image to “reading” the subtitles happens effortlessly and almost automatically (196).

This point is confirmed by Bisson et al. (2014: 399) who report that participants read subtitles even in ‘reversed’ conditions – that is, when subtitles are rendered in an unfamiliar language and the screen audio is fully comprehensible (in the viewers’ first language) (413). Again, in intralingual or same-language subtitling – when titles replicate the language spoken on screen – hearing audiences still divert to the subtitle area (413). These findings indicate that viewers track subtitles irrespective of language or accessibility requirements. In fact, the tracking of subtitles overrides function. As Bisson et al. (413) surmise, “the dynamic nature of the subtitles, i.e., the appearance and disappearance of the subtitles on the screen, coupled with the fact that the subtitles contained words was enough to generate reading behavior”.

Szarkowska and Kruger (this issue) reach a similar conclusion, explaining eye bias towards subtitles in terms of both bottom-up and top-down impulses. When subtitles or other forms of text flash up on screen, they effect a change to the scene that automatically pulls our eyes. The appearance and disappearance of text on screen is registered in terms of motion contrast, which according to Smith (2013: 176), is the “critical component predicting gaze behavior”, attaching to small movements as well as big. Additionally, we are drawn to words on screen because we identify them as a ready source of relevant information, as found in Batty et al. (forthcoming). Analysing a dialogue-free montage sequence from animated feature Up (Pete Docter, 2009), Batty et al. found that on-screen text in the form of signage replicates in miniature how ‘classical’ montage functions as a condensed form of storytelling aiming for enhanced communication and exposition. They suggest that montage offers a rhetorical amplification of an implicit intertitle, thereby alluding to the historical roots of text on screen while underlining its narrative as well as visual salience. One frame from the montage sequence focuses in close-up on a basket containing picnic items and airline tickets (see Fig. 9). Eye tracking tests conducted on twelve participants indicate a high degree of attentional synchrony in relation to the text elements of the airline ticket on which Ellie’s name is printed. Here, text provides a highly expedient visual clue as to the narrative significance of the scene and viewers are drawn to it precisely for its intertitle-like, expository function, highlighting the top-down impulse also at play in the eye bias caused by on-screen text.
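One simple way to put a number on this kind of convergence is to ask what share of viewers are looking inside a given area of interest (AOI) at the same moment. The sketch below assumes a rectangular AOI drawn around the text and one gaze point per viewer; the coordinates, the AOI and the function name are hypothetical. It is a stand-in proxy, not the method used by Batty et al., and attentional synchrony is often quantified in other ways, such as measures of gaze clustering.

```python
from typing import List, Tuple

Point = Tuple[float, float]              # gaze position in pixels (x, y)
Box = Tuple[float, float, float, float]  # left, top, right, bottom


def share_in_aoi(gaze_points: List[Point], aoi: Box) -> float:
    """Proportion of gaze points that fall inside a rectangular AOI."""
    if not gaze_points:
        return 0.0
    left, top, right, bottom = aoi
    hits = sum(
        1 for x, y in gaze_points
        if left <= x <= right and top <= y <= bottom
    )
    return hits / len(gaze_points)


# Hypothetical AOI around on-screen text in a 1920 x 1080 frame.
text_aoi: Box = (800, 450, 1120, 630)

# Invented gaze positions for twelve viewers at a single frame.
viewers: List[Point] = [
    (900, 500), (950, 520), (1000, 540), (1010, 600),
    (860, 470), (1100, 620), (990, 560), (940, 610),
    (1050, 480), (200, 300), (1500, 900), (400, 800),
]

print(share_in_aoi(viewers, text_aoi))  # nine of the twelve points are inside
```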

Figure 9: Heat map showing collective gaze weightings during the montage sequence in Up (2009).

In this image from Up, printed text appears in the centre of the frame and, as Smith (2013: 178) elucidates, eyes are instinctively drawn towards frame centre, a finding backed up by much subtitle research (see Szarkowska and Kruger, this issue). However, eye tracking results on Sherlock conducted by Redmond, Sita and Vincs (this issue) indicate that viewers also scan static text when it is not in the centre of the frame. In an establishing shot of 221B Baker Street from the first episode of Sherlock’s second season, ‘A Scandal in Belgravia’, viewers track static text that borders the frame across its top and right hand sides, again searching for information (see Fig. 10). Hence, the eye-pull exerted by text is noticeable even in the absence of movement, contrast and central framing. In part, viewers are attracted to text simply because it is text – identified as an efficient communication mode that facilitates speedy comprehension (see Lavaur and Bairstow, 2011: 457).

Figure 10: Single viewer gaze path for ‘A Scandal in Belgravia’, Sherlock (2012), Episode 1, Season 2.

Distraction/Attraction

What do these eye tracking results across screen and translation studies tell us about Sherlock’s innovative use of on-screen text and texting? Based on the notion that text on screen draws the eye in at least dual ways, due to both its dynamic/contrastive nature and its communicative expediency, we can surmise that for Sherlock viewers, on-screen text is highly visible and more than likely to be in that 3.8% of the screen on which they will focus at any one point in time (see Smith, 2013: 168). The marked eye bias caused by text on screen is further accentuated in Sherlock by the freshness of its textual flashes, especially for English-speaking audiences given the language hierarchies of global screen media (see Acland 2012, UNESCO 2013). The small percentage of foreign-language media imported into most English-speaking markets tends to result in a lack of familiarity with subtitling beyond niche audience segments. For those unfamiliar with subtitling or captioning, on-screen text appears particularly novel. Additionally, as explored, floating TELOPs in Sherlock attract attention due to the complex functions they fulfil, providing narrative and character clues as well as textual and stylistic cohesion. As Tepper (2011) points out, in the first episode of the series, viewers are introduced to Sherlock’s character via text, before seeing him on screen. “When he texts the word ‘Wrong!’ to DI Lestrade and all the reporters at Lestrade’s press conference,” notes Tepper, “the technological savvy and the imperiousness of tone tell you most of what you need to know about the character.”

There seems no doubt that on-screen text in Sherlock attracts eye movement, and that it therefore distracts from other parts of the image. One question then that immediately presents itself is why Sherlock’s textual distractions are tolerated – even celebrated – to a far greater extent than other, more conventional or routine forms of titling like subtitles and captions. While Sherlock’s on-screen text is praised as innovative and incisive, interlingual subtitling and SDH are criticised by detractors for the way in which they supposedly force viewers to read rather than watch, effectively transforming film into “a kind of high-class comic book with sound effects” (Canby, 1983).[2] Certainly, differences in scale affect such attitudes and the quantitative variance between post-subtitles (produced for distribution only) and authorial or diegetic titling (as seen in Sherlock) is pronounced.[3] However, eye tracking research on subtitle processing indicates that, on the whole, viewers easily accommodate the increased cognitive load it presents. Although attentional splitting occurs, leading to an increase in back-and-forth shifts between the subtitles and the rest of the image area (Szarkowska and Kruger, this issue), viewers acclimatise by making shorter fixations than in regular reading and by skipping high-frequency words and subtitles while still managing to register meaning (see d’Ydewalle and de Bruycker, 2007: 199). In this way, subtitle processing reveals many differences to reading of static text, and approximates techniques of visual scanning. Bearing these findings in mind, I propose it is more accurate to see subtitling as transforming reading into viewing and text into image, rather than vice versa.

Situating Sherlock in relation to a range of related TELOP practices across diverse TV genres (such as game shows, panel shows, news broadcasting and dramas) Ryoko Sasamoto (2014: 7) notes that the additional processing effort caused by on-screen text is offset by its editorial function.[4] TELOPs are often deployed by TV producers to guide interpretation and ensure comprehension by selecting and highlighting information deemed most relevant. This suggestion is backed up by research by Rei Matsukawa et al. (2009), which found that the information redundancy effect caused by TELOPs facilitates understanding of TV news. For Sasamoto (2014: 7), ‘impact captioning’ highlights salient information in much the same way as voice intonation or contrastive stress. It acts as a “written prop on screen” enabling “TV producers to achieve their communicative aims… in a highly economical manner” (8). Focusing on Sherlock specifically, Sasamoto suggests that its captioning provides “a route for viewers into complex narratives” (9). Moreover, as Szarkowska and Kruger (this issue) note, in static reading conditions, “longer fixations typically reflect higher cognitive load.” Consequently, the shorter fixations that characterise subtitle viewing support the contention that on-screen text processing is eased by its expedient, editorial function and by redundancy effects resulting from its multimodality.

Switched On

Another way in which Sherlock’s text and titling innovations extend beyond mobile phone usage was exemplified in July 2013 by a promotional campaign that promised viewers a ‘sneak peek’ at a yet-to-be-released episode title, requiring them to find and piece together a series of clues. In true Sherlockian style, the clues were well hidden, only visible to viewers if they switched on closed-captioning or SDH available for deaf and hard-of-hearing audiences. With this device turned on, viewers encountered intralingual captioning along the bottom of their screen and additionally, individually boxed letters that appeared top left (see Figs. 11 and 12). Viewers needed to gather all these single letter clues in order to deduce the episode title: ‘His Last Vow’. According to the ‘I Heart Subtitles’ blog (July 16, 2013), in doing so, Sherlock once again displayed its ability to “think outside the box and consider all…options”. It also cemented its commitment to on-screen text in various guises, and effectively gave voice to an audience segment typically disregarded in screen commentary and analysis. Through this highly unusual, cryptic campaign, Sherlock alerted viewers to more overtly functional forms of titling, and intimated points of connection between language, textual intervention and access.

Figure 11: Boxed letter clues (top left of frame) that appeared when closed captioning was switched on, during a re-run of ‘A Scandal in Belgravia’, Sherlock (2012), Episode 1, Season 2.

Figure 12: Boxed letter clues (top left of frame) that appeared when closed captioning was switched on, during a re-run of ‘A Scandal in Belgravia’, Sherlock (2012), Episode 1, Season 2.

Conclusion

On-screen text invites a rethinking of the visual, expanding its borders and blurring its definitional clarity. Eye tracking research demonstrates that moving text on screens is processed differently to static text, affected by a range of factors issuing from its multimodal complexity. Sherlock subtly signals such issues through its playful, irreverent deployment of text, which enables viewers to directly access Sherlock’s thoughts and understand his reasoning, while also distancing them, asking them to marvel at his ‘millennial’ technological prowess (Stein and Busse, 2012: 11) while remaining self-consciously aware of his complex narrative framing as it flips inside out, inviting audiences to watch themselves watching. Such diegetic transgression is yet to be mapped through eye tracking, intimating a profitable direction for future studies. To date, data on text and image processing demonstrates how on-screen text attracts eye movement and hence, it can be inferred that it distracts from other parts of the image area. Yet, despite rendering more of the image effectively ‘invisible’, text in the form of TELOPs is increasingly prevalent in news broadcasts, current affairs panel shows (when audience text messages are displayed) and, most notably, in Asian TV genres where they are now a “standard editorial prop” featured in many dramas and game shows (Sasamoto, 2014: 1). In order to take up the challenge presented by such emerging modes of screen address, research needs to move beyond surface assessments of the attraction/distraction nexus. It is the very attraction to TELOP distraction that Sherlock – via eye tracking – brings to the fore.

 

References

Acland, Charles. 2012. “From International Blockbusters to National Hits: Analysis of the 2010 UIS Survey on Feature Film Statistics.” UIS Information Bulletin 8: 1-24. UNESCO Institute for Statistics.

Altman, Rick. 2004. Silent Film Sound. New York: Columbia University Press.

Banks, David. 2012. “Sherlock: A Perspective on Technology and Story Telling.” Cyborgology, January 25. Accessed October 9, 2014.

Batty, Craig, Adrian Dyer, Claire Perkins and Jodi Sita (forthcoming). “Seeing Animated Worlds: Eye Tracking and the Spectator’s Experience of Narrative.” In Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship, edited by Carrie Lynn D. Reinhard and Christopher J. Olson. London and New York: Bloomsbury.

Bennet, Alannah. 2014. “From Sherlock to House of Cards: Who’s Figured Out How to Translate Texting to Film.” Bustle, August 18. Entertainment. Accessed October 9. http://www.bustle.com/articles/36115-from-sherlock-to-house-of-cards-whos-figured-out-how-to-translate-texting-to-film/image/36115.

Biedenharn, Isabella. 2014. “A Brief Visual History of On-Screen Text Messages in Movies and TV.” Flavorwire, April 24. Accessed October 13.

Bisson, Marie-Josée, Walter J. B. Van Heuven, Kathy Conklin and Richard J. Tunney. 2014. “Processing of native and foreign language subtitles in films: An eye tracking study.” Applied Psycholinguistics 35: 399–418. Accessed October 13, 2014. doi: 10.1017/S0142716412000434.

Calloway, Mariel. 2013. “The Game is On(line): BBC’s ‘Sherlock’ in the Age of Social Media.” Mariel Calloway, March 8. Accessed October 14, 2014.

Canby, Vincent. 1983. “A Rebel Lion Breaks Out.” New York Times, March 27, 21.

Dodes, Rachel. 2013. “From Talkies to Texties.” Wall Street Journal, April 4, Arts and Entertainment Section. Accessed October 13, 2014.

d’Ydewalle, Géry and Wim De Bruycker. 2007. “Eye movements of children and adults while reading television subtitles.” European Psychologist 12 (3): 196-205.

Kofoed, D. T. 2011. “Decotitles, the Animated Discourse of Fox’s Recent Anglophonic Internationalism.” Reconstruction 11 (1). Accessed October 5, 2012.

Lavaur, Jean-Marc and Dominic Bairstow. 2011. “Languages on the screen: Is film comprehension related to the viewers’ fluency level and to the language in the subtitles?” International Journal of Psychology 46 (6): 455-462. doi: 10.1080/00207594.2011.565343.

McMillan, Graeme. 2014. “Sherlock’s Text Messages Reveal Our Transhumanism.” Wired UK, February 3. Accessed October 14.

Matsukawa, Rei, Yosuke Miyata and Shuichi Ueda. 2009. “Information Redundancy Effect on Watching TV News: Analysis of Eye Tracking Data and Examination of the Contents.” Library and Information Science 62: 193-205.

O’Sullivan, Carol. 2011. Translating Popular Film. Basingstoke and New York: Palgrave Macmillan.

Pérez González, Luis. 2013. “Co-Creational Subtitling in the Digital Media: Transformative and Authorial Practices.” International Journal of Cultural Studies 16 (1): 3-21. Accessed September 25, 2014. doi: 10.1177/1367877912459145.

Rayner, K. 1998. “Eye Movements in Reading and Information Processing: 20 Years of Research.” Psychological Bulletin 124: 372-422.

Redmond, Sean, Jodi Sita and Kim Vincs. 2015. “Our Sherlockian Eyes: The Surveillance of Vision.” Refractory: a Journal of Entertainment Media, 25.

Romero-Fresco, Pablo. 2013. “Accessible filmmaking: Joining the dots between audiovisual translation, accessibility and filmmaking.” JoSTrans: The Journal of Specialised Translation 20: 201-23. Accessed September 20, 2014.

Sasamoto, Ryoko. 2014. “Impact caption as a highlighting device: Attempts at viewer manipulation on TV.” Discourse, Context and Media 6: 1-10. Accessed September 18 (Article in Press). doi: 10.1016/j.dcm.2014.03.003.

Schrodt, Paul. 2013. “This is How to Shoot Text Messaging.” Esquire, February 4. The Culture Blog. Accessed October 13, 2014.

Smith, Tim J. 2013. “Watching You Watch Movies: Using Eye Tracking to Inform Cognitive Film Theory.” In Psychocinematics: Exploring Cognition at the Movies, edited by Arthur P. Shimamura, 165-91. Oxford and New York: Oxford University Press. Accessed October 7, 2014. doi: http://dx.doi.org/10.1093/acprof:oso/9780199862139.001.0001.

Stein, Louisa Ellen and Kristina Busse. 2012. “Introduction: The Literary, Televisual and Digital Adventures of the Beloved Detective.” In Sherlock and Transmedia Fandom: Essays on the BBC Series, edited by Louisa Ellen Stein and Kristina Busse, 9-24. Jefferson: McFarland and Company.

Szarkowska, Agnieszka et al. 2013. “Harnessing the Potential of Eye-Tracking for Media Accessibility.” In Translation Studies and Eye-Tracking Analysis, edited by Sambor Grucza, Monika Płużyczka and Justyna Zając, 153-83. Frankfurt am Main: Peter Lang.

Szarkowska, Agnieszka and Jan-Louis Kruger. 2015. “Subtitles on the Moving Image: An Overview of Eye Tracking Studies.” Refractory: a Journal of Entertainment Media, 25.

Tepper, Michele. 2011. “The Case of the Travelling Text Message.” Interactions Everywhere, June 14. Accessed October 14, 2014.

UNESCO. 2013. “Feature Film Diversity”, UIS Fact Sheet 24, May. Accessed October 3, 2014.

Zhang, Sarah. 2014. “How Hollywood Figured Out A Way To Make Texting In Movies Look Less Dumb.” Gizmodo, August 18. Accessed August 19.

Zhou, Tony. 2014. “A Brief Look at Texting and the Internet in Film.” Video Essay, Every Frame a Painting, August 15. Accessed August 19.

 

 

 

Notes

[1] While some commentators point out that Sherlock was by no means the first to depict text messaging in this way – as floating text on screen – it is this series more than any other that has brought this phenomenon into the limelight. Other notable uses of on-screen text to depict mobile phone messaging occur in films All About Lily Chou-Chou (Iwai, 2001), Disconnect (Rubin, 2013), The Fault in Our Stars (Boone, 2014), LOL (Azuelos, 2012), Non-Stop (Collet-Serra, 2014), Wall Street: Money Never Sleeps (Stone, 2010), and in TV series Glee (Fox, 2009–), House of Cards (Netflix, 2013–), Hollyoaks (Channel 4, 1995–), Married Single Other (ITV, 2010) and Slide (Fox8, 2011). For discussion of some ‘early adopters’, see Biedenharn 2014.

 

[2] Notably, in this New York Times piece, Canby (1983) actually defends subtitling against this charge, and advocates for subtitling over dubbing.

[3] On distinctions between post-subtitling and pre-subtitling (including diegetic subtitling), see O’Sullivan (2011).

[4] According to Sasamoto (2014: 1), “the use of OCT [Open Caption Telop] as an aid for enhanced viewing experience originated in Japan in 1990.”

 

Bio

Dr Tessa Dwyer teaches Screen Studies at the University of Melbourne, specialising in language politics and issues of screen translation. Her publications have appeared in journals such as The Velvet Light Trap, The Translator and The South Atlantic Quarterly and in a range of anthologies including B is for Bad Cinema (2014), Words, Images and Performances in Translation (2012) and the forthcoming Locating the Voice in Film (2016), Contemporary Publics (2016) and the Routledge Handbook of Audiovisual Translation (2017). In 2008, she co-edited a special issue of Refractory on split screens. She is a member of the ETMI research group and is currently writing a book on error and screen translation.

How We Came To Eye Tracking Animation: A Cross-Disciplinary Approach to Researching the Moving Image – Craig Batty, Claire Perkins, & Jodi Sita

Abstract

In this article, three researchers from a large cross-disciplinary team reflect on their individual experiences of a pilot study in the field of eye tracking and the moving image. The study – now concluded – employed a montage sequence from the Pixar film Up (2009) to determine the impact of narrative cues on gaze behaviour. In the study, the researchers’ interest in narrative was underpinned by a broader concern with the interaction of top-down (cognitive) and bottom-up (salient) factors in directing viewers’ eye movements. This article provides three distinct but interconnected reflections on what the aims, process and results of the pilot study demonstrate about how eye tracking the moving image can expand methods and knowledge across the three disciplines of screenwriting, screen theory and eye tracking. It is in this way both an article about eye tracking, animation and narrative, and also a broader consideration of cross-disciplinary research methodologies.

 

Introduction

Over the past 18 months, a team of cross-disciplinary researchers has undertaken a pilot study in eye tracking and the moving image that has sought to understand where spectators look when viewing animation.[i] The original study employed eye tracking methods to record the gaze of 12 subjects. It used a Tobii X120 (Tobii Technology, 2005) remote eye tracking device, which allowed viewers to watch the animation sequence on a widescreen PC monitor at 25 frames per second, with sound. The eye tracker pairs the movements of the eye over the screen with the stimuli being viewed by the participant. For each scene viewed, the researchers selected areas of interest; for these areas, all of the gaze data, including the number of fixations and the duration of each, were collected and analysed.
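To make the area-of-interest step concrete, the following is a minimal sketch of how fixations might be counted and timed for one rectangular area. It is not the Tobii software's own procedure or the study's analysis code: the `Fixation` and `AreaOfInterest` types, the pixel coordinates and the three example fixations are all hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    x: float            # horizontal gaze position in screen pixels
    y: float            # vertical gaze position in screen pixels
    duration_ms: float  # how long the gaze rested at this point


@dataclass(frozen=True)
class AreaOfInterest:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation: Fixation) -> bool:
        # A fixation belongs to the area if it lies inside the rectangle.
        return (self.left <= fixation.x <= self.right
                and self.top <= fixation.y <= self.bottom)


def summarise_aoi(fixations, aoi):
    """Return (number of fixations, total dwell time in ms) for one area."""
    hits = [f for f in fixations if aoi.contains(f)]
    return len(hits), sum(f.duration_ms for f in hits)


# Hypothetical example data, invented for illustration only.
savings_jar = AreaOfInterest("savings jar", left=600, top=200, right=760, bottom=420)
fixations = [
    Fixation(650, 300, 240),  # inside the area
    Fixation(120, 500, 180),  # outside the area
    Fixation(700, 400, 310),  # inside the area
]

count, dwell_ms = summarise_aoi(fixations, savings_jar)
print(count, dwell_ms)
```

In a moving image the area of interest would normally have to follow the object from frame to frame; the fixed rectangle here is a simplification.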

Using a well-known montage sequence from the Pixar film Up (2009), this pilot study focussed on narrative with the aim of discerning whether story cues were instrumental in directing spectator gaze. Focussing on narrative seemed to be useful in that, as well as being an original line of enquiry in the eye tracking context, it also offered a natural connection between each of our disciplines and research experiences. The study did not take into account emotional and physiological responses from its participants as a way of discerning their narrative comprehension. Nevertheless, what we found from our data was that characters (especially their faces), key (narrative) objects and visual/scenic repetition seemed to be core factors in determining where viewers looked.[ii]

In the context of a montage sequence that spans around 60 years of story time, in which the death of the protagonist’s wife sets up the physical and emotional stakes of the rest of the film, it was clear that narrative meaning relating to a character’s journey/arc is important to viewers, more so (in this study) than peripheral action or visual style, for example. With regard to animation specifically, a form ‘particularly equipped to play out narratives that solicit […] emotions because of its capacity to illustrate and enhance interior states, and to express feeling that is beyond the realms of words to properly capture’ (Wells, 2007: 127), the highly controlled nature of the sequence from which the data was drawn seems to suggest that animation embraces narrative techniques fully to control viewer attention.

In this article, three researchers from the team – A, a screenwriter, B, a screen scholar and C, an eye tracking neuroscientist – discuss the approaches they took to conducting this study. Each of us came to the project armed with different expertise, different priorities and a different set of expectations for what we might find out, which we could then take back to our individual disciplines. In this article, then, we purposely use three voices as a way of teasing out our understandings before, during and after the study, with the aim of better understanding the potential for cross-disciplinary research in this area. Although other studies in eye tracking and the moving image have been undertaken and reported on, we suggest that using animation with a strongly directed narrative as a test study provides new information. Furthermore, few other studies to date have brought together traditional and creative practice researchers in this way.

What we present, then, is a series of interconnected discussions that draw together ideas from each researcher’s community of thought and practice, guided by the overriding question: how did this study embrace methodological originality and yield innovative findings that might be important to the disciplines of eye tracking and moving image studies? We present these discussions in the format of individual reflections, as a way of highlighting each researcher’s contributions to the study, and in the hope that others will see the potential of disciplinary knowledge in a study such as this one.

How ‘looking’ features in our disciplines, and what we might expect to ‘see’

Researcher A: ‘Looking’ in screenwriting means two things: seeing and reflecting on. By this I mean that a viewer looks at the screen to see what is happening, whilst at the same time reflecting on what they are looking at on a personal, cultural and/or political level. Some screenwriters focus on theme from the outset: on what they want their work to ‘say’ (see Batty, 2013); some screenwriters focus on plot: on what viewers will see (action) (see Vogler, 2007). What connects these is character. In Aristotelian terms, a character does and therefore is (Aristotle, 1996); for Egri, a character is and therefore does (Egri, 2004). The link here is that what we see on the screen (action) is always performed by a character, meaning that through a process of agency, actions are given meaning, feeding into the controlling theme(s) of the text. In this way, looking at – or seeing – is tied closely to understanding and the feelings that we bring to a text. As Hockley (2007) says, viewers are sutured into the text on an emotional level, connecting them and the text through the psychology of story space.

What we ‘see’, then, is meaning. In other words, we do not just see but we also feel. We look for visual cues that help us to understand the narrative unfolding before our eyes. With sound used to point to particular visual aspects and heighten our emotional states, we invest energy and emotion in the visuality of the screen, in the hope that we will arrive at an understanding. As this study has revealed, examples include symbolic objects in the frame (the adventure book; the savings jar; the picture of Paradise Falls) that have narrative value in screenwriting because of the meaning they possess (Batty and Waldeback, 2008: 52-3). By seeing these objects repeated throughout the montage, we understand what they mean (to the characters and to the story) and glean a sense of how they will re-appear throughout the rest of the film as a way of representing the emotional space of the story.

Landscape is also something we see, though this is always in the context of the story world (see Harper and Rayner, 2010; Stadler, 2010). In other words, where is this place? What happens here? What cannot happen here? Characters belong to a story world, and therefore landscape also helps us to understand the situations in which we find them. This, again, draws us back to action, agency and theme: when we see landscape, we are in fact understanding why the screenwriter chose to put their characters – and us, the audience – there in the first place.

Researcher B: In screen theory, looking is never just looking – never innocent and immediate. The act of looking is the gateway to the experience and knowledge of what is seen on screen, but also of how that encounter reflects the world beyond the screen and our place within it. Looking is overdetermined as gazing, knowing and being, endlessly charged by the coincidence of eye and I and of real and reel. Psychoanalytic theory imagines the screen as mirror and our identity as a spectatorial effect of recognizing ourselves in the characters and situations that unfold upon it, however refracted. Reception studies, conversely, seeks out how real individuals encounter content on screen, and how meaning sparks in that meeting – invented anew with every pair of eyes. Television studies emerges from an understanding of a fundamental schism in looking: where the cinematic apparatus enables a gaze, the televisual counterpart can (traditionally) only produce a broken and distracted glance.

All of these theories begin with the act of looking, and are enabled by it in their metaphors, methods and practices. But in no instance is looking attended to as anatomical vision – the process of the “meat and bones” body and brain rather than the metaphysical consciousness. As a scholar of screen theory, my base interest in eye tracking comes down to this “problem”. Is it a problem? Should the biology and theory of looking align? What effects and contradictions arise when they are brought together?

Phenomenological screen theory is a key and complex pathway into this debate, as an approach that values embodied experience, but discredits the ocular – seeking to bring the whole body to spectatorship rather than privilege the centred and distant subject of optical visuality (Marks, 2002: xvi). Vivian Sobchack names film ‘an expression of experience by experience … an act of seeing that makes itself seen, an act of hearing that makes itself heard’ (Sobchack, 1992: 3). Eye tracking shows us the act of seeing – the raw fixations and movements with which screen content is taken in. In the study under discussion here it is this data that is of central interest, with our key questions deriving from what such material can verify about how narrative shapes gaze behaviour. A central question and challenge for me moving forward in this field, though, is to consider this process without ceding to ocularcentrism: that is, without automatically equating seeing with knowing. This ultimately means being cautious about reading gaze behaviour as ‘proof’ of what viewing subjects are thinking, feeling and understanding. This approach will be supported by the inclusion of further physiological measurements.

Researcher C: Interest in vision and how we see the world is age-old, and it has been commonly held that the eyes are the windows to the mind. Where we look is then of great importance, as learning this offers us opportunities to understand more about where the brain wants to spend its time. Human eyes move independently from our heads, and so our eyes have developed a specialised operating system that both allows them to move around our visual environment and counteracts any movements the head may be making. This has led to a distinct set of eye movements we can study: saccades (the very fast blasts of movement that pivot our eye from focus point to focus point) and fixations (brief moments of relative stillness where our gaze stops for a moment to allow the receptors in our eye to collect visual information). In addition, only a tiny area of the back of our eyeball, the fovea on the retina, is sensitive enough to gather highly ‘acuitive’ information; thus the brain must drive the eye around precisely in order to get light to fall onto this tiny area of the eye. As such, our eye movements are an integral and essential part of our vision system.
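The distinction between saccades and fixations can be illustrated with a simple velocity threshold: samples where the gaze point jumps quickly are treated as saccade, and runs of slow samples as fixation. The sketch below assumes evenly spaced gaze samples in pixel coordinates; the 120 Hz sampling rate, the 1000 pixels-per-second threshold and the sample values are illustrative assumptions, and this is not a description of how any particular eye tracker classifies its data.

```python
import math


def classify_samples(samples, sample_rate_hz=120.0, velocity_threshold=1000.0):
    """Label each gaze sample 'fixation' or 'saccade' by point-to-point speed.

    samples: list of (x, y) gaze positions in pixels, evenly spaced in time.
    velocity_threshold: speed in pixels per second above which a sample is
    treated as part of a saccade (an illustrative value, not a standard).
    """
    if not samples:
        return []
    labels = ["fixation"]  # the first sample has no predecessor to compare with
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        speed = math.hypot(x1 - x0, y1 - y0) * sample_rate_hz
        labels.append("saccade" if speed > velocity_threshold else "fixation")
    return labels


def fixation_durations_ms(labels, sample_rate_hz=120.0):
    """Collapse runs of consecutive 'fixation' labels into durations in ms."""
    durations, run = [], 0
    for label in labels:
        if label == "fixation":
            run += 1
        elif run:
            durations.append(run * 1000.0 / sample_rate_hz)
            run = 0
    if run:  # close a fixation that lasts until the end of the recording
        durations.append(run * 1000.0 / sample_rate_hz)
    return durations


# Hypothetical samples: the gaze rests near (100, 100), jumps, then rests near (300, 250).
samples = [(100, 100), (101, 100), (100, 101), (300, 250), (301, 251), (300, 250)]
labels = classify_samples(samples)
durations = fixation_durations_ms(labels)
print(labels)
print(durations)
```

A real analysis would usually work in degrees of visual angle rather than pixels and would filter noise before thresholding; both steps are omitted here for brevity.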

Eye movement research has seen great advances during the last 50 years, with many early questions examined in the classic work of Buswell (1935) and Yarbus (1967). One question visual scientists and neuroscientists have been, and still are, keen to explore is why we look where we do: what is it about the objects or scene that draws our visual attention? Research over the decades has found that several different aspects are involved, relating to object salience, recognition, movement and contextual value (see Schütz et al., 2011). For animations that are used for learning purposes, Schnotz and Lowe (2008) discussed two major contributing factors that influence the attention-grabbing properties of features that make up this form. One is visuospatial contrast and the second is dynamic contrast: features that are relatively large, brightly coloured or centrally placed are more likely to be fixated on than their less distinctive neighbours, and features that move or change over time draw more attention.

Eye tracking research, which is now easier than ever to conduct, allows us to examine how these and other features influence us, and is a unique way to gain access to the windows of the mind. Directing this focus to learning more about how we watch films, and animation in particular, is what drove me to want to use eye tracking to see better how people experience them, and to delve into questions such as: what are people drawn to look at, and how might things like the narrative affect the way we direct our gaze?

When looking around a visual world, our view is often full of different objects and we tend to drive our gaze to them so we can recognize, inspect or use them. Not so surprisingly, what we are doing (our task at hand) strongly affects how we direct our gaze, such that as we perform a task, our salience-based mechanisms seem to go offline as people almost exclusively fixate on the task-relevant objects (Hayhoe, 2000; Land et al., 1999). From this, one expectation we have when considering how viewers watch animation is that, more than salient features, aspects relating to the narrative components of the viewer’s understanding of the story will be the stronger drive. Another well-known drawcard for visual attention is faces, which tend to draw the eye’s attention very strongly (Cerf et al., 2009; Crouzet et al., 2010). For animated films we were interested to see if similar effects would be observed.

Finally, another strong and interesting effect that has been discussed is a central viewing bias: people tend to fixate in the centre of a display, and this tendency has a large effect on viewing behaviour (Tatler and Vincent, 2009). As this study was based on moving images shown on a screen, we were keen to compare different scenes and how the narrative affected this tendency.
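One simple way to compare central bias across scenes is to measure how far, on average, fixations fall from the centre of the screen. The sketch below is an illustration under stated assumptions rather than the measure used by Tatler and Vincent or in this study: the function name, the 1920 by 1080 pixel screen and the two example scenes are hypothetical.

```python
import math


def mean_centre_distance(fixations, screen_width, screen_height):
    """Mean distance of fixations from the screen centre, scaled to 0..1.

    0 means every fixation sits exactly at the centre;
    1 means every fixation sits in a corner of the screen.
    """
    if not fixations:
        raise ValueError("need at least one fixation")
    cx, cy = screen_width / 2, screen_height / 2
    max_distance = math.hypot(cx, cy)  # centre to corner
    total = sum(math.hypot(x - cx, y - cy) for x, y in fixations)
    return total / len(fixations) / max_distance


# Hypothetical fixation positions (in pixels) for two scenes.
central_scene = [(960, 540), (960, 540)]
corner_scene = [(0, 0), (1920, 1080)]

print(mean_centre_distance(central_scene, 1920, 1080))
print(mean_centre_distance(corner_scene, 1920, 1080))
```

A lower score for one scene than another would suggest a stronger pull towards the centre, although the score alone cannot say whether that pull comes from the bias itself or from where the narrative action happens to be staged.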

How we came to the project, and what we thought it might reveal

Researcher A: From a screenwriting perspective, I was excited to think that at last, we might have data that not only privileges the story (i.e., the screenwriter’s input), but that also highlights the minutiae of a scene that the screenwriter is likely to have influenced. This can be different in animation than in live action, whereby a team of story designers and animators actively shape the narrative as the ‘script’ emerges (see Wells, 2010). Nevertheless, if we follow that what we see on screen has been imagined or at least intended by a ‘writer’ of sorts – someone who knows about the composition of screen narratives – then it was rousing to think that this study might provide ‘evidence’ to support long-standing questions (for myself at least) of writing for the screen and authorship. Screenwriters work in layers, building a screenplay from broad aspects such as plot, character and theme, to micro aspects such as scene rhythm, dialogue and visual cues. Being able to ‘prove’ what viewers are looking at, and hoping that this might correlate with a screenwriting perspective of scene composition, was very appealing to me.

I was also interested in what other aspects of the screen viewers might look at, either as glances or as gazes. In some genres of screenwriting, such as comedy, much of the clever work comes around the edges: background characters; ironic landscapes; peripheral visual gags, etc. From a screenwriting perspective, then, it was exciting to think that we might find ways to trace who looks at what, and if indeed the texture of a screenplay is acknowledged by the viewer. The study would be limited and not all aspects could be explored, but as a general method for screen analysis, simply having ideas about what might be revealed led to some very interesting discussions within the team.

Researcher B: All screen theories rest upon a fundamental assumption that different types of content, and different viewing situations, produce different viewing behaviours and effects. Laura Mulvey’s famous theory of the gaze stipulates that classical Hollywood cinema and the traditional exhibition environment (dark cinema, large screen, audience silence) position men as bearers of the look and women as objects of the look, and that avant-garde cinemas avoid this configuration (Mulvey, 1975). New theories of digital cinema speculate upon whether a spectator’s identification with an image is altered when it bears no indexical connection to reality; that is, when the image is a simulated collection of pixels rather than the trace of an event that once took place before a camera (Rodowick, 2007). The phenomenological film theory of Laura Marks suggests that certain kinds of video and multimedia work can engender haptic visuality, where the eyes function like ‘organs of touch’ and the viewer’s body is more obviously involved in the process of seeing than is the case with optical visuality (Marks, 2002: 2-3). It made sense to begin our study into eye tracking by thinking about these different assumptions regarding content and context and formulating methods to analyse them empirically.

For our first project we chose to focus on an assumption regarding spectatorship that is more straightforward and essential than any listed above: namely that viewers can follow a story told only in images. This is an assumption that underpins the ubiquitous presence of the montage sequence in narrative filmmaking, where a large amount of story information is presented in a short, dialogue-free sequence. We hypothesized that by tracking a montage sequence we would be able to ascertain if and how viewers looked at narrative cues, even when these are not the most salient (i.e., large, colourful, moving) features in the scene. The study was in this way designed to start investigating how much film directors and designers can control subjects’ gaze behaviour and top-down (cognitively driven) processes.

The sequence from Up was chosen in part to act as a ‘control’ against which we could later assess different types of content. The story told in the 4-minute sequence is complex but unambiguous, with its events and emotive power linked by clear relationships of cause and effect. It is in this way a prime example of a classical narrative style of filmmaking, where the emphasis is on communicating story information as transparently as possible (Bordwell, 1985: 160). Our hypothesis was that subjects’ gaze behaviour would be controlled by the tightly directed sequence with its strong narrative cues, and that this study could thereby function as a benchmark against which different types of less story-driven material could be compared later.

Researcher C: A colleague and I set up the Eye Tracking and the Moving Image (ETMI) research group in 2012, following discussions around how evidence was collected to support and investigate current film theory. These conversations grew into a determination to begin a cross-disciplinary research group, initially in Melbourne, to begin working together on these ideas. I had previously been involved in research using eye tracking to study other dynamic stimuli such as decision making processes in sport and the dynamics of signature forgery and detection, and my experience led to a belief that the eye tracker could have enormous potential as a research tool in the analysis and understanding of the moving image. Work on this particular study was inspired by the early aims of a subgroup (of which the other authors are a part), whose members were interested to investigate, in a more objective manner, the effect that narrative cues had on viewer gaze behaviour.

Existing research in our disciplines, and how that influenced our approaches to the study

Researcher A: While there had been research already conducted on eye tracking and the moving image, none of it had focussed on the creational aspects of screen texts: what goes into making a moving image text, before it becomes a finished product to be analysed. Much like screen scholarship that studies in a ‘post event’ way, what was lacking – usefully for us – was input from those who are practitioners themselves. The wider Melbourne-based Eye Tracking and the Moving Image research group within which this study sits has a membership that includes other practitioners, including a sound designer and a filmmaker. Combined, this suggested that our approach might offer something different; that it might ‘do more’ and hopefully speak to the industry as well as other researchers. As a screenwriter, the opportunity to co-research with scholars, scientists and other creative practitioners was therefore not only appealing, but also methodologically important.

As already highlighted, it was both an academic and a practical interest in the intersection of plot, character and theme that underpinned my approach. As Smith has argued, valuing character in screen studies has not always been possible (1995); moving this forward, valuing character, and in particular the character’s journey, has recently become more salient (see Batty, 2011; Marks, 2009), adding weight to a creative practice approach to screen scholarship. In this way, understanding the viewer’s experience of the screen seemed to lend itself well to some of the core concerns of the screenwriter; or to put it another way, had the ability to test what we ‘know’ about creative practice, and the role of the practitioner. Feeding, then, into wider debates about the place of screenwriting in the academy (see Baker, 2013; Price, 2013; 2010), it was important to value the work of the screenwriter, and to do so in a rigorous, scholarly – and hopefully innovative – way.

Researcher B: The majority of research on eye tracking and the moving image to date has been designed and undertaken as an extension to cognitive theories of film comprehension. Deriving from the constructivist school of cognitive psychology, and led by film theorist David Bordwell, this approach argues that viewers do not simply absorb but construct the meaning of a film from the data that is presented on screen. This data does not constitute a complete narrative but a series of cues that viewers process by generating inferences and hypotheses (Elsaesser and Buckland, 2002: 170). Bordwell’s approach explicitly opposes psychoanalytic film theory by attending to perceptual and cognitive aspects of film viewing rather than unconscious processes. Psychologist Tim Smith has mobilized eye tracking in connection with Bordwell’s work to demonstrate how this empirical method can “prove” cognitive theories of comprehension—showing that subjects’ eyes do fixate on those cues in a film’s mise-en-scène that the director has controlled through strategies of staging and movement (Smith, 2011; 2013).

The Up study was designed to follow in the wake of Smith’s work, with a particular interest in examining the premise of Bordwell’s theory – which is that narration is the central process that influences the way spectators understand a narrative film (Elsaesser and Buckland, 2002: 170). With this in mind, we deliberately chose a segment from an animated film, where the tightly directed narrative of the montage sequence is competing with a variety of other stimuli that subjects’ eyes could plausibly be attracted to: salient colourful and visibly designed details in the background and landscape of each shot.

We were also interested in this montage sequence for the highly affecting nature of its mini storyline, which establishes the protagonist Carl’s deep love for his wife Ellie as the motivation for his journey in Up itself. The sequence carries a great deal of emotive power by contrasting the couple’s happiness in their long marriage with Carl’s ultimate sadness and regret at not being able to fulfill their life-long dream of moving to South America before Ellie falls sick and dies. Would it be possible to ‘see’ this emotional impact in viewers’ gaze behaviour?

How we reacted to the initial data, and what it was telling us

Researcher A: When looking at data for the first time, I certainly saw a correlation between what we know about screenwriting and seeing, and what we could now turn to as evidence. For example, key objects such as the adventure book, the savings jar (see Fig. 1) and the picture of Paradise Falls – all of which recurred throughout the montage sequence – were looked at by viewers intensely, suggesting that narrative meaning was ‘achieved’.

Fig. 1. A heat map showing the collective intensity of viewers’ responses to the savings jar.

As another example, when characters were purposely (from a screenwriting perspective) separated within the frame of the action, viewers oscillated between the two, eventually settling on the one they believed to possess the most narrative meaning (see Fig. 2). This further implied the importance of the character journey and its associated sense of theme, which for screenwriting verifies the careful work that has gone into a screenplay to set up narrative expectations.

Fig. 2. A gaze plot showing the fixations and saccades of one viewer in a scene with the prominent faces of Carl and Ellie.

Researcher B: We chose to analyse the data on Up by examining how viewer attention fluctuated in focus between Carl and Ellie across the course of the montage sequence. The two are equal agents in the narrative at the beginning, but the montage’s story unfolds through the action and behaviour of each as it continues – that is, each character carries the story at different points. Overwhelmingly, the data supported this narrative pattern by showing that the majority of viewers fixated on the character who, moment by moment, functions as the agent of the story, even when that figure is not the most salient aspect of the image. Aligning with Bordwell’s cognitive theory of comprehension, this data confirms that viewers do rely principally on narrative cues to understand a film. As a top-down process of cognition, narrative exerts control over viewer attention to keep focus on the story rather than let the gaze wander to other bottom-up (salient) details in the mise-en-scène. It is this process that allowed Smith to show that viewers overwhelmingly will not notice glaring continuity errors on screen (Smith, 2005). As in the famous ‘Gorillas in our Midst’ experiment (Simons and Chabris, 1999), viewer attention is focused so closely on employing narrative schema to link events spatially, temporally and causally that salient stimuli on screen appear to be completely missed.
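The kind of comparison described here, the share of viewers fixating on whichever character carries the story at a given moment, can be sketched as follows. The shot list, the gaze targets and the resulting proportions are invented purely for illustration and are not the study’s data; `share_on_agent` is a hypothetical helper, not part of any eye tracking package.

```python
def share_on_agent(gaze_targets, agent):
    """Fraction of viewers whose fixation falls on the story agent."""
    if not gaze_targets:
        raise ValueError("need at least one viewer")
    on_agent = sum(1 for target in gaze_targets if target == agent)
    return on_agent / len(gaze_targets)


# Hypothetical data: for each shot, the character carrying the story and
# the element each of 12 viewers was fixating at that moment.
shots = [
    {"agent": "Ellie", "targets": ["Ellie"] * 9 + ["Carl"] * 2 + ["background"]},
    {"agent": "Carl", "targets": ["Carl"] * 10 + ["Ellie"] * 2},
]

shares = [share_on_agent(shot["targets"], shot["agent"]) for shot in shots]
for shot, share in zip(shots, shares):
    print(f"{shot['agent']}: {share:.0%} of viewers on the story agent")
```

With only 12 viewers, a single person looking elsewhere shifts the share by more than eight percentage points, which is one reason proportions from a pilot of this size are best read as indicative.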

Researcher C: Initially I was quite interested to see the attention paid to faces, and in particular, characters’ eyes and mouths. As this was animation, I was keen to see whether similar elements of faces would draw viewers’ eyes in the same ways that we look at human faces, where eyes and mouths are most viewed (Crouzet et al., 2010). Here, even though the characters were not engaging in dialogue, their mouths as well as their eyes were still searched. Looking at eyes has been linked to looking for contextual emotional information (Guastella et al., 2008), and so with this montage sequence being non-verbal, it was not surprising to see much of the focus on characters’ eyes as viewers attempted to read the emotion through them (see Fig. 3).

Fig. 3. Two viewers’ gaze plots depicting the sequence of fixations made between Carl and Ellie.

Other areas I was interested to observe were instances when other well-known features drew strong viewer attention, such as written text and bright (salient) objects. Two particular scenes we examined contained examples of these. In one scene, in which the savings jar sits at the back of a dark bookshelf, viewers were drawn to look both at the bright candle in the foreground and at the savings jar. The jar was in the dark; however, with narrative cues drawing attention to it, and because it contained text, viewers were drawn to look at it (see Fig. 1). Surprisingly, other interesting objects are easily discernible in this scene (a colourful wooden bird figure, a guitar, a compass), yet it was the savings jar and the bright candles that were viewed. The contextual information, the text and the salience appear to be working together here to drive the eye, all within a few seconds.

Fig. 4. Gaze plots of fixations made by all viewers over the scene in which Carl purchases airline tickets.

The second scene in which text worked as a cue for the eye was the travel shop scene (Fig. 4). Here, viewers were drawn to look at two text-based posters placed on the back wall of the shop. Again, this scene was shown only momentarily, yet glances towards the text and images, as well as the exchange between the characters, give viewers the elements of the story they need so that they know what is going on and where the story will go next (Carl’s surprise for Ellie).

How over time we better understood the data, and what we began to know more

Researcher A: I was interested to see that some viewers spent time looking at the periphery. The Up! montage sequence did not necessarily offer ‘alternative’ layers in the margins of the screen, though given its created and controlled animated nature, it perhaps should not be a surprise that away from the centre of the screen there were visual delights, such as the sun setting over the city and a blanket of clouds that changed shape, from clouds to animals to babies. This suggested to me that in animation, because viewers know that images have been created from scratch, there is an expectation that the screen will offer a plethora of experiences, from narrative agency to visual amplification. This, in turn, suggested that in further studies, it might be useful to contrast texts that use the potential of the full screen to engage viewers with those that go in close and privilege the centre. Genre would most likely play a key role in this future endeavour.

Researcher B: As hoped, this pilot study has been instructive as a base from which we can now expand. It has raised many questions. One issue is that this data cannot ‘prove’ subjects were not seeing those elements on-screen that were not fixated upon – were they perhaps seeing them peripherally? This could only be confirmed by conducting interviews after the eye tracking takes place, and could instructively inform an understanding of how story information that is layered in the mise-en-scène (for instance in setting, lighting and costume) contributes to overall narrative comprehension. We are also very interested to determine how the context of viewing affects gaze behaviour. For instance, would subjects still fixate overwhelmingly on narrative cues when watching this sequence in a cinema environment on a large – even an IMAX – screen? In this environment the image on screen is larger and the texture more palpable. Would viewers here perhaps be more focused on these salient pleasures of the image and engage in a different, less cognitive experience of the film; letting their eyes roam across the grain of the shot in its colours, shapes and surfaces? Would results alter between an animated and live action film? Psychoanalytic film theory tells us that the cinematic apparatus promotes identification with characters and, by extension, the ideologies of the social system from which they are produced (Mulvey, 1975). Eye tracking can potentially intervene in this powerful theory of spectatorship by showing if and how viewers do fixate on the cues that give rise to this interpellation.

Researcher C: After looking at some of the early scene analyses, I was somewhat surprised by how many eye movements could be made in fleetingly fast scenes, and at how many items in these scenes one could fixate on, if only briefly. I had expected viewers to be taking in some of the surrounding items in a scene using their peripheral vision, and to see more of the centralisation bias (Tatler and Vincent, 2009). Yet for some scenes, in particular for the two scenes in which Carl purchases the surprise airline tickets (see Figs 4 and 5), we see how viewers were drawn to search for narrative clues by looking around the scene.
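One simple way to put a number on how centralised a set of fixations is can be sketched as follows. This is an illustrative measure only, not the one used in the study or in Tatler and Vincent (2009): it assumes fixations are given as (x, y) pixel coordinates and reports their mean distance from the screen centre, scaled so that 0 is the exact centre and 1 is a corner.

```python
import math

def mean_centre_distance(fixations, width, height):
    """Mean distance of fixations from the screen centre,
    scaled so that 0 is the centre and 1 is a screen corner."""
    if not fixations:
        raise ValueError("at least one fixation is required")
    cx, cy = width / 2, height / 2
    half_diagonal = math.hypot(cx, cy)  # centre-to-corner distance
    total = sum(math.hypot(x - cx, y - cy) for x, y in fixations)
    return total / (len(fixations) * half_diagonal)

# Made-up fixations on a hypothetical 1920 x 1080 display.
central = mean_centre_distance([(960, 540), (980, 560)], 1920, 1080)
scattered = mean_centre_distance([(100, 100), (1800, 1000)], 1920, 1080)
```

Under this measure, a scene in which viewers search the surrounds for clues would score higher than one in which gaze clusters at the centre, which would allow scenes such as those in Figs 4 and 5 to be compared with the rest of the sequence.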

Fig. 5. Gaze plot showing the fixations made by all viewers as they briefly see the contents of the picnic basket.

In the first scene (see Fig. 4), Carl is seen in a shop, facing the shop assistant. Viewers had previously seen him in the midst of coming up with a bright idea. This scene thus gives the viewer a chance to work out what his idea was. What can be seen is that most viewers scanned the surrounds for clues. A similar pattern is seen in the next scene, in which we quickly glance at the contents of a picnic basket being carried by Carl (see Fig. 5). The basket is seen close up, and viewers scan its contents. It contains picnic items and the surprise airline ticket, and even though some glances went to other basket items, it was the ticket, the item that held the most narrative information, that captured most of the attention. This item was also the most salient, being the clearest and brightest item in the basket, and, importantly, the only item to contain written text. In a very short glimpse of a scene, these features almost ensured that viewers’ eyes were directed to look at and acknowledge the ticket.

What excites us about the future of work in this area, and where we think it might take our own disciplines

Researcher A: If we are to fully embrace the creative practice potential of studies such as this, then we might look to creating new texts that can then be studied. If, in 1971, Norton and Stark created simple drawings to test how their subjects recognised and learned patterns, then over 40 years later, our approach might be to develop a short moving image narrative through which we can test our viewers’ gaze. For example, if we were to develop a short film and play it out of sequence (i.e., narrative meaning altered), might we affect where viewers look? Might they look differently: in different places and for different lengths of time? Similarly, what if we were to musically score a text in different ways, diegetically and non-diegetically? Might we affect the focus of viewer gaze? If so, what might this tell us about narrative attention and filmmaking techniques that sit ‘beyond the screenplay’?

For screenwriting as a discipline, studies such as these would serve two purposes, I feel. Firstly, they would help to strengthen the presence of screenwriting in the academy, especially in regard to innovative research that privileges the role of the practitioner. Accordingly, these studies could provide a variety of methodological approaches that might be of use to other screenwriting scholars; or that might be applied to other creative practice disciplines, in which researchers wish to understand the work that has gone into the creation of a text that might otherwise only be studied once it has been completed. Secondly, and perhaps more importantly, such studies might yield results that benefit, or at least inform, future screenwriting practices. Whether industry-related practices or otherwise, just like all ‘good’ creative practice research, the insights and understandings gained would contribute to the discipline in question in the form of ‘better’ or ‘different’ ways of doing (Harper, 2007). For me, this would reflect both the nature and the value of creative practice research.

Researcher B: All of the potential avenues for future research in this field take an essential interest in how moving images on screen produce a play between top-down and bottom-up cognition. In this, a larger issue for me – going back to the points I raised at the beginning of my section – is how the data can be mobilized beyond a strictly cognitive framework and vocabulary of screen theory. As indicated, the cognitive approach offers a deliberately ‘common sense’ counterpart to a paradigm such as psychoanalysis, with its reliance on myth, desire and fantasy (Elsaesser and Buckland, 2002: 169). Cognitive theory understands a film as a data set that a viewer’s brain processes and completes in an active construction of meaning – an understanding that eye tracking and neurocinematics are very well placed to support and expand. But most screen scholars appreciate and theorize film and television texts as much more than mere sets of data. The moving image is an experience that only ‘works’ by generating emotional affect, by engaging the viewer’s attachments, memories, desires and fears. Film theorist Linda Williams proposes that our investment in following the twists and turns of a narrative is fundamentally reliant upon the emotion of pathos: we continually, pleasurably invest in the expectation that a character will act or be acted upon in such a way that they achieve their goal, and continually, pleasurably have that expectation obscured and dashed by the story (Williams, 1998). So viewer attention is driven not just by a drive to know but also by a desire to feel: to be swept up in waves of hope and disappointment.

The mini storyline of the Up! montage sequence relies entirely on this dialectic of action and pathos. Carl and Ellie’s hopes are repeatedly frustrated, and Carl is finally unable to redeem this pattern before Ellie dies – producing a profound sense of pathos and regret as the defining theme of the sequence. We can see that our subjects’ fixations fell in line with this pattern as the sequence unfolded, consistently focusing on the character who was triggering or carrying the emotional power. But how do we distinguish the ‘felt’ dimension of this gaze from the viewer’s efforts to simply comprehend what is happening by following characters’ movements, facial expressions or body language? How, that is, can we ‘see’ emotional engagement, and start to appreciate how this crucial dimension of spectatorship – based on feeling not thinking – governs the play between top-down and bottom-up cognition in moving pictures? For me, grappling with this problem – and perhaps experimenting with further measurements of pupil dilation, heart rate and brain activity – offers a fascinating pathway into understanding how eye tracking can move beyond an engagement with cognitive film theory to contribute to phenomenological thinking on genuinely embodied seeing and experience.

Researcher C: There is so much that can be done in this area, and that makes it an exciting pursuit; yet what makes it even more motivating is the way that we hope to go about it: collaboratively. One of the core aspects that members of ETMI are very passionate about is working together, bringing in different fields, different disciplines, different ways of seeing things, and building bridges between them. This work is not only about learning more about how we watch and interact with films, but also about having different perspectives on those insights. Work I would personally like to see undertaken in this way is to explore how black and white viewing compares to colourised viewing, and to explore whether and how 3D viewing affects how we gaze about a scene. To compare the gaze and emotional responses of children and adults to the same visual content, and similarly compare visual and emotional responses to material between males and females, and between genre fans and haters, is also an interesting possibility.

Finally, adding to these, I am excited about the potential collection and analysis of other physiological measures to better gauge emotional engagement. These include blood pressure, pupillometry, skin conductance, breathing rate and volume, heart rate, sounds made (gasps, held breaths, sighs etc.) and facial expressions.

Conclusion

By reflecting on each of our research backgrounds, experiences and expectations, what this article has revealed is that while we might have all come to the study with varied approaches and intentions, we have come out of the study with a somewhat surprisingly harmonious set of observations and conclusions. Without knowing it, perhaps, we were all interested in narrative and the role that characters play in the agency of it. We were also similarly interested in landscape and the visual potential of the screen; not in an obvious way, but in relation to subtext, meaning and emotion. The value of a study like this, then, lies not just in its methodological originality, but also in its ability to stir up passions in cross-disciplinary researchers, whereby each can bring to the table their own skills and ways of understanding data to reach mutual and respective conclusions. Although we ‘knew’ this from undertaking the study, the opportunity to reflect fully on the process in the form of an article has given us an even greater understanding of the collaborative potential of cross-disciplinary researchers such as ourselves.

 

References

Aristotle. (1996). Poetics. Trans. Malcolm Heath. London: Penguin.

Baker, Dallas. (2013). Scriptwriting as Creative Writing Research: A Preface. In: Dallas Baker and Debra Beattie (eds.) TEXT: Journal of Writing and Writing Courses, Special Issue 19: Scriptwriting as Creative Writing Research, pp. 1-8.

Batty, Craig, Adrian G. Dyer, Claire Perkins and Jodi Sita. (Forthcoming). Seeing Animated Worlds: Eye Tracking and the Spectator’s Experience of Narrative. In: CarrieLynn D. Reinhard and Christopher J. Olson (eds.). Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship. New York: Bloomsbury.

Batty, Craig. (2013). Creative Interventions in Screenwriting: Embracing Theme to Unify and Improve the Collaborative Development Process. In: Shane Strange and Kay Rozynski (eds.). The Creative Manoeuvres: Making, Saying, Being Papers – the Refereed Proceedings of the 18th Conference of the Australasian Association of Writing Programs, pp. 1-12.

Batty, Craig. (2011). Movies That Move Us: Screenwriting and the Power of the Protagonist’s Journey. Basingstoke: Palgrave Macmillan.

Batty, Craig and Zara Waldeback. (2008). Writing for the Screen: Creative and Critical Approaches. Basingstoke: Palgrave Macmillan.

Bordwell, David. (1985). Narration in the Fiction Film. London: Routledge.

Buswell, Guy T. (1935). How People Look at Pictures. Chicago: Chicago University Press.

Cerf, Moran, E. Paxon Frady and Christof Koch. (2009). Faces and text attract gaze independent of the task: Experimental data and computer model. Journal of Vision, 9(12): 10, pp. 1–15.

Crouzet, Sebastien M., Holle Kirchner and Simon J. Thorpe. (2010). Fast saccades toward faces: Face detection in just 100 ms. Journal of Vision, 10(4): 16, pp. 1–17.

Egri, Lajos. (2004). The Art of Dramatic Writing. New York: Simon & Schuster.

Elsaesser, Thomas and Warren Buckland. (2002). Studying Contemporary American Film: A Guide to Movie Analysis. London: Hodder Headline.

Guastella, Adam J., Philip B. Mitchell and Mark R. Dadds. (2008). Oxytocin increases gaze to the eye region of human faces. Biological Psychiatry, 63, pp. 3-5.

Harper, Graeme and Jonathan Rayner. (2010). Cinema and Landscape. Bristol: Intellect.

Harper, Graeme. (2007). Creative Writing Research Today. Writing in Education, 43, pp. 64-66.

Hayhoe, Mary. (2000). Vision using routines: A functional account of vision. Visual Cognition, 7, pp. 43–64.

Hockley, Luke. (2007). Frames of Mind: A Post-Jungian Look at Cinema, Television and Technology. Bristol: Intellect.

Land, Michael F., Neil Mennie and Jennifer Rusted. (1999). The roles of vision and eye movements in the control of activities of daily living. Perception, 28, pp. 1311–1328.

Marks, Dara. (2009). Inside Story: The Power of the Transformational Arc. London: A&C Black.

Marks, Laura U. (2002). Touch: Sensuous Theory and Multisensory Media. Minneapolis: University of Minnesota Press.

Mulvey, Laura. (1975). Visual Pleasure and Narrative Cinema. Screen, 16(3), pp. 6-18.

Norton, David, and Lawrence Stark. (1971). Scanpaths in eye movements during pattern perception. Science, 171, pp. 308–311.

Price, Steven. (2013). A History of the Screenplay. Basingstoke: Palgrave Macmillan.

Price, Steven. (2010). The Screenplay: Authorship, Theory and Criticism. Basingstoke: Palgrave Macmillan.

Rodowick, David. (2007). The Virtual Life of Film. Cambridge, MA: Harvard University Press.

Schnotz, Wolfgang and Richard K. Lowe. (2008). A unified view of learning from animated and static graphics. In: Richard K. Lowe and Wolfgang Schnotz (eds.). Learning with animation: Research implications for design. New York: Cambridge University Press, pp. 304-356.

Schütz, Alexander C., Doris I. Braun and Karl R. Gegenfurtner. (2011). Eye movements and perception: A selective review. Journal of Vision, 11(5): 9, pp. 1–30.

Simons, Daniel J. and Christopher F. Chabris. (1999). Gorillas in our Midst: Sustained Inattentional Blindness for Dynamic Events. Perception, 28, pp. 1059-1074.

Smith, Murray. (1995). Engaging Characters: Fiction, Emotion, and the Cinema. Oxford: Oxford University Press.

Smith, Tim J. (2005). An Attentional Theory of Continuity Editing. [accessed October 17, 2014].

Smith, Tim J. (2011). Watching You Watch There Will Be Blood. [accessed August 22, 2014].

Smith, Tim J. (2013). Watching you watch movies: Using eye tracking to inform cognitive film theory. In: A. P. Shimamura (ed.). Psychocinematics: Exploring Cognition at the Movies. New York: Oxford University Press, pp. 165-191.

Sobchack, Vivian. (1992). The Address of the Eye: A Phenomenology of Film Experience. Princeton, NJ: Princeton University Press.

Stadler, Jane. (2010). Landscape and Location in Australian Cinema. Metro, 165.

Tatler, Benjamin W., and Benjamin T. Vincent. (2009). The prominence of behavioural biases in eye guidance. Visual Cognition, 17, pp. 1029–1054.

Tobii Technology. (2005). User Manual. Tobii Technology AB. Danderyd, Sweden.

Vogler, Christopher. (2007). The Writer’s Journey: Mythic Structure for Writers. Studio City, CA: Michael Wiese Productions.

Wells, Paul. (2010). Boards, Beats, Binaries and Bricolage – Approaches to the Animation Script. In: Jill Nelmes (ed.). Analysing the Screenplay. Abingdon: Routledge, pp. 104-120.

Wells, Paul. (2007). Basics Animation 01: Scriptwriting. Worthing: AVA Publishing.

Williams, Linda. (1998). Melodrama Revised. In: Nick Browne (ed.). Refiguring American Film Genres: History and Theory. Berkeley, CA: University of California Press.

Yarbus, Alfred L. (1967). Eye Movements and Vision. New York: Plenum.

 

List of figures

Fig. 1. A heat map showing the collective intensity of viewers’ responses to the savings jar. Source: author study.

Fig. 2. A gaze plot showing the fixations and saccades of one viewer in a scene with the prominent faces of Carl and Ellie. Source: author study.

Fig. 3. Two viewers’ gaze plots depicting the sequence of fixations made between Carl and Ellie. Source: author study.

Fig. 4. Gaze plots of fixations made by all viewers over the scene in which Carl purchases airline tickets. Source: author study.

Fig. 5. Gaze plot showing the fixations made by all viewers as they briefly see the contents of the picnic basket. Source: author study.

 

Notes

[i] A full analysis of this study, ‘Seeing Animated Worlds: Eye Tracking and the Spectator’s Experience of Narrative’, will appear in the forthcoming collection Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship, edited by CarrieLynn D. Reinhard and Christopher J. Olson.

[ii] See Batty, Craig, Dyer, Adrian G., Perkins, Claire and Sita, Jodi (forthcoming) for full results.

 

Bios

Associate Professor Craig Batty is Creative Practice Research Leader in the School of Media and Communication, RMIT University, where he also teaches screenwriting. He is author, co-author and editor of eight books, including Screenwriters and Screenwriting: Putting Practice into Context (2014), The Creative Screenwriter: Exercises to Expand Your Craft (2012) and Movies That Move Us: Screenwriting and the Power of the Protagonist’s Journey (2011). Craig is also a screenwriter and script editor, with experiences across short film, feature film, television and online drama.

Dr Claire Perkins is Lecturer in Film and Screen Studies in the School of Media, Film and Journalism at Monash University. She is the author of American Smart Cinema (2012) and co-editor of collections including B is for Bad Cinema: Aesthetics, Politics and Cultural Value (2014) and US Independent Film After 1989: Possible Films (forthcoming, 2015). Her writing has also appeared in journals including Camera Obscura, Critical Studies in Television, Celebrity Studies and The Velvet Light Trap.

Dr Jodi Sita is Senior Lecturer in the School of Allied Health at the Australian Catholic University. She works within the areas of neuroscience and anatomy, with expertise in eye tracking research. She has extensive experience with multiple project types using eye tracking technologies and other biophysical data. As well as her current research into viewer gaze patterns while watching moving images, she is using eye tracking to examine expertise in Australian Rules Football League coaches and players, and to examine the signature forgery process.

Volume 24, 2014

Themed Issue: Intermediations

Edited by Kevin Fisher and Holly Randell-Moon

Contents:

1. Editorial Introduction — Kevin Fisher and Holly Randell-Moon

2. Animating Ephemeral Surfaces: Transparency, Translucency and Disney’s World of Color  — Kirsten Moana Thompson

3. Vertical Framing: Authenticity and New Aesthetic Practice in Online Videos — Miriam Ross

4. Attached To My Devices: Across Individual, Collective and Panspectric Worlds — John Farnsworth

5. The Ecstatic Gestalt in Werner Herzog’s Cave of Forgotten Dreams — Kevin Fisher

6. Intermediality and Interventions: Applying Intermediality Frameworks to Reality Television and Microblogs — Rosemary Overell

7. ‘God Hates Fangs’: Gay Rights As Transmedia Story in True Blood — Holly Randell-Moon

8. We are the Borg (in a good way): Mapping The Development Of New Kinds Of Being And Knowing Through Inter- and Trans-Mediality — Anne Cranny-Francis

Editorial: Intermediations — Kevin Fisher & Holly Randell-Moon

This special issue developed out of the Intermediations symposium held at the University of Otago on May 31, 2013,[1] and on the invitation of keynote speaker and Refractory Editor, Angela Ndalianis. Presenters at this symposium who have contributed essays here include Kirsten Moana Thompson (the other keynote speaker), John Farnsworth, Kevin Fisher, and Miriam Ross. Topics at the symposium ranged across the terrain of intermedia and transmedia theory, provoking new lines of inquiry on both fronts, and drawing into question the complex relationships between the two emerging paradigms. It is from the extended conversations during and following the symposium that the issue expanded to include essays by Anne Cranny-Francis, Rosemary Overell, and Holly Randell-Moon. Some of these essays directly engage the intermedia/transmedia relationship. Kirsten Moana Thompson explores the affinities between animation and more ephemeral forms of theatrical exhibition at Disney theme parks in terms of the sensual dimensions of colour. Rosemary Overell considers the affective intermedial dimensions of the reception and blogging practices surrounding the rehab-based reality TV show Intervention (A&E Network, 2005-2013). Anne Cranny-Francis analyses the development of the Sherlock Holmes story world within the convergence culture of transmedia.

Other essays, while working more decisively on one side of the inter/trans spectrum, challenge or expand upon existing approaches in ways that suggest new dialogues. Miriam Ross’s essay investigates sociotechnical debates around vertical framing that issue from the convergence of video and cell phone technologies, and explores their implications within her own media practices. John Farnsworth combines psychological theories of ‘attachment’ with affect studies to suggest how mobile devices simultaneously augment and substitute for social relations. Kevin Fisher describes how the use of 3D imagery in the documentary Cave of Forgotten Dreams (Werner Herzog, 2010) stages the intermedial encounter between a human and pre-human consciousness. Holly Randell-Moon analyses how allusions to civil rights advocacy and debate in True Blood (HBO, 2008-2014) work in the service of the biopolitical management of difference under the aegis of transmedia consumer participation. Together, the essays constitute a critical inquiry into the emergence of inter- and transmedia in the disciplines of media, cultural and film studies and how these terms both illustrate and re-ignite sociotechnical forces and debates in digital media and convergence culture. In the following section, we offer a brief genealogy of inter/trans media analysis, focusing specifically on the terms’ phenomenological and ideological valences in scholarly reception and utility.

In between and among: a brief tracing of inter/trans media analysis

Over the past two decades academic discussions of intermediality and transmediation have undergone a parallel development within the context of what Henry Jenkins describes as digital convergence culture. However, the exponents of each have, with few exceptions, tended to talk past one another. This is paradoxical insofar as the phenomena they respectively describe are often intertwined in the media examples they differently engage. While transmedia analysis has been primarily concerned with the distribution of narrative across media platforms, intermedial analysis has interrogated the internal singularity and ‘specificity’ of those same medialities. The experience of transmediation involves the participation of interpretive communities in the co-creation of stories and the enactment of story worlds. By contrast, intermedial experience unfolds within the heterogeneous spaces generated along the various intersections of medial forms and traces within a given medium.

The subject of transmedia combines the active viewer of cultural studies and the social media user within an expanded understanding of narrative as an irreducible component of human experience, cognition and social activity. This anthropological notion of homo narrativus is shared by the academic methods of transmedia analysis as well as creative methods of transmedia storytelling co-emergent with commercial practices such as viral marketing. Scholarly interest in this ‘new’ form of storytelling can be traced to Alvin Toffler’s development of the term ‘prosumer’, coined to describe a shift in audience and consumer activity that was more self-directed, individualised and selective than the traditional mass media model of consumption and production (1980). Following on from this work, Axel Bruns (2008) and Henry Jenkins (2006) have explored how the ‘produser’ repositions the production and communication flows of media content from media companies and creators to the consumer/user. As Jenkins explains, “Reading across the media sustains a depth of experience that motivates consumption” (2003) and “A good transmedia franchise attracts a wider audience by pitching the content differently in the different media” (2003). Audiences can read media texts with an awareness of their transmedial dimensions or they can consume different media forms in isolation whilst still being interpellated into a broader transmedia story. Jenkins’ development of transmedia is thus an attempt to capture the new specificities of medial engagement that have emerged from digital convergence and new media formats. He identifies a number of transmedia modes of communication which include: transmedia storytelling, transmedia branding, transmedia performance, transmedia ritual, transmedia play, transmedia activism, and transmedia spectacle (2011). [2]

There are two important implications to be drawn from this type of cross-media communication. The first is that transmedia forms of communication require an explicit appreciation of the intertextual (though not necessarily intermedial) elements of storytelling on the part of media producers. The second is that this type of media storytelling and communication recognises the social character of narrative and textual construction. Writing about transmedia fan activity, Jenkins speaks of “a new kind of cultural power emerging as fans bond together within larger communities, pool their information, shape each other’s opinions, and develop a greater self-consciousness about their shared agendas and common interests” (2007, 362-363). Kaarina Nikunen also suggests that fan activities reveal “the institutional and technological spaces of shaping the pleasures of media” which also “possibly reshape […] audience practises more widely” (2007, 111). What this type of media engagement does is shift political and ideological discussion of audiences’ (pleasurable and social) involvement in meaning making from the passive/active consumer debate to questions of the audience’s role in the economy of media production and consumption.

It is this seeming incorporation of fan and audience desire into the narratives of media production that has generated scepticism about the extent to which produsage challenges or subverts existing media structures. S. Elizabeth Bird, for example, points out that “True produsers are a reality, but they are not the norm, and can often seem to be so in thrall to big media and technological ‘coolness’ that they accept the disciplining of their creative activities” (2011, 512). Indeed, the end goal of transmedia branding according to social media marketer Rick Liebling is “creating an environment that is so authentic and compelling that when consumers do generate their own content that utilizes your brand, they do so in a way that is in line with your existing messaging” (2011; emphasis in original). For this reason, fan activity as a form of produsage qua consumer action (or more idealistically, resistance) may also be understood as “a form of market-sanctioned cultural experimentation through which the market rejuvenates itself” (Holt, as cited in Kline 2009, 32). What separates a marketing approach to transmedia activity from a more scholarly one is the question of how far audience activity can instantiate resistance or subversion to existing media and communication hierarchies. Indeed, such concerns as they relate to media’s enmeshment in other political institutions specifically inform Randell-Moon’s essay in this issue.

One of the more salient critiques of transmedia analysis is that medial specificity is subsumed within the overall importance of the story, even if, as Jenkins argues, transmedia storytelling relies neither on the continuity nor homogeneity of its narrative. Still, according to Jenkins, “Most discussions of transmedia place a high emphasis on continuity—assuming that transmedia requires a high level of coordination and creative control and that all of the pieces have to cohere into a consistent narrative or world” (2011). For Ndalianis it is the “holes” within transmedia stories that create opportunities for audience co-creation and performance, and these types of co-creation are among the most successful examples of transmedia campaigns (2012, 174). Yet, even with its emphasis on the cross-media processes of audience engagement, transmedia still implies a substrate of medial relations where there is an experiential sameness across platforms. As Bernd Herzogenrath notes, the transmedial version of intermediality “is built on the concept that there are formal structures (such as narrative structures) that are not specific to one medium but can be found (perhaps differently instantiated) in different media” (2012, 4). Consequently, transmedia analysis “has the problem that ‘media specificity’ cannot be conceptualized within it” (4). By contrast, the issue of media specificity takes centre stage in Francesco Casetti’s analysis of the “relocation of cinema” as medial form beyond its traditional substrate (2011), which also animates Ross’s examination of the convergence of video and mobile telephony in this issue. This centrifugal thrust of intermedial analysis against the internal coherence and specificity of medialities within what Rosalind Krauss terms “the post medium condition” (1999) provides a counterpoint to the centripetal force of narrative implied in Jenkins’ convergence culture.

In this issue, Cranny-Francis traces the term intermediality back to Roland Barthes, where he appeals to the interdisciplinarity required by new cultural objects that defy prevailing codes and classifications. She argues that intertextuality, in Mikhail Bakhtin’s sense of “heteroglossia”, provides the methodological link between intermedial and transmedial analysis. Transmedia storytelling is, in important ways, an inherently intermedial phenomenon because it depends on and generates engagement with media texts as multiple and heterogeneous. The forms of reading and engagement across transmedia stories, as outlined by Jenkins, have similarities with intermedia defined by Herzogenrath as “between the between” (2012, 2) in the sense that “we can only refer to media using other media” (3). In relation to what Cranny-Francis describes as a process of endless intertextual deferral, Herzogenrath observes: “Individual media do not exist in isolation, to be suddenly taken into intermedial relations. Intermediality is rather the ontological condition sine qua non, which is always before ‘pure’ and specific media, which have to be extracted from the arch-intermediality” (4). The intermedial thus constitutes “the quicksand out of which specific media emerge” as well as “the various interconnections” made possible between the audience and different types of media (3).

Other contemporary theorists, such as Ágnes Pethő (2011) and Joachim Paech (2011), insist that intermediality is altogether distinct from intertextuality, which reproduces the privileging of narrative characteristic of transmedia, conflating relations between stories with intersections between medialities. Pethő, for example, describes intermedial experience as extra-narrative, extra-representational, and a-signifying. Hence, “it cannot be read” (2011, 67). Rather, as an encounter with the ‘in-between’ generated along the interstices of different medial forms and traces, intermediality makes itself felt on the prereflective level of embodied sensation. Hence, for Pethő, and contributors Moana-Thomson as well as Fisher, intermediality is an irreducibly phenomenological experience. Other essays, such as those by Farnsworth and Overell draw upon affect studies and non-representational theory to approach the embodied aspects of intermediality that escape both medium-specific and hermeneutic containment of media texts. For example, Farnsworth explores the affective and psychoanalytical dimensions of attachment as a constituent feature of embodiment and sociality that become augmented or constrained through mobile technologies.

However, the emphasis of the intermedial on embodiment and affect over interpretation has also informed some strains of transmedia theory, in particular Ndalianis’ work on transmedia horror as predicated on affective participation in a particular “sensorium” (2012). The focus of intermedial analysis on the heterogeneous spaces and experiences between medialities also complements the methodological and historiographical projects of media archaeology (Elsaesser 2005, 2009; Huhtamo and Parikka 2011; Parikka 2012) and remediation (Bolter and Grusin 1999). Paech, for example, echoes Jay David Bolter and Richard Grusin’s logic of remediation by arguing that film has always been intermedial, though its experience as such becomes more pronounced or “hypermediated” during historical periods characterised by intensified sociotechnical change (1999). At this moment in time, renewed interest in medial co-creation is heightened by the shifting economies of convergence culture and the post-medium environment, in whose context the paradigms of intermedial and transmedial analysis will continue to be subject to the same exchanges and mutations as the medialities they describe. Such mutations occur, we would argue, as intermediations between audience, text, screen and body as a constitutive feature of medial meaning and sensation.

In this issue, we offer some intermediations on the changing dynamics of mediality in relation to embodiment, media specificity, and audience participation in and performance of textuality. We hope you enjoy reading the essays.

 

References

Bird, Elizabeth S. 2011. “Are We All Produsers Now? Convergence and Media Audience Practices.” Cultural Studies 25 (4-5): 502-516.

Bolter, Jay David and Richard Grusin. 1999. Remediation: Understanding New Media. Cambridge: MIT Press.

Bruns, Alex. 2008. Blogs, Wikipedia, Second Life and Beyond: From Production to Produsage. New York: Peter Lang Publishing, Inc.

Casetti, Francesco. 2011. “Back to the Motherland: The Film Theatre in the Postmedia Age.” Screen 52 (1): 1-12.

Elsaesser, Thomas. 2005. “The New Film History as Media Archaeology.” Cinemas 14 (2-3 Spring): 75-117.

Elsaesser, Thomas. 2009. “Archaeologies of Interactivity: Early Cinema, Narrative and Spectatorship.” In Film 1900: Technology, Perception, Culture, edited by Klaus Kreimeier and Annemone Ligensa, 9-22. Indianapolis: Indiana University Press.

Herzogenrath, Bernd. 2012. “Travels in Intermedia[lity]: An Introduction.” In Travels in Intermedia[lity]: ReBlurring the Boundaries, edited by Bernd Herzogenrath: 1-14. Lebanon, NH: Dartmouth College Press.

Huhtamo, Erkki and Jussi Parikka, editors. 2011. Media Archaeology: Approaches, Applications, and Implications. Berkeley: University of California Press.

Jenkins, Henry. 2003. “Transmedia Storytelling.” Technology Review, January 15. Accessed June 28, 2014. http://www.technologyreview.com/news/401760/transmedia-storytelling/.

Jenkins, Henry. 2006. Fans, Bloggers, and Gamers: Exploring Participatory Culture. New York: New York University Press.

Jenkins, Henry. 2007. “Afterword: the future of fandom.” In Fandom: Identities and Communities in a Mediated World, edited by Jonathan Gray, Cornel Sandvoss and C. Lee Harrington, 357-364. New York: New York University Press.

Jenkins, Henry. 2011. “Transmedia 202: Further Reflections.” Confessions of an Aca-Fan, August 1. Accessed April 28, 2014. http://henryjenkins.org/2011/08/defining_transmedia_further_re.html.

Kline, Stephen. 2009. “Ronald’s New Dance: A Case Study of Corporate Rebranding in the Age of Integrated Communication.” In The Advertising Handbook (3rd edition), edited by Helen Powell, Jonathan Hardy, Sarah Hawkin and Iain MacRury, 24-33. London: Routledge.

Krauss, Rosalind. 1999. “A Voyage on the North Sea”: Art in the Age of the Post-Medium Condition. New York: Thames & Hudson.

Liebling, Rick. 2011. “Intermedia—The Next Phase in Consumer Engagement.” How Soon is Now?: Culture in a 24/7 World, September 11. Accessed April 28, 2014. http://www.rickliebling.com/2011/09/11/intermedia-the-next-phase-in-consumer-engagement/.

Ndalianis, Angela. 2012. The Horror Sensorium: Media and the Senses. Jefferson: McFarland Publishing.

Nikunen, Kaarina. 2007. “The Intermedial Practises of Fandom.” Nordicom Review 28 (2): 111-128.

Paech, Joachim. 2011. “The Intermediality of Film.” Acta Univ. Sapientiae, Film and Media Studies 4: 7-21. Accessed May 7, 2014. http://www.acta.sapientia.ro/acta-film/C4/Film4-1.pdf.

Parikka, Jussi. 2012. What is Media Archaeology? Cambridge: Polity Press.

Pethő, Ágnes. 2011. Cinema and Intermediality: The Passion for the In-Between. Newcastle upon Tyne: Cambridge Scholars Publishing.

Toffler, Alvin. 1980. The Third Wave. New York: Bantam Books.

 

Filmography

Ball, Alan. True Blood. 2008-2014. USA: HBO.

Herzog, Werner. 2010. Cave of Forgotten Dreams. USA: Sundance Selects.

Mettler, Sam. Intervention. 2005-2013. USA: A&E Network.

 

Notes

[1] The “Intermediations” Symposium was organised by Catherine Fowler and Paul Ramaeker in conjunction with the Screen Cultures Research Group and the Department of Media, Film and Communication at the University of Otago.

[2] Of these types of transmedia communication, transmedia storytelling and branding appear to have captured scholarly and popular interest above the other significant and no less interesting forms of transmedia identified by Jenkins.

The Comfort and Disquiet of Transmedia Horror in Higurashi: When They Cry (Higurashi no naku koro ni) – Brian Ruh

It has been common in recent years for a Japanese entertainment property to encompass multiple forms of media. In fact, it has become unusual for a media product to not exist in more than one format. There are many different paths that this media progression can take – a manga (comics) series can be adapted into a TV anime (animation) series, a video game can receive a manga spinoff, a television drama can be adapted from a novel, as well as countless other permutations and extensions. In this regard, the case of the Japanese property Higurashi: When They Cry (Higurashi no naku koro ni) is an intriguing one. The media franchise began as a series of visual novels[1], which are computer software produced by the intersection of text, static illustrated characters, and background images. Some visual novels may have a degree of interactivity, in which the user makes choices that determine the outcome, although Higurashi did not. These visual novels were sold at Comiket, a large biannual gathering in Tokyo for fans to buy amateur-produced goods, particularly comics. The popularity of Higurashi led to the development of the story being retold in multiple media – comics, animation, live-action film, and additional computer games. These subsequent media not only took the stories from the original visual novels and adapted them in different formats, but they expanded upon the narratives, sometimes showing different events or different aspects of the characters.

Figure 1: Menu screen of Higurashi: When They Cry visual novel, and the introductory screen to the Onikakushi-hen (‘Abducted by Demons Arc’), 07th Expansion, 2002


Marc Steinberg proposes a specific approach to contemporary media properties in Japan that he calls the “anime media mix” that can help to explain what is occurring within the Higurashi property. Steinberg asserts that the media mix (media mikkusu in Japanese) in general is “the Japanese term for what is known in North America as media convergence.”[2] In the book Convergence Culture, Henry Jenkins discusses this phenomenon at some length. By this term, he means “the flow of content across multiple media platforms, the cooperation between multiple media industries, and the migratory behavior of media audiences who will go almost anywhere in search of the kinds of entertainment experiences they want.”[3] One of the results of media convergence is the growth in “transmedia storytelling,” in which individual (and sometimes self-contained) narratives are communicated in different ways through multiple media that all contribute to an overarching story. According to Jenkins, this is “the art of world making. To fully experience any fictional world, consumers must assume the role of hunters and gatherers, chasing down bits of the story across media channels, comparing notes with each other via online discussion groups, and collaborating to ensure that everyone who invests time and effort will come away with a richer entertainment experience.”[4] In other words, transmedia storytelling is the idea of using multiple media to tell a single cohesive story through various means, be it film, television, comics, online websites, and the like, all of which contribute to the singular “fictional world.” It should be noted that although Jenkins’s examples and the cases like Higurashi both involve a kind of storytelling across various media, there are some key differences. 
The examples that Jenkins describes, which are primarily American and in English, seem to fit what Steinberg would term the “marketing media mix,” which “aims to use the synergetic effect of multiple media in concert to focus the consumer toward a particular goal—the purchase of the advertiser’s product as the final endgame.”[5] In contrast, Steinberg describes the “anime media mix” as having “no single goal or teleological end; the general consumption of any of the media mix’s products will grow the entire enterprise.”[6] Since Higurashi as a media property has multiple points of entry, it has developed into a good example of the anime media mix, although as we will see it did not initially begin that way.

This article analyzes Higurashi as an example of contemporary transmedia horror, paying attention to how its horror elements are explicated across different media. In order to understand this, I begin by explaining in detail how the worlds of Higurashi are structured and the various media in which it participates. From these examples, I demonstrate that the function of the Higurashi media is twofold – through their use of the horror genre, the media both reassure and disturb the viewer. In order to analyze the dual functioning of horror in this manner, I proceed with an investigation of Kunio Yanagita’s early twentieth century ethnographic study Tōno monogatari.[7] Finally, I examine the theories of critics Hiroki Azuma and Eiji Ōtsuka and what they say about Japanese transmedia properties in order to explain how people interact with and consume a series like Higurashi. Through my analysis I will demonstrate that the transmedia horror of Higurashi is effective not only because of the tension between its familiar and unfamiliar elements, placing comforting nostalgia and isolating dread at odds with each other, but also because its multiple media forms allow the consumer to alternately experience the enjoyment of being around the characters and the shocks and gruesomeness of the deadly mysteries at the heart of the series.

The Structure of Higurashi

The story of Higurashi is intentionally complex and intricate, and its structure is worth analyzing in some detail. It was originally released as a series of eight visual novels from 2002 through 2006. Each visual novel was called a hen, or arc, and told part of the events that happened in the rural Japanese town of Hinamizawa in June 1983. There are certain plot elements common to all eight of the arcs. For example, in each one a teenaged boy named Keiichi has recently moved with his family to Hinamizawa and has begun making friends with four girls in his class – Mion, Rena, Satoko, and Rika. There is an annual event in the village called the Watanagashi (or “cotton-drifting”) festival, around which has swirled mystery and whispered rumor. For the past few years, following the Watanagashi festival, one person in the village has been killed and one person has mysteriously disappeared. These events are said to stem from the curse of Oyashiro-sama, the local deity who protects the town. It is said the god is still angry that years ago there was a plan to build a dam in the area, which would have submerged all of Hinamizawa. (It is also said that the villagers are descendants of demons who originally rose up from a local “bottomless” swamp and were subsequently pacified and given human form by Oyashiro-sama.) All of the people who have suffered the curse were involved, either directly or indirectly, with the dam project. In June 1983, the curse strikes again when two people – Takano, a nurse from the local clinic, and Tomitake, a photographer who regularly visits the town – are both mysteriously killed.  While these narrative conditions are set, the arcs of the eight original stories that make up Higurashi take divergent paths.

For example, in the first arc, Onikakushi-hen (or Abducted by Demons Arc), Keiichi is tentatively beginning to become accustomed to village life. He seems to be good at making friends with Mion, Rena, Satoko, and Rika. However, he begins perceiving that his friends and the rest of the town are keeping secrets from him regarding the Watanagashi festival, and his suspicions only increase when he finds a sewing needle in some rice balls his friends have made for him. In the end, driven by paranoia, he bludgeons Rena and Mion to death with a baseball bat in his room. Soon after, Keiichi dies from blood loss after feeling compelled to claw out his own throat.

In the second arc, Watanagashi-hen (or Cotton Drifting Arc), the events set up in the previous arc play out in a different manner. For example, in this arc Keiichi meets Shion, Mion’s twin sister, who goes along with Keiichi, Takano, and Tomitake to sneak into a sealed building containing sacred ceremonial instruments during the Watanagashi festival. (These instruments all happen to be sharp, nasty-looking implements of torture.) However, when Takano and Tomitake end up dead after the festival, Keiichi and Shion are fearful that they will both mysteriously disappear like the others who have run afoul of Oyashiro-sama’s curse. In the end, Mion confesses to being involved in the murders, after Keiichi discovers she has abducted and imprisoned her sister. Shion is rescued by the police, but Mion escapes custody. She later seeks Keiichi out to talk with him, but ends up stabbing him. Although Keiichi survives, he finds out from the police that they had found Mion’s dead body on her family estate before she met with him. That same night, Shion is found dead, having fallen from the balcony of the apartment where she was staying. The story ends with a ghastly Mion clawing her way onto Keiichi’s hospital bed to kill him.

A full account of the remaining Higurashi arcs would be beyond the scope of this article, but they all involve a combination of comforting friendship (the bonds being forged between Keiichi and his classmates) and the horrors of one or more characters eventually killing some of the others in often gruesome ways. Although the arcs seem to reiterate ongoing cycles of paranoia and murder, toward the end of the sixth arc, Tsumihoroboshi-hen (Atonement Arc), Keiichi seems to remember some of what happened in the Abducted by Demons Arc, even though it does not make sense to him and does not reconcile with the fact that he knows he did not kill Rena and Mion in his current world.

It is not until the penultimate arc of the visual novel series, Minagoroshi-hen (Massacre Arc), that the overall structure of Higurashi is presented to the reader in full. We learn that Keiichi’s friend Rika has been repeating her life in Hinamizawa in June 1983 for over a hundred years, remembering everything that happens each time around. There is always some variation to the repetition, and the various arcs that have been presented so far are reflections of how Rika has organized her knowledge. She had been despairing that she no longer had the will to keep repeating the worlds alongside Hanyuu, a young female god who is the actual Oyashiro-sama and whom only Rika can see. However, Keiichi’s ability to see across the worlds in the Atonement Arc bolstered her confidence that she could effect change and end the cycle of repetition. The remainder of the Massacre Arc as well as the final Matsuribayashi-hen (Festival Accompanying Arc) consist of the group of friends trying to figure out how they can all escape the endless loop of June 1983.

The openness of the Higurashi text has allowed for a wide range of adaptations and expansions through multiple media. The original eight visual novel arcs were adapted into manga as well as an anime television series that ran for 50 episodes in 2006-7. These new media also expanded on the original themes of the visual novels by introducing new story arcs along with the adaptation. Additional story arcs were introduced in later visual novels that could be played on systems like the Nintendo DS.[8] The fact that Keiichi and his friends often get together and play competitive games (card games, board games, word games, sports) has enabled further spin-offs that are thematically related to the original Higurashi property, such as Higurashi no naku koro ni jan (a mahjong game)[9] and Higurashi Daybreak (a third-person shooting game),[10] both for the PlayStation Portable. Such properties prominently featured the Higurashi characters while often downplaying the horror elements.

However, it could be argued that the horrific elements of Higurashi stem from the lengthy depictions of Keiichi’s everyday life and the close interactions among his friends juxtaposed with a creeping sense of dread, as well as the brutality of the acts of assault and murder that often happen later in the story arcs. This violence is expressed in different ways across various media. Since Higurashi began as a visual novel, its composition presents an intriguing challenge for the construction and sustainment of horror effects, and the horror genre is not typically associated with the medium. As mentioned previously, visual novels in general communicate their narratives through a combination of onscreen text, background images, manga-style character images, sound effects, and music. There is generally little to no onscreen movement, as well as infrequent choices to direct the course of the story. In Higurashi, however, the user is not presented with the opportunity to branch or deviate from the story. In his analysis of the visual novel, John Wheeler asserts, “The most important function of the algorithm in Higurashi is the lack of freedom it affords the player within the game-space. In this way, Higurashi is nothing like a print or digital novel, which offers the reader freedom to peruse the text and search within it either via an index or by using a digital search function.”[11] Indeed, the only options one is given in terms of interaction are where to save one’s place in the story and the speed at which the text appears onscreen. Unlike a traditional novel, it is not even possible to skip ahead. (I unfortunately encountered the consequences of this when one of my saved files became corrupted. Even though I knew my location in the story, I had to start the visual novel from the beginning.)

In contrast to the limitations of the visual novel, both the manga[12] and the anime adaptations[13] of Higurashi are able to be more expressive due to their greater use of framing and distinct approaches to characters and backgrounds. Although the manga format is generally constituted of black-and-white line drawings on paper (with the occasional color plate), there can be great variation in things like angle and panel composition from page to page. While an anime television series gains elements like color, movement, and sound, it can be constrained by a budget that may limit the number of shots or drawings per second, resulting in a product that may appear flat or static in places. However, each medium of adaptation provides its own unique pleasures. According to Wheeler, “As few of the background story elements and characters change fundamentally from iteration to iteration, part of the appeal of Higurashi as a property becomes the medium-to-medium translation itself, seeing changes in the perspective and style used to essentially tell the same stories.”[14] He goes on to argue that the anime “retains some of the static qualities of the visual novel, and a degree of continuity of visual aesthetic is established across adaptations” yet it is with the manga that the series “gains a true visual depth that reflects both the psychological states of its characters and the striking horror of its storyline.”[15] However, what is most important to realize is that all of the various Higurashi media serve as valid entry points to the series. Although it is not necessary to, say, read the manga after one has watched the anime in order to understand the characters or grasp the series’ mysteries, the fact that the various media emphasize different elements of the series encourages fans to seek out and experience the franchise in multiple forms. 
Unlike Jenkins’s conception of convergence culture, this is not to “fill in the blanks” of missing elements and to make a single storyline more coherent, but rather to experience multiple, yet similar, storylines that occur in subtly separate narrative worlds. It also allows the viewer to spend more time with the characters as well as see how the different media depict the tension and horror of the story. As we will see with Tōno monogatari, the twin effects of comforting and disturbing the viewer are rooted in an approach to Japanese folklore and ethnography.

Figure 2: Shion comes for Keiichi in his hospital bed in the Higurashi manga version, Ryukishi07, 2008.


Transmedia and Japanese Horror – Nostalgia and Technological Advancement

Another key aspect of the horror of Higurashi is cultural, relating to concepts of technological representation and the role that the rural Japanese village plays in conceptions of “Japaneseness.” As alluded to above, many of the plot points in Higurashi rely on the idea of the curse of Oyashiro-sama. At various times throughout the story, different characters believe that they have been cursed by Hinamizawa’s guardian deity. Such curses are far from uncommon in Japanese film and comics. As Jay McRoy states, “the onryou, or ‘avenging spirit’ motif, remains an exceedingly popular and vital component of contemporary Japanese horror cinema.”[16] As McRoy points out in his chapter on contemporary Japanese horror directors Hideo Nakata and Takashi Shimizu, a great deal of current horror is intimately related to structures that are both comforting and confining (such as the family). For example, he identifies Shimizu’s film Ju-on: The Grudge (2002) as both conservative and progressive, saying that while “the film’s articulation of an apparent nostalgia for disappearing ‘traditions’ in the face of an emerging ‘modern’ socio-economic climate resonates with a conservative ideology that borders on the reactionary” it is also true that “the film advances a critique of a Japan still very much steeped in patriarchal conventions.”[17] Higurashi similarly walks the line between conservatism and progressivism. There is an emphasis on traditions, along with a fight to keep things the way they are in the village. For example, the Hinamizawa villagers are loath to have outside investigators looking too deeply into the Watanagashi incidents for fear it may either drive people away or may expose the people in power they think are responsible. Similarly, in one arc Keiichi has to stridently oppose the school and municipal systems in order to try to protect Satoko from her abusive uncle. He is continually told that he is being too much of a nuisance and that he should stop making waves. 
However, the solutions to problems in the Higurashi arcs often emphasize the need to rely on others and the power that comes from group action, emphasizing the power of love and acceptance, sometimes to an almost radical degree. For example, through persistence and hard work, Keiichi is finally able to rally the town to his cause and they are able to help Satoko escape from her uncle. Even though all of the characters of Higurashi have dark histories in one way or another, they are able to stand up for one another and brave seemingly insurmountable odds because they have acceptance and love for each other. Such a scenario emphasizes the potential inherent in the “traditional” rural Japanese village that can occur when everyone is able to strive toward a common good. However, at other points in Higurashi, the power of the village is suspect when Keiichi is trying to solve the mysterious deaths and he perceives himself as an outsider and that everyone is out to get him.

The fact that Higurashi was originally received on a computer screen as a kind of a “game” that required interaction puts it in good company with the themes of many other horror video games. (Although, as mentioned above, its interactivity was rather limited, the experience of the graphics, text and sound is probably closer to a game than it is to a book, comic, or animation.) Although the pairing of the video game medium and the horror genre is not unique to Japan, many such games are Japanese. As Chris Pruett writes, in games the “horror genre is home to a wide range of styles, including first-person games, third-person games, action oriented games, puzzle games, and even text-based games. Whatever the style of play, one fact cannot be ignored: the vast majority of horror video games come from Japan.”[18] Higurashi also shares commonalities of setting and subject matter with other Japanese horror video games. For example, Higurashi’s setting of an isolated Japanese village and the power and persistence of a local religion are similar to the Japanese game Siren, which was released in November 2003, shortly after the release of the first Higurashi visual novels. Pruett locates part of the source of the antagonistic horror of Siren in a tale from Japanese folklore: “The story of Yaobikuni involves a woman who eats the flesh of a mermaid and becomes immortal only to find that everlasting life is full of pain.”[19] However, in the case of Siren, it is the flesh of an alien creature that is eaten, rather than that of a mermaid. Interestingly, in the Atonement Arc of Higurashi, Rena has delusions that the Hinamizawa syndrome is due to an alien invasion, and that Oyashiro-sama is an alien, too. Similarly, Pruett argues, Siren demonstrates a contemporary Japanese discomfort with “cults and splinter religions.”[20] In Higurashi, Oyashiro-sama is, for the most part, discussed as something to be both respected and feared as a matter of precautionary common sense. 
However, characters who want to reinvigorate the widespread popular worship of Oyashiro-sama as a major deity are often depicted as antagonists. In many ways, this coincides with Jolyon Baraka Thomas’s analysis of representations of religions in anime and manga in which they have come “to be popularly associated with violence, brainwashing, and fraud.”[21] As demonstrated through these examples, references to mythology, folklore, and religion often play a strong negative role in Japanese media culture, and this is often the case throughout much of Higurashi.

In addition to religion playing a major role throughout Higurashi, the story makes references that situate the visual novels as specifically Japanese products. For example, in the second arc of the Higurashi visual novel, the group has a curry cooking competition at their school. They all fight their hardest, sometimes even resorting to trickery. In the end, Keiichi’s curry gets knocked over, and he ends up serving the judges rice balls with tea. Keiichi tries to convince the judges that “curry and the rice ball is virtually the same thing [sic].” He goes on to argue that “The Japanese have come up with many different dishes, but they all had one common theme: we are always looking for the best way to eat rice! … Both curry and rice balls are…the fruit of our precious culture!!” Mion then relates the story of a French chef who came to Japan and refused to use imported French ingredients, instead using what he could find locally. She says, “There should be no rules in the culture of food. It’s simply culture. If it comes to Japan, it blends with the Japanese culture and becomes something new. Therefore curry and rice balls are both part of Japanese culture.” Such references highlight Higurashi’s conceptualization as a Japanese product, but the franchise’s incorporation of Japanese folklore provides an even stronger emphasis.

In spite of its modern nature, Higurashi engages with a strain of Japanese folklore of the type seen in Kunio Yanagita’s famous Tōno monogatari (The Legends of Tōno). This literary account of the oral folk tales found in the Japanese city of Tōno, related to Yanagita by local informant and collaborator Kizen Sasaki and published in 1910, is often acclaimed as the starting point for Japanese folklore studies. In it, “Sasaki offers the vision of a typical Japanese villager who grows up in a world fraught with dangers from invisible forces and malevolent creatures shuttling between the human and animal kingdoms.”[22] As Ronald A. Morse points out in his introduction to the English translation, Yanagita’s account begins and ends with depictions of a festival, indicating the centrality of such events to village life.[23] The book details accounts of local gods who get jealous, people who mysteriously disappear without warning, villagers who violently kill other villagers, the behavior and worship of other local deities, and mysterious deaths as well as the return of people from the dead.

In Higurashi, one can see how these folkloric elements have been incorporated into a contemporary horror scenario. The life of the village of Hinamizawa depicted across various media still centers on a festival that celebrates the local guardian deity. Even the people in the village who are not active worshippers of Oyashiro-sama in their daily lives are shown according respect to such beliefs. Additionally, across the many Higurashi arcs, the line between the human world and the supernatural is shown to be regarded as fluid. Even though many of the incidents depicted in Higurashi are later shown to be either delusion or the work of human actors, it is important that the belief persists that such events could occur. This is similar to Yanagita’s work in Tōno monogatari – the tales were related as factual not because the ethnographer necessarily believed they occurred, but because these were the stories that circulated in and around Tōno.

Not only are the stories in Tōno monogatari often seen as foundational for the field of Japanese ethnology, they are closely tied to concepts of the Japanese nation. Anthropologist Marilyn Ivy notes that Tōno monogatari was written “at a time when regional beliefs and practices were being threatened by the comprehensive state ideology of ‘civilization and enlightenment’ (bunmei kaika).”[24] This same period in the early twentieth century saw the building of communication and transportation infrastructure, as well as mass emigration from the countryside to the cities (particularly Tokyo). This increasingly technologized nation created official policies that extolled “‘traditional’ agrarian lifeways all the more effusively the more its policies destroyed those lifeways.”[25] Stories like those in Tōno monogatari were held up as being quintessentially Japanese, even as the irrationality of the stories served as a counterpoint to the government’s emphasis on reason and rationality. Ivy relates Yanagita’s tales to Freud’s ideas of the uncanny, noting that it was no coincidence that they were generated around the same time.[26] According to one translation of Freud’s essay “The Uncanny,” “the nearest semantic equivalents in English” of the German word unheimlich “are ‘uncanny’ and ‘eerie’, but [it] etymologically corresponds to ‘unhomely.’”[27] Therefore, such stories are intimately related to a sense of comfort or home. Similarly, throughout the 1960s and 70s, Tōno and its stories became explicitly associated with the cultural idea of furusato or hometown. (This furusato concept can be applied in a general sense – it does not have to be one’s personal hometown.) Ivy writes, “Precisely because of the eerie character of its tales, Tōno became a particularly haunting and complex example of a generalized ideal.”[28]

In this analysis, we can further see in Higurashi that the horrifying allusions to Tōno monogatari and the sense of belonging Keiichi feels in Hinamizawa as he makes new friends are in fact two sides of the same coin. The depiction of the rural Japanese town as both frightening and welcoming is not accidental. In fact, the two aspects necessarily coexist within contemporary concepts of the Japanese hometown. According to Ivy:

With the idea of Tōno as a furusato, then, there is a fusion of two horizons of desire. First, the desire to encounter the unexpected, the peripheral unknown, even (and even especially) the frightening–a desire that repeatedly reveals itself under the controlled and predictable conditions of everyday life in advanced consumer capitalism (in Japan as elsewhere); and second, a countervailing desire, pushed by an opposite longing, to return to a stable point of origin, to discover an authentically Japanese Japan that is disappearing yet still present, to encounter the always already known as coincident with one’s (Japanese) self. The desire for the different and unknown…is framed within the boundaries of a return to pastoral hominess, security, and (not the least significant) identity.[29]

In Tōno monogatari and its contemporary reception, elements of longing for home, horror, and identity exist in necessary tension with one another. These aspects also may be key elements that contribute to the attractiveness of Higurashi among consumers, as well as its longevity as a media franchise. Since the original visual novel was released in 2002, there has been a fairly steady stream of Higurashi-related media products and spinoffs. As befits Higurashi’s genesis as a product produced by a small team and sold at Comiket, this includes a significant number of amateur comics, many of which, but not all, portray the characters in a sexual manner. This shows that, even though Higurashi is at its core a horror series, users will take the characters and appropriate them to fulfill their own desires.

Transmedia, Horror, and Desire

Due to the multi-arc structure of Higurashi, there are two aspects to the ways that the horror in the franchise is depicted – the narrative and the characters. In terms of the narrative, there are two levels. The first is the arc-level narrative, which encompasses everything that happens within a particular arc in the story. As mentioned previously, there were eight original arcs in the Higurashi visual novel series, but this has since been greatly expanded with additional arcs in anime, manga, and video games. Encompassing all of these arc-level narratives is a second, franchise-level narrative. Although the arc-level narratives have internal consistency, the larger franchise-level narrative cannot and does not reconcile the arc-level narratives. The number of arc-level permutations is near infinite, which means that the characters may undergo any number of horrific ordeals. However, these would not mean much to the viewer if they had not become attached to the characters. The primacy of the Higurashi characters over narrative is particularly noticeable in some of the series’ recent incarnations. A four-episode direct-to-video anime series released in 2011-12 called Higurashi no naku koro ni Kira (dir. Hideki Tachibana) shifts the overall tone from horror to what might be called “erotic slapstick.” For example, the first episode is called Batsukoishi-hen (Penalty Love Arc) and is adapted from the epilogue of one of the original visual novels. It consists mainly of Keiichi and some of the other male characters fantasizing about the female characters dressed up in a variety of fetishized outfits. It has little to do with the plot of many of the other narrative arcs, but allows the viewer to spend more time with the characters and fantasize along with Keiichi.
In this way, Higurashi points to the tension between two approaches to contemporary Japanese media properties – the theory of “narrative consumption” as put forth by Eiji Ōtsuka and the theory of “database consumption” put forth by Hiroki Azuma.

In his 1989 book A Theory of Narrative Consumption (Monogatari shōhiron), Ōtsuka analyzes how viewers interact with media properties. He asserts that such media succeed by “setting up their grand narrative or order in the background in advance and by tying the sales of concrete things to consumers’ awareness of this grand narrative.”[30] This grand narrative lies at the heart of a particular worldview, but is not something that can be directly sold and marketed itself. Therefore, “consumers are tricked into consuming a single cross-section of the system in the form of one episode of the drama, or a single fragment of the system in the form of a thing.”[31] In other words, what is ultimately promised as the pinnacle of consumption in this media system – the grand narrative – can never be obtained by consumers. They can, however, access and purchase slivers of the narrative. In the case of Higurashi, Ōtsuka’s concept of the grand narrative is the overarching franchise-level narrative. However, in order to be able to access pieces of this narrative, consumers have to purchase a game, read a manga, or watch an anime episode. It must be said that the grand narrative in Higurashi is more fragmented than most of those Ōtsuka has in mind because it is not possible to reconcile all of the individual narrative arcs, which are permutations of possible worlds. This makes the grand narrative of Higurashi even more distant and difficult to access – not only are the fragments that the consumer can obtain pieces of a larger story, each larger story in Higurashi is an arc in an even bigger overarching narrative.

In his book Otaku: Japan’s Database Animals (Dōbutsuka suru posutomodan: otaku kara mita Nihon shakai) originally published in 2001, theorist Hiroki Azuma says that with the advent of postmodernity (a term he uses to “refer broadly to cultural conditions since the 1970s”[32]), Ōtsuka’s modern model of media consumption collapsed. Instead of a “tree” model, in which texts are derived from a deeper source with meaning, Azuma proposed a “database” model that solely works at the level of surface and does not point to a deeper meaning. According to Azuma, “As a result [of this shift], instead of narratives creating characters, it has become a general strategy to create character settings first, followed by works and projects, including the stories. Given this situation, the attractiveness of characters is more important than the degree of perfection of individual works.”[33] In such a model, “individual projects are the simulacra and behind them is the database of characters and settings.”[34] We can see that without Azuma’s theory of database consumption, some of the adaptations of Higurashi would not necessarily make any sense. For example, the Penalty Love Arc does not serve to advance the narrative of Higurashi in any way. The viewer does not discover anything new about the world or the characters. In the narrative consumption model, it is rather superfluous. However, in the database consumption model it makes perfect sense. Dedicated viewers have presumably spent many hours before the Penalty Love Arc watching and thinking about the characters, and perhaps fantasizing about them. Rather than presenting a part of a larger narrative world to consume, such texts present familiar and easily consumable characters.

Although Azuma presents his database consumption theory as a historical successor to Ōtsuka’s narrative consumption theory, it seems to fall prey to the assumption that the two models are in binary opposition. It seems more likely that, even in postmodernity, the two models can coexist. Higurashi is an excellent example of these two ways of theorizing media texts working simultaneously. There is certainly a narrative model at work in Higurashi, as the main emphasis of the original visual novel arcs is to try to figure out a way out of the curse of the repeating years and the gruesome deaths of the characters. The drive to solve this overarching mystery is at the heart of the consumption of Higurashi products. However, plenty of time is also spent with the characters as they interact with each other and help one another out with their problems. This then simultaneously emphasizes the characters, laying the groundwork for additional Higurashi products and adaptations that are divorced from the horror roots of the original visual novels.

Conclusion

Higurashi evolved from a small series of amateur-produced visual novels into a multimedia franchise in just a few years. As we have seen, there are a number of elements that may have contributed to this rapid growth. Structurally, Higurashi uses the horror genre to constantly create a degree of threat to the characters the viewer is growing increasingly familiar with and attached to. By evoking the milieu of a rural Japanese village, Higurashi uses folklore to create a space that is both exciting and comfortable, unsettling yet familiar. Additionally, its multi-arc structure allows for near-unlimited narrative expansion, providing countless opportunities for fandom and consumptive practices. Within such expansive narrative spaces, though, there are definite constraints. Although some arcs in Higurashi take place before or after the events of June 1983, it is really only in that particular time period that all of the main characters are in the same place. This means that the majority of the narratives, both official and fan-created, will take place in this narrow strip of time, creating a kind of utopian space within the overall horror of the tragic events that the story is built around.

Existing in such paradoxical utopian spaces is not necessarily unique to the Higurashi franchise. In her analysis of regional representations in the background art of Japanese games and anime, Kumiko Saito writes, “With the rapid introduction of digital technology to animation and game productions, the visibility of regional representation quickly grew with the success of anime / game works that feature background art by background art specialists.”[35] The emphasis on pastoral settings in so many games and anime “suggests an imagined locus of ‘middle ground,’ between urban and rural, or present and past” which “presents strong nostalgia toward suburban or rural everyday life, often presupposing the viewer’s non-diegetic knowledge that this happiness of mediocrity is ending soon.”[36] According to Saito, this is often associated with how such narratives play with concepts of temporality, including time travel, amnesia, and the ability to stop or delay time. Although Saito does not mention Higurashi specifically, it is clear that the franchise participates in these larger trends.

Even though Higurashi has its horrors, it still reliably provides the viewer with a comfortable space to which they can return and reunite with their favorite characters. As Saito asserts, “With multiple endings already tailored for repetitive gameplay, games and their anime adaptations, especially, invite the player to stay in the time loop between the beginning and the end, or between amnesia and recollection.”[37] Such contemporary media properties provide a way of remaining in a rarified space that exists outside of larger economic or geopolitical concerns. In the case of Higurashi, the perpetual June 1983 takes place before the bubble economy of the late 1980s, but still at a time of optimistic economic prosperity. However, as Saito argues, continual engagement with such texts and franchises can have a negative impact on the perception of history: “The regionalist narrative in popular visual media helps reestablish national pride in Japanese particularity, but only within the safe range of the personal and emotional without recovering the memory of Showa’s war and postwar periods or the nation’s geo-ethnic varieties. The inaccessible nature of background art as beautiful tableaus of Japan[‘s] paradoxical nature securely freezes the image of Japan.”[38] Perhaps it is this refusal to accept history and adapt, and a subsequent preference for continual states of play and the consumption of counterfactual worlds, that is the real horror.

Works Cited

Azuma, Hiroki. Otaku: Japan’s Database Animals. Translated by Jonathan E. Abel and Shion Kono. Minneapolis: University of Minnesota Press, 2009.

Dorson, Richard M. Foreword to the 1975 edition of The Legends of Tōno by Kunio Yanagita, xv-xix. Translated by Ronald A. Morse. Lanham, MD: Lexington Books, 2008.

Freud, Sigmund. The Uncanny. Translated by David McLintock. New York: Penguin Books, 2003.

Hills, Matt. Fan Cultures. New York: Routledge, 2002.

Ivy, Marilyn. Discourses of the Vanishing: Modernity, Phantasm, Japan. Chicago: The University of Chicago Press, 1995.

Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New York: New York University Press, 2006.

McRoy, Jay. Nightmare Japan: Contemporary Japanese Horror Cinema. Amsterdam: Editions Rodopi, 2008.

Ōtsuka, Eiji. “World and Variation: The Reproduction and Consumption of Narrative.” Translated by Marc Steinberg. Mechademia 5 (2010): 99-116.

Pruett, Chris. “The Anthropology of Fear: Learning About Japan Through Horror Games.” Loading 4, no. 6 (2010): http://journals.sfu.ca/loading/index.php/loading/article/view/90/87.

Ryukishi07. Higurashi no naku koro ni: Onikakushi-hen. Tokyo: 07th Expansion, 2002.

Ryukishi07. Higurashi no naku koro ni: Watanagashi-hen. Tokyo: 07th Expansion, 2002.

Ryukishi07. Higurashi no naku koro ni: Tatarigoroshi-hen. Tokyo: 07th Expansion, 2003.

Ryukishi07. Higurashi no naku koro ni: Himatsubushi-hen. Tokyo: 07th Expansion, 2004.

Ryukishi07. Higurashi no naku koro ni Kai: Meakashi-hen. Tokyo: 07th Expansion, 2004.

Ryukishi07. Higurashi no naku koro ni Kai: Tsumihoroboshi-hen. Tokyo: 07th Expansion, 2005.

Ryukishi07. Higurashi no naku koro ni Kai: Minagoroshi-hen. Tokyo: 07th Expansion, 2005.

Ryukishi07. Higurashi no naku koro ni Kai: Matsuribayashi-hen. Tokyo: 07th Expansion, 2006.

Ryukishi07. Higurashi no naku koro ni Kizuna: Dai Ichi Kan Tatari. Tokyo: Alchemist, 2008.

Ryukishi07. Higurashi no naku koro ni Kizuna: Dai Ni Kan Sō. Tokyo: Alchemist, 2008.

Ryukishi07. Higurashi no naku koro ni Kizuna: Dai San Kan Rasen. Tokyo: Alchemist, 2009.

Ryukishi07. Higurashi no naku koro ni Kizuna: Dai Yon Kan Kizuna. Tokyo: Alchemist, 2010.

Ryukishi07. Higurashi: When They Cry: Abducted by Demons Arc, Vol 1. New York: Yen Press, 2008.

Saito, Kumiko. “Regionalism in the Era of Neo-Nationalism: Japanese Landscape in the Background Art of Games and Anime from the Late-1990s to the Present.” In Asian Popular Culture: New, Hybrid, and Alternate Media, edited by John A. Lent and Lorna Fitzsimmons, 35-58. Lanham, MD: Lexington Books, 2013.

Sims, Higurashi no naku koro ni Jan. Tokyo: AQ Interactive, 2009.

Steinberg, Marc. Anime’s Media Mix: Franchising Toys and Characters in Japan. Minneapolis: University of Minnesota Press, 2012.

Thomas, Jolyon Baraka. Drawing on Tradition: Manga, Anime, and Religion in Contemporary Japan. Honolulu: University of Hawai’i Press, 2012.

Twilight Frontier. Higurashi Daybreak. Tokyo: Alchemist, 2008.

Wheeler, John. “The Higurashi Code: Algorithm and Adaptation in the Otaku Industry and Beyond.” Cinephile: The University of British Columbia’s Film Journal 7, no. 1 (2011): 25-29.

When They Cry (Higurashi no Naku Koro ni). Directed by Chiaki Kon. 2006. Long Beach, CA: Geneon Entertainment, 2007. DVD.

Yanagita, Kunio. The Legends of Tōno. Translated by Ronald A. Morse. Lanham, MD: Lexington Books, 2008.



[1] Ryukishi07, Higurashi no Naku Koro ni: Onikakushi-hen (Tokyo: 07th Expansion, 2002); Ryukishi07, Higurashi no Naku Koro ni: Watanagashi-hen (Tokyo: 07th Expansion, 2002); Ryukishi07, Higurashi no Naku Koro ni: Tatarigoroshi-hen (Tokyo: 07th Expansion, 2003); Ryukishi07, Higurashi no Naku Koro ni: Himatsubushi-hen (Tokyo: 07th Expansion, 2004); Ryukishi07, Higurashi no Naku Koro ni Kai: Meakashi-hen (Tokyo: 07th Expansion, 2004); Ryukishi07, Higurashi no Naku Koro ni Kai: Tsumihoroboshi-hen (Tokyo: 07th Expansion, 2005); Ryukishi07, Higurashi no Naku Koro ni Kai: Minagoroshi-hen (Tokyo: 07th Expansion, 2005); and Ryukishi07, Higurashi no Naku Koro ni Kai: Matsuribayashi-hen (Tokyo: 07th Expansion, 2006).

[2] Marc Steinberg, Anime’s Media Mix: Franchising Toys and Characters in Japan (Minneapolis: University of Minnesota Press, 2012), 135.

[3] Henry Jenkins, Convergence Culture: Where Old and New Media Collide (New York: New York University Press, 2006), 2.

[4] Jenkins, Convergence Culture, 21.

[5] Steinberg, Anime’s Media Mix, 141.

[6] Ibid.

[7] Kunio Yanagita, The Legends of Tōno (Lanham, MD: Lexington Books, 2008).

[8] Such as the Kizuna series of visual novels for the DS, each of which included a new arc. Ryukishi07, Higurashi no Naku Koro ni Kizuna: Dai Ichi Kan “Tatari” (Tokyo: Alchemist, 2008); Ryukishi07, Higurashi no Naku Koro ni Kizuna: Dai Ni Kan “Sō” (Tokyo: Alchemist, 2008); Ryukishi07, Higurashi no Naku Koro ni Kizuna: Dai San Kan “Rasen” (Tokyo: Alchemist, 2009); and Ryukishi07, Higurashi no Naku Koro ni Kizuna: Dai Yon Kan “Kizuna” (Tokyo: Alchemist, 2010).

[9] Sims, Higurashi no Naku Koro ni Jan (Tokyo: AQ Interactive, 2009).

[10] Twilight Frontier, Higurashi Daybreak (Tokyo: Alchemist, 2008).

[11] John Wheeler, “The Higurashi Code: Algorithm and Adaptation in the Otaku Industry and Beyond,” Cinephile: The University of British Columbia’s Film Journal 7, no. 1 (2011): 27.

[12] Beginning with Ryukishi07, Higurashi: When They Cry: Abducted by Demons Arc, Vol 1 (New York: Yen Press, 2008).

[13] Beginning with When They Cry (Higurashi no Naku Koro ni), directed by Chiaki Kon (2006; Long Beach, CA: Geneon Entertainment, 2007), DVD.

[14] Wheeler, “The Higurashi Code,” 28.

[15] Wheeler, “The Higurashi Code,” 29.

[16] Jay McRoy, Nightmare Japan: Contemporary Japanese Horror Cinema (Amsterdam: Editions Rodopi, 2008): 75.

[17] McRoy, Nightmare Japan, 96.

[18] Chris Pruett, “The Anthropology of Fear: Learning About Japan Through Horror Games,” Loading… 4, no. 6 (2010): 2.

[19] Pruett, “The Anthropology of Fear,” 8.

[20] Pruett, “The Anthropology of Fear,” 9.

[21] Jolyon Baraka Thomas, Drawing on Tradition: Manga, Anime, and Religion in Contemporary Japan (Honolulu: University of Hawai’i Press, 2012): 125.

[22] Richard M. Dorson, Foreword to the 1975 edition of The Legends of Tōno by Kunio Yanagita (Lanham, MD: Lexington Books, 2008): xviii.

[23] Kunio Yanagita, The Legends of Tōno (Lanham, MD: Lexington Books, 2008).

[24] Marilyn Ivy, Discourses of the Vanishing: Modernity, Phantasm, Japan (Chicago: The University of Chicago Press, 1995): 70.

[25] Ivy, Discourses of the Vanishing, 71.

[26] Ivy, Discourses of the Vanishing, 85.

[27] Sigmund Freud, The Uncanny (New York: Penguin Books, 2003): 124.

[28] Ivy, Discourses of the Vanishing, 105.

[29] Ibid.

[30] Eiji Ōtsuka, “World and Variation: The Reproduction and Consumption of Narrative,” Mechademia 5 (2010): 107.

[31] Ōtsuka, “World and Variation,” 109.

[32] Hiroki Azuma, Otaku: Japan’s Database Animals (Minneapolis: University of Minnesota Press, 2009): 16.

[33] Azuma, Otaku, 48.

[34] Azuma, Otaku, 53.

[35] Kumiko Saito, “Regionalism in the Era of Neo-Nationalism: Japanese Landscape in the Background Art of Games and Anime from the Late-1990s to the Present,” in Asian Popular Culture: New, Hybrid, and Alternate Media, ed. John A. Lent and Lorna Fitzsimmons (Lanham, MD: Lexington Books, 2013): 40.

[36] Saito, “Regionalism in the Era of Neo-Nationalism,” 41.

[37] Saito, “Regionalism in the Era of Neo-Nationalism,” 49.

[38] Saito, “Regionalism in the Era of Neo-Nationalism,” 48.

Bio: Brian Ruh earned his PhD in Communication and Culture from Indiana University in 2012 with his dissertation “Adapting Anime: Transnational Media between Japan and the United States.” He has contributed articles and chapters to journals such as Mechademia: An Annual Forum for Anime, Manga and the Fan Arts and Intensities: The Journal of Cult Media as well as books like Cinema Anime: Critical Engagements with Japanese Animation, East Asian Cinemas: Exploring Transnational Connections on Film, and The Japanification of Children’s Popular Culture: From Godzilla to Spirited Away. A second edition of his first book, Stray Dog of Anime: The Films of Mamoru Oshii, will be published by Palgrave Macmillan in Spring 2014.

The Power Girls Before Girl Power: 1980s Toy-Based Girl Cartoons – Katia Perea

Abstract: The socio/cultural history and partnership of toy advertisement and children’s television is rich and well documented (Schneider 1989, Kunkel 1988, Seiter 1993). In this article I discuss the influence of policy in girl’s cartoon programming as well as the relationship between commercialization and financial motivation in creating a girl cartoon media product. I then discuss the formulaic, gender normative parameters this new genre set in place to identify girl cartoons as well as girl media consumption and how within those parameters girl cartoon characters were able to represent an empowered girl popular culture product a decade before the nomenclature Girl Power. This research considers the socio-historical framework of programming in the 1980s toy-based cartoon era to assess how cartoons playfully promote a counter-hegemonic force on television’s socially compulsive gender coding. This research textually analyzed several episodes of Rainbow Brite, My Little Pony, Care Bears, Strawberry Shortcake and television girl cartoons from 1981-1988, to initiate a thematic coding scheme documenting what is occurring both verbally and visually regarding gender display and gender dynamics between characters. The coding was analyzed to identify systems of gender behavior that are both intentionally overt and naturally transgressive, traditional feminine traits and subtle, counter-normative characteristics. This includes, but is not limited to, clothing, behaviors, accessories, jokes, images, songs, background design, friendship dynamics and dialogue reproduced verbatim.

My Little Pony.

Introduction

Riot Grrrl[2] subculture and third wave feminism[3] are credited as the cultural predecessors of the 1990s Girl Power popular culture (Taft 2004), minus the political consciousness or DIY consumer sensibilities; however, its commercialized predecessor, the 1980s toy-based girl cartoons, is what established both the discourse on girl media culture and a popular culture genre that associated consumerism with girl empowerment. The intended viewers of these 1980s girl cartoons grew up to be the teenagers and young adult women of the 1990s. The main distinction between these different types of Girl Power consumption is that the adventures of Rainbow Brite or the Little Ponies were inspiring young girl viewers to be empowered without sexualizing them.

Unlike the often overly sexualized portrayal of the adult female body in many cartoons, such as the buxom, corseted Wonder Woman, the curvaceous, mini-skirted She-Ra, or the boyfriend-invested Daphne,[4] 1980s toy-based girl cartoons had pre-pubescent girl characters who were all under the age of twelve. These girl cartoon lead characters were not tweens, pre-teens or teenagers, a distinction within the definition of “girl” that had been under-explored in feminist media literature until the nomenclature of “girls studies” in the 1990s. This research found twelve to be the magic age at which media gives girl characters boobs and boyfriends.[5] The under-twelve cartoon girl bodies of the 1980s were portrayed without any overt sexualization such as breasts, curves, sexually suggestive clothing or heteronormative romantic interest; the girl and boy characters are friends.[6]

The 1990s Girl Power popular culture was heavily defined by its marketability; the things you consumed defined your girl power. Its empowerment consumption was encased as depoliticized, individually expressed and purchasable (Taft 2004, Weeks 2004, Gonick 2006). Girl Power of the 1990s did not need girls to identify global sexism; it asked girls to be confident, pretty and sexy. Its media representations were mostly young women, acceptably spanning from teenagers into elder adulthood. It seemed not to matter how old you were, but it did seem to matter how young you were. The 1990s Girl Power’s representation was not for little girls; it was for post-pubescent girls and women: basically, girls with spending power and girls that can be sexualized, in other words, girls that were women.

The 1980s girl cartoons were also defined by the marketability of the things girls consumed: the toys. Girls played with toys based on communicative and adventurous cartoons where they were leaders; it had nothing to do with being pretty for the boys. The 1980s toy-based cartoons created a realization, albeit a commodified one, that girls were a valuable target audience. While confidence and pretty things did abound in cartoons like Rainbow Brite and My Little Pony, the portrayal of strength was attributed to the cooperation within the group; friendship was the strength and its empowerment was in the girl. There were no sexy things.

These distinctions are key to describing the creation of girl power discourse within the mass consumed media product. These cartoon characters’ leadership, confidence, determination and savvy were delivered back a decade later as 1990s Girl Power in what Stuart Hall identifies as cultural ventriloquism (Hall 1981), where a subculture’s empowerment is absorbed by the culture industry, its dissidence removed, and delivered back, often to the group that originally created it. The constructed boundaries on girls’ empowerment in the 1990s Girl Power popular culture discourse are presented in the form of sexualized bodies and heteronormative concerns, characteristics not present in the 1980s television girl cartoons or their toys.

Little Lulu comic book.

Little Lulu – The First Girl Power Cartoon

Marjorie Henderson Buell, the first US woman cartoonist to achieve international fame, created Little Lulu as a single panel newspaper comic in 1935 for The Saturday Evening Post. With two previously successful syndicated strips under her belt, Buell, who used the pen name Marge, was asked by the Post to create a successor to Henry, a Post cartoon strip about a little boy that had gone to national syndication. The Post was uncertain a girl character could be successful. When asked about creating Little Lulu, Buell explained to a reporter, “I wanted a girl because a girl could get away with more fresh stunts that in a small boy would seem boorish” (Jacob 2006). Little Lulu became an instant success and the comic was soon made into a cartoon by Paramount.

While there were many lead cartoon boy characters in the Golden Era of theatrical cartoons, the first and only girl cartoon was Little Lulu 1943-1948 (Lenburg 2009). The cartoons were created for cinematic showings by Paramount’s animation production house and began syndicated television broadcast in the early 1950s (Woolery 1983, Erickson 2005). A master of deadpan delivery, Lulu displayed a willful resilience in the face of adversity. She was undaunted and unafraid, mischievous yet well-intentioned, and she was wildly successful.

Little Lulu toys.

Due to the character’s overwhelming popularity, Buell found herself presiding over a Little Lulu merchandising empire, including product endorsements, proving that Lulu was not just for girls.

Lulu was a hit. In 1944, she began a fifteen-year run as the star of advertisements for Kleenex tissues. By 1950, [creator] Margaret Buell was presiding over a merchandising empire that included Little Lulu dolls, lunch boxes, magic slates, coin purses, bubble bath, pajamas, and candy (Jacob 2006:1).

When her film contract license was up in 1948, Paramount studios tried to use the character’s theatrical publicity as leverage to cut Buell’s profits and claim part ownership of the character in exchange for the cartoon’s continued production; Buell refused to sell out her creation (Evanier 2007). Due to this licensing disagreement, Paramount stopped producing Little Lulu and in the 1950s sold the existing cartoons as syndicated children’s television programming (Erickson 2005). They aired sporadically in that decade and then left television.

Misogynistic Boys

A theme that runs through Little Lulu is the boy vs. girl rivalry that occurs with the secondary character Tubby, a neighborhood friend who often puts the sign “No Girls Allowed” on his clubhouse door, locking Lulu out of the boys’ activity inside. Tubby berates Lulu as a girl and revels in the superiority of his boyness; that is of course, until Lulu repeatedly outsmarts him and makes him appear foolish, disproving his supposed gender superiority.

I found that this gender-based rivalry ran through girl cartoons in later eras as well, where a boy character reacts in disgust to representations of the feminine or uses diminutive gender-based comments against the lead girl, referring to the girl as weak or frivolous. I refer to these misogynistic boys as an anti-feminine foil. Perhaps this anti-feminine foil cartoon character corresponds to Adorno’s similar reflections on Disney’s popular cartoon character Donald Duck, whose slapstick violence and mishaps were viewed by Adorno as examples of mass man’s willingness to accept the inequalities of capitalism. He writes, “Donald Duck, like the unfortunate in real life, gets a thrashing so that the viewer can get used to the same treatment” (Adorno and Horkheimer 1997:138). In girl cartoons, the anti-feminine foil is a girl’s reminder of the sexism that she faces in daily life, and also a reminder of how she can outsmart it.

Truly deserving the title of girl power, Lulu, in several cartoons, is tricked by scoundrel men who ply her with false promises: a golfer who hires her as his caddy offers her lollipops in “Cad and Caddy”, and a photographer offers to take her photograph once she pays twenty-five cents in “Snap Happy”. She rectifies the matter with a fecund imagination full of cartoon scenarios worthy of any avant-garde expressionist as she proceeds to torment the men in simple pursuit of said promises. “Will you take my picture now mister?” she exclaims, posing in front of all his shots until he fulfills his promise. “Where’s my lollipop?” precedes a series of cunning pranks preventing the golfer’s ball from reaching the hole. Throughout these scenes, though she is intentionally upsetting these men, delivering her punishments with deadpan authority, her acts of mischief are depicted more as innovative creativity than rebellion.

Much as mass culture is considered a mass manipulator intent on indoctrinating the masses into subservience to the system of consumer capitalism (Adorno 1972, Clark 1990), girls are generally presented as fragile and innocent, willing usurpers of dominant cultural works (Walkerdine 1997, Fritzsche 2004). Cartoon character Little Lulu is a direct challenge to these socially constructed gender norms. As stated in key audience studies, media consumption cannot be seen as an isolated process of encoding, but should be examined as a phenomenon embedded in daily life (Ang 1996, Morley 2000). The traditional feminist critique of girl cartoons is that girl characters are represented as dependent on boy characters or portrayed in hyper-feminized settings (Albiniak 2001, Thompson and Zerbinos 1997, 1995, Signorielli 1993, 1990). Because of her “fresh stunts”, Little Lulu demonstrates that a girl cartoon, as a popular culture media product, can indeed subvert normative gender codes: a girl in power, sans sexualization.

However, Little Lulu’s empowered presentation was not an emphatic statement of girl power, in fact it wasn’t a statement at all. Buell’s son reported in an interview:

“[My mother] didn’t think of Lulu as a part of politics. She drew a line between entertainment and didacticism.” Nor did Marge welcome the idea of introducing feminist themes into the cartoon. She preferred to let the character’s actions speak for themselves. “She created this feisty little girl character who held her own against the guys and frequently outwitted them, but she didn’t want to turn the cartoon into a message. She agreed with Samuel Goldwyn’s slogan, ‘If you want to send a message, try Western Union’” (Gewertz 2006:2).

Lulu’s mischief involves a sense of self-confidence and wit. This self-motivated mischief is generally regarded as a boy characteristic, as in “boys will be boys.” However, Lulu is not a boy; she is very much a girl, willful and confident, a good role model for girl cartoons and, Buell thought, for young girls (Gewertz 2006). The Little Lulu cartoon was playfully transgressing the normative codes created to define little girls. It would be almost thirty years before another girl was presented as a lead character in a television cartoon.

Toys, Cartoons and the FCC

In 1969, under the FCC guidelines of network self-regulation, the ABC network broadcast Hot Wheels 1969-1971, a cartoon program named after a Mattel brand of toy cars (Owen 1988). This was the first example of product-based cartoon programming, a show developed around a preexisting line of children’s toys. Public concerns were promptly raised to the FCC against Mattel’s “half-hour commercial” Hot Wheels and the overcommercialization in children’s television. The concerns fell on deaf ears.

Mattel’s Hot Wheels cars.

While activist groups like the parent-run Action for Children’s Television were trivialized, the FCC did respond to a financial claim made by a rival toy manufacturer, who argued that Mattel’s Hot Wheels show should be recognized as an advertisement, not programming, and be financially coded by the network as such. Motivated by the competitor’s claims, the FCC mandated the ABC network to code the Hot Wheels program as advertising time for Mattel, far more expensive airtime than regular programming. With its airtime rendered too costly, the show was no longer profitable for Mattel and was quickly cancelled (Owen 1988, Mittell 2003). Promoting industry self-regulation, the FCC issued a vague warning advising networks against further product-based cartoon programming (Schneider 1989).

The Hot Wheels television show (1969-71).

Feeling that the broadcasters were failing to comply with self-regulation, Action for Children’s Television continued to petition Congress and eventually got the FCC to issue a Report and Policy Statement in 1974 suggesting that broadcasters have a special obligation to serve children (Kunkel 1998). As a result, the amount of advertisement time allowed during children’s programming was limited, slashing programming budgets and, with them, new cartoon programming (Lisosky 2001).

The 1970s were a transitional period for children’s television cartoons, and much more so for girl cartoon characters. Though the socio/cultural era was ripe for cartoon programming to move away from recycled theatrical cartoons and produce new stylistic cartoons specifically for television, budget constraints restricted the development of original ideas or new animation techniques; there would be no girl cartoons during this era.

Girl cartoons would have been a risk for the networks, compounded by their fear that any new cartoon, particularly a girl cartoon, might not be commercially successful with the viewers. Cartoon producers and networks played it safe by imitating past successes, cartoons where the girl characters were secondary to the boy leads; the networks did not experiment with the new concept of a lead girl character. This aspect of self-censorship, in the form of playing it safe by using boy characters as the default setting, is used to support the claims that television is a hegemonic replicator because it is producing mediocre programming so as to please the majority (Bourdieu 1998, Fiske 1987). The cartoon industry’s practice of using boy characters as the default setting was its way of playing it safe.

Television animation producer Herb Klynn (Alvin and the Chipmunks) lamented the networks’ reluctance towards testing new concepts: “We can create so much through animation, but try to show the networks! Most people I bring ideas to have no creative insight at all” (Erickson 2005). Linda Simensky, director of children’s programming and various animation media, commented on the nature of cartoon programming production, “risk taking, scary as it is, is crucial to the advancement of the animated medium on television. The more risks you take, the more often you will end up with unusable material. But there is also a greater chance for success” (Simensky 2004:101). Where Klynn’s and Simensky’s laments were in reaction to the networks’ resistance towards general animation innovation, a more direct blockade was set against the development of girl characters.

Producer Cy Schneider was considered an authority on children’s television after his financial success with producing Mattel’s Hot Wheels programming. His positions on gender and racial diversity in children’s television were representative of the pervasive sentiment in the male-dominated industry. In his book on children’s television, he writes about programming selection with an argument that demonstrates both a racial and gender bias:

The temptation is always to show the latest in styles, music, and dancing. Inexperienced young creative people…often forget that rapping and break dancing might go over well in Los Angeles and New York, but in Iowa the freckle-faced kids are still down at the soda fountain getting a sundae or out playing Little League baseball (Schneider 1989:108).

More overtly in regards to gender, he asserts:

Don’t show an eight year-old boy playing with an eight year-old girl. For boys, that’s an unreal situation. Girls will emulate boys, but boys will not emulate girls. When in doubt, use boys (Schneider 1989:107).

In cartoon programming, and children’s television in general, the industry’s standard belief was that girls would watch boys’ shows but boys would not watch girls’ shows; it therefore invested exclusively in the programming of boy-dominated cartoons (Seiter and Mayer 2004).

In the interest of obtaining advertising sponsors, the industry created a gender-biased belief about children’s viewing habits. Arguments that boys watched television programming more than girls did not take into account that there were no programs for the girls to watch because boy characters were always ensured the lead role. Girls watched boys’ cartoons because that was all that was available (Seiter 1993).

Media scholar Ien Ang has argued against the notion of a pre-constituted audience body that can be defined or measured, partly because it does not take into account how the viewer interprets programming. According to Ang, the audience is “an abstraction constructed from the vantage point of the institutions, in the interest of the institutions” (Ang 1991:2). Boy cartoon programming was designated for children’s programming specifically because its airtime was believed to be profitable for advertising children’s products, resulting in the creation of a market by and for the interests of the market itself. Advertisers concentrated their dollars on boy-centered cartoon programming because that was what existed.

1980s Reagan Era FCC Deregulation

‘If you can’t self-regulate, then de-regulate’ could have been the catch phrase of the pro-business Reagan-era FCC chairman Mark Fowler, who ushered in a laissez-faire climate towards policy enforcement. He stated that television was a “toaster with pictures” (Engelhart 1986:76), an entertainment business with no obligation towards public service. Television broadcasters were deregulated and allowed to rely on the marketplace to decide which children’s shows would be aired. Opponents argued that the deregulation that occurred in the 1980s violated key parts of the Communications Act of 1934, especially the requirement to operate in the public interest, and allowed broadcasters to seek profits with little public service programming required in return. The main deregulations critiqued were the elimination of the Fairness Doctrine, the extension of television license terms (the number of years for which a license is granted), and the expansion of the number of television stations any single entity could own (Hendershot 1998). The concentration of media ownership nationwide went from 50 owners in 1984 to 26 major owners in 1987[7] (Bagdikian 2004). Two specific deregulatory initiatives affecting children’s television emerged: abolishing guidelines for minimal amounts of educational programming on networks, and dropping FCC license guidelines for how much advertising could be carried during children’s programming (Hendershot 1998).

The lack of educational programming on commercial networks in the early 1980s was defended by the FCC on the basis that public television was sufficient to serve children’s educational television needs. Public television had been a primary provider of children’s educational programming since the late 1960s, and the FCC sought a way to codify public television broadcasting as a supplement to commercial television, thus relieving commercial broadcasters of their responsibility to serve the educational needs of their young audience through commercial educational programming (Lisosky 2001).

The Reagan-era FCC’s emphasis on commercialization let networks determine the amount of advertisement presented during programming. This opened up the airwaves to the rebirth of the product-based cartoon, which had been off the air since Mattel’s Hot Wheels in the early 1970s. Deregulation ushered in a new era in children’s programming, the toy-based genre, and with it the introduction of girl cartoons.

Toy-Based Cartoons, A New Era for Girls’ Media

In 1977, Bernard Loomis, president of toy manufacturer Kenner, signed a licensing contract with Twentieth-Century Fox to produce the toy line for its upcoming movie Star Wars (Owen 1988, Hendershot 1998); Kenner had unknowingly landed the number one selling toys for 1978 and years to come. Hoping lightning would strike twice, Loomis began looking for a toy line Kenner could own from inception, not merely as licensing contractors. Loomis also wanted Kenner to focus on creating an entire line of toys rather than individual products. He soon found his next star: created by artist Muriel Fahrion, an illustrator in American Greeting Cards’ juvenile department, a little girl character named Strawberry Shortcake would soon air in her own syndicated television special,[8] The World of Strawberry Shortcake 1980 (Woolery 1983, Lenburg 2009).

The World of Strawberry Shortcake, cartoon.

The World of Strawberry Shortcake, produced by Kenner, aired once as a syndicated special in March-April of 1980 across different television stations. It told the adventure of a six-year-old girl, Strawberry Shortcake, and friends with similar fruit-based names like Apple Dumplin’ and Raspberry Tart, who live in the very colorful Strawberry Land. “Who sleeps all night in a cake made of strawberries, lives and plays in a cake made of strawberries… It’s Strawberry Shortcake, wouldn’t you know” (“The World of Strawberry Shortcake”). The dialogue was as simple as the plot; the kids laugh and play in the garden until their fun is spoiled by the villainous Purple Pie Man, an adult who wants to steal their fruit to make his pies. In the end, the kids of Strawberry Land win out over his conniving (Lenburg 2009).

The airing of the special was shortly followed by the release of a wide range of Kenner toy products. Within its first year the Strawberry Shortcake line had grossed over $100 million in profits (Engelhart 1986), prompting subsequent yearly specials, airing one night a year from 1981 to 1985 (Woolery 1989). Strawberry Shortcake’s financial success proved that there was profit in producing cartoons featuring a girl lead character. It was this drive for profit that created the opportunity for girl cartoons to exist.

Rainbow Brite toys.

Toy-based cartoons were about to make a new entrance into regular children’s television programming. After the success of the Strawberry Shortcake television specials, NBC became the first network to directly violate the previous regulation against product-based programming with the appearance of a hit NBC Saturday morning cartoon by Hanna-Barbera, The Smurfs 1981-1990. Under the new FCC regulation these toy-based cartoons were acceptable because there was no direct product endorsement (Hendershot 1998). In essence, a half-hour cartoon program based on a pre-existing toy, in this case The Smurfs, was permissible within the regulations provided that there were no Smurfs toy advertisements during its broadcast airtime (Erickson 2005, Kunkel 1988). It was perfectly acceptable for Smurfs toys to be advertised in a different timeslot. What the toy manufacturers hoped for, and soon discovered to be correct, was that there would be no need to spend on advertisement at all; the shows, essentially program-length commercials, were promotional on their own. When The Smurfs and deregulation went unchallenged, toy-based cartoons began proliferating nationwide not just as television specials but as regularly scheduled, daily cartoon programming.

A successful toy product meant exposure for the show, which in turn created desirable advertisement time slots; it was a winning situation for the programmers. Because the amount of advertising time per show no longer had limitations in the deregulated environment of the 1980s, television stations reaped the advertising dollars of extended, multiple commercials. In addition to that financial gain, the television stations acquired the cartoons at little to no cost. Since most of these cartoons were aired in syndication, they were not produced in-house by the networks’ own animation studios. Instead, they were produced by outside independent studios financed by the manufacturer of the toy that the cartoon was based on. The entire program series was sold as a complete set to individual stations for cash and/or advertising time. The station in turn received inexpensive or free programming and, due to the licensing success of the toy, sold its advertising timeslots at higher rates (Erickson 2005).

With the intention of promoting sales, rather than artistic production, entire program series were made quickly and cheaply with weak dialogue, poor animation quality and little or no character development (Lenburg 2009); quantity over quality was the new cartoon production value. Artist-driven cartoons, created by individual artists who concentrated on their animation, such as Bugs Bunny or Pink Panther, were viewed as expensive to produce. In the effort to continuously shave production costs, networks began broadcasting toy-based cartoon series that had been produced all at once. These cartoon productions were eagerly financed by toy manufacturers because they gave them something they wanted, the elusive year-round toy sales (Owen 1988). The manufacturers’ goal of promoting toys through cartoons succeeded with millions of dollars in merchandise sales for all the individual shows (Engelhart 1986).

Product Positioning Fantasy Play: The New Cartoon

While The World of Strawberry Shortcake was aimed at a girl audience, it was a television special, meaning it only aired once a year. Though Little Lulu cartoons were televised in the 1950s, they were created as theatrical cartoons which were then recycled into syndicated television. The very first made-for-television, regularly broadcast girl cartoon program appeared in 1984, the toy-based Rainbow Brite; many would soon follow.

The Rainbow Brite tv series.

Since toy manufacturers marketed toys according to binary gender coding, the toy-based cartoons were then also marketed according to the binary gender code as ‘girl cartoons’ and ‘boy cartoons’; Mattel’s Rainbow Brite 1984, Kenner’s Care Bears 1985 and Hasbro’s My Little Pony 1986 were examples of girl cartoons, while Hasbro’s G.I. Joe 1985, Mattel’s He-Man and the Masters of the Universe 1983 and Hasbro’s Transformers 1984 were examples of boy cartoons (Lenburg 2009).

HeMan and Skeletor toys.

These toy-based cartoons were produced to create product positioning fantasy play. In essence, the cartoon program would create the fantasy world in which a toy lived. Boys’ action cartoons had warriors, soldiers or authority figures equipped with gadgetry and weapons to fight villains with the aid of strong allies, vehicles and occasional beasts. They were premised on good vs. evil, and while evil never wins, the villains often escape to fight another day. Each boy cartoon’s hero had a cartoon villain: Mattel’s He-Man battled Skeletor, Hasbro’s G.I. Joe battled Cobra and Hasbro’s Transformer Autobots battled the Transformer Decepticons. Each villain had their own force of allies, adventure equipment and arsenals. The profit for the boys’ toy industry derived from these extensive armed forces of gadgets and weapons referenced in the cartoon’s world.

Following the successful model of the friendship communities in Strawberry Shortcake and The Smurfs[9], the girl cartoons were centered around adventures laden with lessons of friendship and caring, self-doubt overcome with pep talks and challenges resolved with teamwork. These toy-based girl cartoons were created and written almost exclusively by men whose notions of gender were translated into the programming. They established the television industry parameters of what determined a girl cartoon and with it, the cultural indicators of the new girl media genre. These definitions relied on, as much as they created, gender normative coding, such as excessive use of rainbows, ponies and the color pink as well as didactic storylines laden with self-deprecating dialogue. Characters remarking that they are not strong enough or brave enough would receive encouragement like My Little Pony’s “You can do it if you try” (“Escape from Catrina”) or Rainbow Brite’s “I know you can, I believe in you” (“Invasion of Rainbowland”). All 1980s girl cartoons emphasized these self-conscious critiques countered by their peers’ emotional and motivational support.

The World of Rainbow Unicorns and Motivational Leaders

The industry term for the pink worlds the girl cartoons were centered on was “cooperation villages” (Hendershot 2004): self-conscious characters living together and helping one another learn life lessons. The magical ponies of Hasbro’s My Little Pony lived in the colorful Paradise Estates located in Ponyland, Mattel’s Rainbow Brite and friends lived in Rainbow Land and Kenner’s Care Bears lived in the clouds in the Kingdom of Care-a-Lot; all lands were complete with smiling stars and cheerful rainbows. Cultural scholar Esther Leslie points out in her analysis of animation that “animals are children’s willing helpers in the cartoon world, just as they are in the fairy-tales” (Leslie 2002:24). These magical lands were often inhabited by little creature friends who performed basic labor jobs ranging from gathering color stars to harvesting the gardens; the little ponies played with the bushwoolies, the Color Kids teamed with the sprites. The little friends were as helpless as they were helpful. Quite often the critters fell into peril and needed to be rescued by one of the girl characters, providing the girl characters a set role of protective caretaking and guidance.

Heart Throb, the My Little Pony cartoon character.

Hasbro’s ‘Heart Throb’, the My Little Pony toy.

Lacking the arsenal of weaponry and gadgetry that generated toys for the boy cartoon programs, the cooperation village setting created a context that required the purchase of multiple dolls to interact and replicate the stories in the product-placement fantasy of girl cartoon programming, and it did so quite successfully; 150 million little ponies and over 40 million Care Bears were sold between 1983 and 1987 (Erickson 2005). Each of the pastel-colored Care Bears was named to correspond to a feeling, such as Grumpy Bear, Tenderheart Bear or Wishing Bear. The pastel-colored ponies had rainbow-colored manes and icons on their hind quarters indicating whether they were flying pegasus ponies like Heart Throb, Paradise and Lofty, horned unicorn ponies like Ribbons, Buttons and Fizzy, mermaid sea ponies like Sunshower and Water Lily or earth ponies like Posey, Magic Star and Lickety-Split, all with their own magical power. The dolls relied on communication and teamwork. Upon market introduction in 1983 Hasbro sold $25 million worth of pony toys; with the media release of My Little Pony cartoons, that figure rose to over $100 million in 1985 (Engelhart 1986). In terms of commercialism, exchanging feelings along with accessories and the occasional magical charms made for a very profitable girls’ toy market.

When a villain confronted a character, the boy cartoons’ plot often revolved around combative battle and violent conquest; G.I. Joe soldiers used advanced weaponry to fight Cobra agents, the Autobots would pound and slice metal on metal against the Decepticons while He-Man would often physically pick up his villains and throw them. The girl cartoons’ villains were more often captured than attacked, and the characters used teamwork and encouragement instead of weapons or violence (Woolery 1983, Hendershot 1998). In a My Little Pony episode, a newly allied worker bee says to Megan, “You can’t talk to the queen, she’s too mean to listen.” Megan replies, “I have to. We have to try to find the good in everyone” (“The End of Flutter Valley”). Girl cartoons were generally violence-free rescue adventures with conflict-resolution scenarios involving kind words for a tearful character that had caused trouble. If a member of the cooperation village traveled outside the safe boundaries of their home there were usually unpleasant or dangerous circumstances that required rescuing and then an apology from the misguided member for wandering alone. Little Pony Shady says, “Maybe if I hadn’t been so overly sensitive I could have helped the other ponies get away [from the kidnappers]. Now not only am I useless, I’m a deserter besides.” This self-deprecation is followed by tears and crying that naturally lead to song, “I’m all wrong, all wrong, I’m a klutz and I don’t belong.” Five-year-old Molly, the human friend of the ponies, is there to comfort Shady, in song of course, “No one in the world is perfect, you are not all wrong, you are all right” (“The Glass Princess”). By the end of the episode, Shady’s mea culpa is resolved with Molly’s emotional support and the kidnapping conflicts are resolved with a moralistic lesson of friendship and sharing from lead pony Magic Star.

Whereas boy cartoons offered action battles and explorations, cooperation village girl cartoons centered on personal dynamics within the community and keeping the home safe and happy. Children’s culture critic Cathleen Schine considered them to be an antithesis of adventure: “instead of being about journeys into the world, they are, by definition, conservative: they are about keeping the world at bay, about limits and defending those limits” (Schine 1988:6).

In Sold Separately, her book on children in consumer culture, Ellen Seiter writes about how her local video store stopped carrying Rainbow Brite because even though kids loved it, too many parents were complaining about it. She mused that middle-class parents were perhaps offended by the excessive use of pink and the kitschiness of the cartoon’s design out of their own distaste for the working-class aesthetics and gendered sensibilities that mass-marketed media represents (Seiter 1993). These toy-based girl cartoons were widely critiqued by pundits and parents alike (Owen 1988, Signorielli 1990), and with good reason since the plots were formulaic with equally bad animation and dialogue. No one seemed to like them except the child viewers, who responded enthusiastically with millions of dollars in product purchases (Engelhart 1986, Seiter 1993).

This direct relationship between toy and cartoon not only increased the toy’s sales, it also increased the social coding of cartoons as children’s programming. Perhaps because of the simplified dialogue and storylines or the unlikelihood of adults playing with children’s toys, these cartoons were watched predominantly by children. Unlike cartoons of past eras, like Bugs Bunny or Mickey Mouse, which had been enjoyed by, and even targeted at, adult audiences as well as children, the cheaply animated and poorly written toy-based cartoons were really just for kids, and some were really just for the girls.

A Room of One’s Own, On Television

As these girl cartoons were being criticized by adults for their hyper-feminine appearance, girl viewers were making their own interpretations (Walkerdine 1997). Within these standard gendered parameters the girl protagonists in these cartoons were strong, responsible leaders. These toy-based girl cartoons created an empowered space for little girl viewers that previously had not existed, albeit a heavily commercialized and gendered one (Seiter 1993).

As stated in key audience studies, media consumption cannot be seen as an isolated process of encoding, but should be examined as a phenomenon embedded in daily life (Ang 1996, Morley 2000). Different studies show that the relationship girls have with the cultural products they consume is an active one (Inness 1998, Weeks 2004). Girls are just as capable as other fans of taking from pop culture what relates to them and discarding what appears to be irrelevant or derogatory (Walkerdine 1997). They can select material from the main discourse and find strength in it; they can find its ‘girl power’. Exemplified physically through their play with the cartoon toys, the vast range of potential interpretation and application of the ‘girl power’ message in shows like Rainbow Brite or My Little Pony allowed girls to use the cartoons’ media image as they saw fit in pursuing their own empowerment goals.

Though these cartoons were created to increase toy consumption by little girls, they inadvertently created an empowering space for little girls to see themselves as heroes. This new space on television, girl cartoons, was a representation of the non-violent, communicative, pink world of what girl aesthetics should be, and what this world provided was a “room of one’s own” for little girls on network television. In the spirit of Virginia Woolf’s identification of a space for women to retain a sense of their own identity, “a room of one’s own” was created with the girl cartoons of the 1980s.

These cartoon girl protagonists represented girl characters that displayed a strength that had not traditionally been attributed to girls. The traditional gender presentation, as well as the traditional feminist critique, was that girl characters were secondary and represented as dependent on a boy character (Albiniak 2001, Thompson and Zerbinos 1997, 1995, Signorielli 1993, 1990). In contrast, the representation of feminine strength in the girl characters of the 1980s cartoons countered the traditional gendered traits associated with little girls. The protagonists were empowered girls with determination and leadership skills, something that had been missing in cartoon television since Little Lulu. The excessive use of pink stars and rainbow skies signaled designated girl leaders.

Aged eleven and under, these cartoon girls were represented in ways that subvert traditional norms of who little girls are and what they do. Within the heavily gendered normative message, the feature of lead girl characters created a counter-hegemonic message of gender independence alongside its creation of a successful girls’ market. Shows like Rainbow Brite provided a space for girls to have as their own, with no boy prince to rescue them, no boy hero to be a sidekick for, and where the protagonist, and consequently the hero, was a girl. These girl cartoons did, however, have boy characters: Huckleberry Pie lived in Strawberry Land, Red Butler and Buddy Blue were part of the Color Kids who lived in Rainbow Land, and there were boy Care Bears in Care-A-Lot as well as boy ponies in Ponyland. Perhaps because of the industry party line that cartoons with girl leads could not be successful, boy characters were included in all the girl shows, though the same was not true in reverse. The boy cartoons at times had a woman character, but a girl in the boy cartoons was rarely seen. The exception to this was the cartoon Inspector Gadget (1983-1986), with the detective’s precocious niece, Penny, as its lead character.

Created by DIC Entertainment, Inspector Gadget (1983-1986) was about a bumbling, simple-witted detective who fights crime using his cyborg-like gadgets. There were no genre demarcations of a girls’ cartoon, no rainbows, no cute animals, no magic; stylistically, Inspector Gadget was a boys’ cartoon. The plot line usually follows the same format: Gadget is given a top-secret assignment and proceeds to either mistake villains for allies or simply go on an unrelated trail. Since clever Penny is always skeptical of these so-called allies, suspecting them to be villainous agents, she sends Brain, her dog and crime-fighting partner, to follow and protect her Uncle Gadget while she formulates a way to prevent disaster and solve the crime. Years before the proliferation of laptops or cell phones, Penny uses her computer book to break codes, conduct surveillance and keep tabs on Gadget. She also uses her wristwatch as a communicator, laser beam and occasional remote control over menacing vehicles or destructive machines. These tech-savvy characteristics, paired with her resourceful detective skills, are a playful transgression of normative gender coding since they are more commonly attributed to boy characters, or to nerdy teenage girls, like Velma on Scooby-Doo, who often need to be rescued. On the Inspector Gadget cartoon, it was Penny who did the rescuing.

Penny with Inspector Gadget.

Although the show is named after Gadget, he is the program’s comic relief, while Penny is the serious character, always aware of peril and taking risks to solve the crimes and capture the culprits. In his absentminded adventures, Gadget fails to recognize the far superior intellectual abilities of his niece. In each episode Penny is the one who solves the crimes while Gadget is distracted and detained by the M.A.D. agents of the villainous Dr. Claw and his pet cat[10]. At the end of each episode, police chief Quimby gives Gadget the recognition for solving the case. One could muse that Penny is the classic representation of the cliché “behind every great man is a great woman”, whereby the woman toils and does the work while the man gets the credit. In Penny’s case, even Gadget himself is unaware that she is actually the great detective. She works tirelessly and puts herself at risk, all unknown to Gadget, while in the end Gadget clumsily stumbles upon a solved crime and is given credit for its resolution as Penny looks on in amusement. As a strong girl character, both in identity and plot importance, Penny effectively demonstrated that boys would readily watch an empowered girl character.

Inspector Gadget was DIC Entertainment’s first television cartoon and an artist-driven program, preceding DIC’s eventual turn to cheap, mechanical cartoons. DIC soon followed Inspector Gadget with thirty-two different cartoon programs in the 1980s that had their entire series produced at once, some with over one hundred episodes made in a single year. One of these mass-produced programs was the girl cartoon Rainbow Brite (1984-1986).

The introduction of girl cartoons into children’s television media culture spurred an unprecedented commercial movement of merchandise. Rainbow Brite was originally a greeting card icon created by Hallmark. With the advantage of deregulated children’s television, toy manufacturer Mattel contracted DIC Entertainment to animate the Hallmark character and create a cartoon series they could sell in syndication; what followed was an explosion of rainbow success. The Rainbow Brite franchise generated $1 billion in retail sales of dolls, toys, cereal and other licensed products throughout the 1980s.[11] Much like her girl cartoon predecessor Little Lulu, Rainbow Brite spurred a merchandising empire that is still viable today.

Rainbow Brite’s bias for heroic and direct action was a characteristic also attributed to Little Lulu; both would act to ensure the safety of smaller children or animals in need of rescue. However, unlike Little Lulu, Rainbow Brite was neither cunning nor mischievous; the serendipitous Rainbow Brite was the new girl cartoon role model. Rainbow Brite looks like a cartoon version of a child beauty pageant contestant. Her rosy cheeks are accentuated by long blond hair in a high bouffant. She wears rainbow-colored moon boots and a miniskirt with a fluffy white trim. Yet contrary to the expectations associated with this sweet, hyper-feminine appearance, she is a fearless little girl who is also a well-respected, resourceful leader, battling evil, unafraid and triumphant; she is the 1980s power girl.

The Rainbow Brite series begins with her arrival in a dark land; an unseen benevolent woman spirit brings her there by magic. We know magic is at work here because of the visual and audio cues of star sparkles and a harp glissando. Both of these cues had been used extensively by Looney Tunes, yet there they were markers of violence, such as being hit on the head with an anvil. Rainbow Brite effectively appropriated these audio and visual cues as the new girl cartoon signifiers of magic and happiness, a trend that continues today. In the pilot episode, a shooting blue star magically transforms into Rainbow Brite as she arrives in the dark, thunderous land. An omnipotent woman’s voice asks, “Still want to save this world?” “Yes!” Rainbow Brite emphatically replies, “It’s even worse close up.” The woman then says, “Find the spear of light and the color of this land and set it free, and the darkness will disappear.” (“Beginning of Rainbowland”). In this introductory episode, not only is this feminine girl a heroic leader, but the all-knowing guardian entity responsible for bringing her there is a woman. Strawberry Shortcake, Rainbow Brite and Meagan in the My Little Pony cartoons make no mention of their parents. They simply arrive in these magical lands to help the residents battle villains and reclaim their homes. “What [Rainbow’s] mom and dad thought of her mysterious disappearance…weren’t mentioned. But the story was aimed at very young children, who tend not to ascribe much weight to such consideration” (Markstein 2003:1). Walkerdine points out in her analysis of young girls in 1980s popular culture texts that “it was amazing just how many of the stories presented the heroines as either not having parents, or not living with them” (Walkerdine 1997:47). This lack of adults was more pronounced in the girl cartoons than in the boy cartoons, and as a result girls were the de facto leaders.

In the Rainbow Brite cartoon, the cooperative village of Rainbowland is full of multi-colored homes and sparkling paths. Equally vibrant are the inhabitants, the little fuzzy multihued Sprites and the Color Kids, each represented in a corresponding color, with the boys, Red Butler and Buddy Blue, taking the traditional primary boy colors. Together they harvest and produce color stars which power up the magic color belt Rainbow Brite uses to awaken the dismal, colorless areas overtaken by their grey nemesis named Murkel. Riding upon Starlight, her large white stallion with a rainbow mane, Rainbow Brite travels to bring color and rainbows to all lands of the universe. You will not find any guns or swords in these brigades. Under Rainbow Brite’s motivational guidance the Color Kids and Sprites use teamwork to fight battles.

The Color Kids and the Sprites look to Rainbow for help in resolving their conflicts. Rainbow Brite offers her friends emotional support while also engaging in the defense of Color Land. “I have to save them, you don’t have to come if you don’t want to.” (“Rainbow Land”). She offers advice and is sought out for it; she performs as leader and is recognized as leader, both by others and by herself. She uses her magical powers and challenges her enemies with the same serenity she displays when rescuing her friends from danger (“Peril in the Pits”), or helping a stranger find his way home (“Invasion of Rainbowland”). She offers her friends emotional support while also engaging in combative battle. “We have to go [to the dark castle] and look for the magic color belt. We have to try, this world is awful, don’t you want it to be beautiful?” When attempting a rescue, Rainbow says to her fearful sprite companion, Twink, “You can make it if you believe you can. Try to believe” (“Beginning of Rainbowland”).

Mean Girls

Much like the teasing Lulu faced from the misogynistic boy and feminine-foil Tubby, Rainbow also has to face challenges from boy characters who doubt her leadership capabilities based on her gender. In the Rainbow Brite episode “Star Stealers”, Rainbow is beckoned by Onyx, the robot horse, to travel to the Crystal Diamond planet and help his owner Cris save it from the evil princess. After narrowly escaping the giant robots, Onyx informs Cris that he has returned with help, Rainbow Brite. In one sentence, Cris emasculates himself and puts down girls: “The [evil princess and her] glitterbots have everyone hypnotized but me, and that’s only because I run faster than anybody; …this is what you call help? A girl!” (“Star Stealers”). Cris later makes fun of Starlight, Rainbow’s horse, because he can’t fly like the mechanical Onyx: “That dumb horse of yours can’t help rescue us. He can’t even fly without your color belt.” Rainbow replies, “He can think, which is more than your horse can do” (“Star Stealers”). The exchange is indicative of the struggle that girls and women face when they are devalued on the basis of physical strength yet prove themselves through intellectual accomplishments. Cris’ remarks are intended as the expected routine, to play the sexist game, the “thrashing” Adorno referred to that is expected of boys against girls; yet in the girl cartoon it is the girl who always wins.

Along with the anti-feminine foil boy character that emasculates himself by devaluing feminine gender, this research also found another gender-normative rivalry persistently present in girl cartoons: the mean girl, which I refer to as the feminine foil. The feminine foil girls are bossy, snobby and bratty, and have rivalries with the lead girl character. The feminine foil girl character actively embodies the antithesis of the empowered protagonist. Both foils are used as a representation of gender normativity against which the lead girl character can be comparatively identified as other. In Rainbow Brite’s “Star Stealers” the evil princess is the feminine foil. The feminine foil was also repeatedly found in episodes of My Little Pony, such as the queen bee in “The End of Flutter Valley” and the queen cat in “Escape from Catrina”. As a challenge to normative gender coding, Rainbow represents a girl warrior, unafraid and ready to take heroic action. The gendered behavior of the feminine foil princess in “Star Stealers”, like that of the feminine foils in My Little Pony, is representative of the belittling critiques delivered earlier by Cris against Rainbow Brite. These characters are rude, selfish and freely insult those around them. The feminine foil represents a constructed, normative aspect of femininity that can be used to challenge the feminine power of girl characters like Rainbow Brite, who, though incredibly feminine and in a feminized world, is a strong and heroic leader. The feminine foil is the proverbial thorn in the girl’s side; though Rainbow Brite is strong and defies stereotypes, the feminine foil reinforces the idea that those stereotypes are correct. However, as in her challenge to Cris’ sexist remarks, Rainbow Brite proves triumphant over the bratty princess, displaying where the feminine strength truly lies: in smiling animals, rainbow sparkles and friendship.

Conclusion

Television cartoons are a uniquely interpretive form. They are a complex combination of social reproduction and conflict and, because as popular culture they are used as material resources in everyday life, may simultaneously serve dominant and marginal interests. They have been a widely misunderstood art form precisely because of their categorization as children’s entertainment, as cultural forms associated with children are commonly marginalized. Girl cartoons present an example of three-dimensional social marginalization: as children’s television, as girls’ programming, and as animated cartoons, all under-valued categories of social placement and study. This positioning as a subordinate cultural form may grant girl cartoons the ability to express different viewpoints and ideas from those of the dominant framework. Gender normativity is part of this synthesis of social structure and personal agency.

The 1980s girl cartoon characters displayed leadership, confidence, determination and savvy, creating a new genre of girl empowerment. The adventures of Rainbow Brite or the Little Ponies were inspiring young girl viewers to be empowered, sans sexualization. This representation of strength in a girl character serves to counter the themes historically used to construct little girls’ identity, such as romance, peer rivalry, and gendered self-deprecation. Walter Benjamin attributes to animation “the creation of alternative oppositional cultures” (Durham and Kellner 2006:35); by presenting little girls as leaders, the unique medium of girl cartoons challenges gender normativity, not as an emphatic expression of non-conformity, but by playfully transgressing popular culture’s compulsory gender coding.

 

Notes:


[1] Inspector Gadget is culturally coded as a boy cartoon but is included in this set because of the main character, Penny. Strawberry Shortcake, while being a girl cartoon, was a television special, not a regularly scheduled program, and therefore is not included.

[2] Originating in 1991 in the punk-rock music scene of the Pacific Northwest, the young women fan base of the Riot Grrrl movement quickly spread throughout the US and parts of Europe (France 1993) proliferating the underground feminist publications of zines addressing issues of sexuality, rape, body image and gender inequality within a larger anti-establishment identity (Malkin 1993, Garrison 2000, Fritzsche 2004). The reappropriation of the word girl as grrrl was part of their dismissal of how the mainstream media depicted what a girl should be like. Part of this reappropriation was the reclaiming of a sexual self without abusive objectification. They were reclaiming what it meant to be a girl, and they kicked ass.

[3] The Third Wave Feminist movement intended to deconstruct and question Second Wave Feminism’s dearth of representation outside of white middle-class heterosexuality, focusing on gender oppression’s intersections with the power regimes of race, sexuality and class.

[4] Wonder Woman was on the cartoon “Superfriends”, She-Ra was He-Man’s sister and had her own cartoon “She-Ra, Princess of Power” and Daphne was the girlfriend of Fred on the cartoon “Scooby-Doo”.

[5] Author’s term for the sexual objectification of girls’ bodies.

[6] …and friendship is magic.

[7] The 26 owners from 1987 went down to 10 in 1996 and down to 6 major owners in 2004. They were: Time Warner, Disney, Murdoch’s News Corporation, Bertelsmann AG, Viacom and U.S. General Electric. (Bagdikian 2004)

[8] A television special airs once, making it different from a regularly scheduled television program. A syndicated program is purchased and aired individually by stations rather than televised nationally by a network.

[9] The Smurfs was a rare cartoon intended both for the boy and girl audience. While it had the formulaic girl cartoon plot, it had a token girl character in a gang of boys.

[10] Dr. Claw and his pet cat are a parody of the James Bond 007 films’ evil genius character, Ernst Stavro Blofeld, also known as Number 1, whose SPECTRE agents are the basis for the M.A.D. agents. Inspector Gadget himself is a parody of the lead character in the live-action TV program Get Smart and is voiced by the same actor.

[11] http://unitedmedialicensing.typepad.com, accessed May 19, 2008.

 

Bibliography

Adorno, T.W., and Max Horkheimer. 1997. Dialectic of Enlightenment. London: Verso.

Adorno, T.W., and Max Horkheimer. 1972. “The Culture Industry: Enlightenment as Mass Deception.” In Dialectic of Enlightenment, 120-167. New York: Herder and Herder.

Albiniak, Paige. 2001. “Oh, It’s Really Not So Bad.” Broadcasting & Cable 131:4.

Ang, Ien. 1991. Desperately Seeking the Audience. London and New York: Routledge.

_____. 1996. Living Room Wars: Rethinking media audiences for a postmodern world. London and New York: Routledge.

Bagdikian, Ben H. 2004. The New Media Monopoly. Boston: Beacon.

Bourdieu, Pierre. 1998. On Television. New York: New Press.

Budgeon, S. 2001. “Emergent Feminist(?) Identities: Young Women and the Practise of Micropolitics.” The European Journal of Women’s Studies 8(1):7-28.

Clarke, John. 1990. “Pessimism versus Populism: The Problematic Politics of Popular Culture.” In For Fun and Profit: The Transformation of Leisure into Consumption, edited by Richard Butsch, 28-44. Philadelphia: Temple University Press.

Davies, Marie Messenger. 1997. Fake, Fact, and Fantasy: Children’s Interpretations of Television Reality. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Durham, Meenakshi Gigi and Douglas M. Kellner, eds. 2006. Media and Cultural Studies: Key Works. Oxford: Blackwell Publishing.

Engelhardt, Tom. 1986. “The Shortcake Strategy.” In Watching Television, 68-110. Edited by Todd Gitlin. New York: Pantheon.

Erickson, Hal. 2005. Television Cartoon Shows: An Illustrated Encyclopedia, 1949-2003. London: McFarland & Company, Inc.

Evanier, Mark. 2007. “Little Lulu.” News From Me. Accessed March 2007. http://www.newsfromme.com/archives/2007_04_09.html.

Fiske, John. 1982. Introduction to Communication Studies. London and New York: Methuen.

______. 1987. Television Culture. London and New York: Methuen.

______. 1996. “The Codes of Television.” In Media Studies, 220-230. Edited by Paul Marris and Sue Thornham. Edinburgh: Edinburgh University Press.

Fritzsche, Bettina. 2004. “Spicy Strategies: Pop Feminism and Other Empowerments in Girl Culture.” In All About the Girl: Culture, Power, and Identity, 155-162. Edited by Anita Harris. New York and London: Routledge.

Garrison, Ednie Kaeh. 2000. “U.S. Feminism-Grrrl Style! Youth (Sub)Cultures and the Technologics of the Third Wave.” Feminist Studies 26(1):141-172.

Gewertz, Ken. 2006. “Little Lulu comes to Harvard.” Harvard University Gazette 11(2).

Giroux, Henry. 2000. Stealing Innocence: Youth, Corporate Power, and the Politics of Culture. New York: St. Martin’s Press.

Gonick, Marnina. 2006. “Between ‘Girl Power’ and ‘Reviving Ophelia’: Constituting the Neoliberal Girl Subject.” NWSA Journal 18(2):1-23.

Hall, Stuart. 1981. “Notes on Deconstructing the Popular.” In People’s History and Socialist Theory, 216-226. London: Routledge.

________. 2002. “Encoding/Decoding.” In Culture, Media, Language, 128-138. Edited by Stuart Hall, Dorothy Hobson, Andrew Lowe and Paul Willis. London and New York: Routledge.

Harris, Anita, ed. 2004. All About the Girl: Culture, Power, and Identity. New York and London: Routledge.

Hendershot, Heather. 1998. Saturday morning censors: television regulation before the V-chip. Durham: Duke University Press.

________, ed. 2004. Nickelodeon Nation: The History, Politics, and Economics of America’s Only TV Channel for Kids. New York: New York University Press.

Jacob, Kathryn Allamong. 2006. “Little Lulu Lives Here: From the Marge (Marjorie Henderson Buell) Papers at the Schlesinger Library (Harvard), originally published in the Saturday Evening Post.” Radcliffe Quarterly S(06).

Kunkel, Dale. 1988. “From a Raised Eyebrow to a Turned Back: The FCC and Children’s Product-Related Programming.” Journal of Communication 38(4):90-110.

_______. 1998. “Policy Battles over Defining Children’s Educational Television.” Annals of the American Academy of Political and Social Science. 557:39-53.

Lenburg, Jeff. 2009. The Encyclopedia of Animated Cartoons. New York: Checkmark Books.

Leslie, Esther. 2002. Hollywood Flatlands: Animation, Critical Theory and the Avant-Garde. London, New York: Verso.

Lisosky, Joanne M. 2001. “For all kids sakes: comparing children’s television policy-making in Australia, Canada and the United States.” Media, Culture and Society 23:821-842.

Malkin, Nina. 1993. “It’s a Grrrl Thing.” Seventeen 52 (May): 80-82.

Markstein, Donald. 2003. “Rainbow Brite.” Toonopedia. Accessed March 3, 2007. http://www.toonopedia.com/rainbow.htm.

Mittell, Jason. 2003. “The Great Saturday Morning Exile: Scheduling Cartoons on Television’s Periphery in the Late 1960s.” In Prime Time Animation: Television Animation and American Culture, 33-54. Edited by Carol A. Stabile and Mark Harrison. New York: Routledge.

Morley, David. 1980. The Nationwide Audience: Structure and decoding. London: British Film Institute.

_____. 1992. Television, Audiences and Cultural Studies. London and New York: Routledge.

_____. 2000. Home Territories: Media, Mobility and Identity. London and New York: Routledge.

Owen, David. 1988. The Man Who Invented Saturday Morning: And Other Adventures in American Enterprise. New York: Villard Books.

Schine, Cathleen. 1988. “From Lassie to Pee-Wee.” New York Times, October 30. Accessed May 2, 2008.

Schneider, Cy. 1989. Children’s Television. Lincolnwood: NTC Business Books.

Seiter, Ellen. 1993. Sold Separately: Children and Parents in Consumer Culture. New Brunswick, New Jersey: Rutgers University Press.

Seiter, Ellen and Vicki Mayer. 2004. “Diversifying Representation in Children’s TV: Nickelodeon’s Model.” In Nickelodeon Nation: The History, Politics, and Economics of America’s Only TV Channel for Kids, 120-133. Edited by Heather Hendershot. New York: New York University Press.

Signorielli, N. 1990. “Children, television, and gender roles: Messages and impact.” Journal of Adolescent Health Care 11:50-58.

_____. 1993. “Television, the portrayal of women, and children’s attitudes.” In Children & Television: Images in a Changing Sociocultural World, 229-242. Edited by Gordon Berry and Joy Keiko Asamen. London: Sage Publications.

Simensky, Linda. 1996. “Women in the Animation Industry.” Animation World Network. Accessed March 3, 2007. http://www.awn.com/mag/issue1.2/articles1.2/simensky.html.

_____. 2004. “The Early Days of Nicktoons.” In Nickelodeon Nation: The History, Politics, and Economics of America’s Only TV Channel for Kids, 87-107. Edited by Heather Hendershot. New York: New York University Press.

_____. 2007. “Programming Children’s Television: The PBS Model.” In The Children’s Television Community, 131-146. Edited by J. Alison Bryant. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Taft, Jessica K. 2004. “Girl Power Politics: Pop-Culture Barriers and Organizational Resistance.” In All About the Girl: Culture, Power, and Identity, 69-78. Edited by Anita Harris. New York and London: Routledge.

Thompson, Teresa L., and Eugenia Zerbinos. 1995. “Gender roles in animated cartoons: has the picture changed in 20 years?” Sex Roles 32(9/10):651-674.

Thompson, Teresa L., and Eugenia Zerbinos. 1997. “Television cartoons: Do children notice it’s a boy’s world?” Sex Roles 37(5/6):415-432.

Walkerdine, Valerie. 1997. Daddy’s Girl: Young Girls and Popular Culture. Cambridge, Massachusetts: Harvard University Press.

Weeks, Debbie. 2004. “Where My Girls At? Black Girls and the Construction of the Sexual.” In All About the Girl: Culture, Power, and Identity, 141-153. Edited by Anita Harris. New York and London: Routledge.

Winn, Marie. 1987. The Plug-In Drug: Television, Children, & the Family. New York: Penguin.

Woolery, George W. 1983. Children’s Television: The First Thirty-Five Years, 1946-1981 Part I: Animated Cartoon Series. Metuchen, NJ & London: Scarecrow Press.

________. 1989. Animated TV Specials: the complete directory to the first twenty-five years, 1962-1987. Metuchen, NJ & London: Scarecrow Press.

Zaslow, Emilie. 2009. Feminism Inc.: Coming of Age in Girl Power Media Culture. New York: Palgrave Macmillan.

 

BIO: Prof. Katia Perea has a PhD in Sociology from the New School for Social Research, specializing in television girl cartoons and popular culture theory. She is a Sociology professor at CUNY (City University of New York) and spends her spare time placing small toys in odd public locations throughout Brooklyn, a form of 3D graffiti. She is currently working on her book “Girl Cartoons” and an ethnography project on The Bronies.

 

 

In the Eye of the Beholder: Bishounen as Fantasy and Reality – Christy Gibbs

Abstract: Since the international popularisation of anime and manga, the bishounen has been one of Japan’s best recognised archetypal figures. But where did this stereotypical look come from, and is it a purely fictional representation? This paper examines the bishounen not just as he appears on the page or screen, but also how he appears in the international fashion and music scenes, as well as the way in which he influences, and is influenced by, Western versions of himself.

Figure 1: The manga Princess Princess (Mikiyo Tsuda, 2006-7)

In the aesthetically distinct universe of Japanese animation, cultural constructs of gender and sexuality can be complex and challenging to navigate. However, perhaps no archetypal anime figure is as curious to the non-Japanese viewer as the bishounen. Best known for his physical attributes – a slender and willowy body shape, artfully arranged hair, narrow and angular features, and often pale, delicate-looking skin – the bishounen is, as the literal translation informs us, a beautiful young man, or ‘pretty boy’. His gender is often ambiguous, his sexual orientation even more so; even for those who watch anime or read manga on a regular basis, it can be difficult to discern whether his relationships with other characters lean towards romance or are merely affectionately platonic.

To those unfamiliar with a wider scope of Japanese popular culture, it is only within this fictive context that the bishounen would appear to exist. From an Anglo-American perspective particularly, it is difficult to take a cartoon character seriously, much less one who wears flamboyant clothing, makes over-the-top arm gestures, and strikes seemingly casual poses with one hand placed on hip. Cross-dressing is also a common theme in bishounen-centric manga and anime, for any number of reasons, most of them on the ridiculous or humorous side. The likes of Princess Princess (in which an elite boarding school elects students to take on the role of Princess in order to break up the monotony of living in an all-male environment) and Gravitation (involving a male pop star who dons a sailor uniform in a ludicrous attempt to appeal to his aloof boyfriend) are nothing if not intentionally absurd. Anime such as Hourou Musuko (Wandering Son), which depict male cross-dressing in relation to any realistic statement of self-identity, are relatively rare.

Moreover, if those same characters were to appear in just about any mainstream American film or cartoon, most would associate their various mannerisms with gay, or at least extremely camp, stereotypes. Western mainstream media is used to viewing stereotypically effeminate male characters through a homosexual lens, and this is certainly sometimes the case in anime, even when a title is not a yaoi or boys love one.[i] Given that Japan is a country where one’s sexual practices are generally understood to be a personal and therefore private matter, it would be perfectly logical to assume that the bishounen is a figure with little, if any, basis in reality.

In fact, such an assumption, however rational, would be a false one. The genre of boys love aside, the chances of an anime bishounen actually being gay are fairly slim. Anglo-American sexual and cultural limitations would seem to be ‘more threatened by depiction of intense same-sex friendships than does Japanese culture’, commentator Patrick Drazen notes. ‘The reason is that American pop culture often limits its options to “sex” and “not sex.” Japanese culture makes room for a much wider range of relationships’ (Drazen, 2003, p. 103).

Perhaps a more accurate way of approaching the bishounen is to look not at what messages he is (or is not) attempting to convey in terms of sexual orientation within the narrative, but rather to discuss what type of role he fulfils as far as the audience is concerned. In doing so, it appears evident that the bishounen’s job is not to make any sort of explicit statement about his sexuality, but rather to exist as a specific form of eye candy for his largely female demographic; a physical representation of one of the Japanese woman’s ideals of the perfect guy. The bishounen is by his very nature androgynous, and therefore an iconic symbol that has the potential to encompass the strength of traditional masculinity, as well as the grace and beauty of the stereotypically feminine. Regardless of whether anime bishounen are based on real historic figures (Hakuouki Shinsengumi Kitan), re-imaginings of Western stories (Romeo x Juliet), or entirely original characters (The Vision of Escaflowne), they are all therefore given the same intense beautifying treatment.

In contrast, the conventional image of what constitutes an attractive male in much of the West has often been muscular and assertively powerful, evoking perceptions of physical dominance, authority and control, while the attribute of ‘prettiness’ is considered a feminine trait – the opposite of being masculine or ‘manly’. In Japan, however, being pretty does not necessarily mean sacrificing masculinity, and more recently, the West has also seen a growth in this new image of what constitutes male attractiveness. It could be argued that this new masculinity has been influenced since the mid-1990s by the increasing availability of popular and mainstream anime titles such as Sailor Moon, Dragon Ball Z, Pokemon, One Piece, Naruto, and Bleach.

In order to explore how modern conceptions of the bishounen character arose, it is first necessary to pinpoint when and how he came into being within the mainstream culture of his birthplace. The term itself is first found in the Meiji period (1868-1912), where it was used to describe especially beautiful pre-adolescent boys, who were often involved in homosexual relationships (Pflugfelder, 1999, pp. 221–234). Although the word was not in usage prior to this, we might also be reminded of the popularity of Japanese theatre over the previous two centuries, where gender reversal was commonplace.

After women were banned from the kabuki theatre stage in the mid-1600s due primarily to problems arising from prostitution, physically effeminate male performers took on the role of women in their place (Bullough, 1993, p. 242).[ii] Such actors often maintained their dress outside the theatre, compelled to experience first-hand the everyday life, customs, and etiquette of the women they played, many from early childhood. At its height of popularity, some kabuki actors became so sought-after that they became leaders of women’s fashion. While the actors who played women’s roles emulated the manners and dress of high-born ladies, the audience, largely made up of peasants and townspeople, created a space in which performers could potentially become trendsetters in bearing and clothing (Scott, 1999, p. 39).

The temptation to view men who cross-dressed as a part of their art outside the theatre as homosexual is a natural one for many in the West today, and that many of these performers were indeed engaging in male/male relationships is probable. However, Anglo-American culture lacks a common understanding of when and why labels of sexuality are applied in Japanese culture. Whilst alternative sexual practices in Japan today, including those with long historical traditions such as homosexuality, cross-dressing, and transvestism are not widely publicly accepted, there has consistently been a large gap between how one is expected to behave in the outside world, and how that same person may act while taking on the role of entertainer. In a country where standing out from the crowd in any way is usually thought of as socially undesirable, anything that occurs within a framework of fiction – from kabuki and opera to anime and mainstream television – is not considered to be an accurate reflection of either an individual or society as a whole. Consequently, a transvestite who appears in a film is seen as a performer rather than a demonstration of an individual’s real expression of sexuality, and a drag queen depicted on a show lives only in the land of television – a world from which most Japanese feel detached (Buckley, 2002, p. 94).

For example, whilst the televised portrayal of a character named Hard Gay, as played by comedian Masaki Sumitani, depicted a man dressed in a black PVC fetish outfit who ran around the streets of Japan performing acts of charity for unsuspecting bystanders, the show gained national attention and popularity, and was deemed suitable to air on a Saturday evening variety show. Of course, Hard Gay is an overt homosexual parody (in reality neither gay nor a fetishist), and not a bishounen by any stretch of the imagination; his television persona serves to illustrate the ambiguity between screen and reality. In contrast, Japanese stage and film actor Saotome Taichi is a modern example of a figure that embodies the bishounen aesthetic, yet is not spurned or ridiculed for how he dresses, speaks, or behaves during his performances. Well known for playing both beautiful young men and women, Saotome was trained from a young age in the field of female impersonation. An official fan club was established in 2006, and tickets to his kimono dance performance at the Taishokan theatre the following year sold out within a day.

Figure 2: Japanese stage and film actor Saotome Taichi

Nonetheless, the real-life depiction of the bishounen dates back much further than popular Japanese theatre, and can be traced to the tenth century where the Imperial Court of Heian-kyo (now the city of Kyoto) held sway. The Heian Court was the centre of aesthetic sensibilities of all varieties: Japanese music, poetry, calligraphy, and clothing fashions all found their deepest roots here, where aristocrats were obsessed with the pursuit of beauty. It was not simply that cultivating beauty meant a person was sophisticated or fashionable – it also implied a sense of morality. George Sansom, a pre-modern Japan historian, writes: ‘The most striking feature of the aristocratic society of the Heian capital was its aesthetic quality … even in its emptiest follies, it was moved by considerations of refinement and governed by a rule of taste’ (Sansom, 1958, p. 178).

Standards of aristocratic male beauty here were in many ways similar to those for female beauty. Both sexes whitened their skin with rice powder, blackened their teeth using a liquid made up of acetic acid and dissolved iron, and prized a rounded, plump figure in order to physically display the leisure and riches that the peasantry – those with leaner figures from less food intake and darker skin from labouring outdoors – could not afford to obtain. It was fashionable for men to have a thin moustache or tuft of beard at the chin, but large quantities of facial hair were considered especially unattractive (Topics in Japanese Cultural History).

Naturally, Heian beauty is interpreted in a more contemporary, bishounen-esque framework as far as anime and manga are concerned: The Tale of Genji, originally written by Murasaki Shikibu during the Heian period, has had several adaptations, the most recent of which was an 11-episode anime series in 2009. While most of the characters have the creamy white skin of the Heian-period principles of beauty, there is physically little else to tie the anime and Heian ideals of attractiveness together. Genji, with his slender silhouette and narrow features, has nothing that sets him visually apart from any other bishounen that might be seen in any other mainstream anime production, historically-themed or otherwise.

While the bishounen ideal may have been cemented in the Heian era, a quick survey of Japanese popular culture today, even disregarding the anime and manga industries, reveals that far from being a storybook figure, the bishounen also exists in the real world. Host club workers, although a far cry from what has been depicted in the extremely popular Ouran High School Host Club anime series, are perhaps the most obvious example to draw from, with many of these young men looking almost like a parody of anime bishounen caricatures. Similar to their hostess club counterparts, where male customers pay for the attentive company of beautiful young ladies, host clubs employ men who are paid to converse, pour drinks, light cigarettes, entertain by means of fun stage performances, and generally flirt with their female clientele. Upon a first visit to a host club, the customer is presented with a menu of the hosts on offer so that she can decide which host she would like to meet first. Once she has chosen the host she most prefers, she designates him her named host, with the employee then receiving a percentage of all future sales generated by that particular customer. Most clubs operate on a permanent nomination system, and a host cannot be changed once he has been nominated except under special circumstances. Regular payment is determined by a host’s commission on drink sales, and for this reason, the environment can be highly competitive, with tens of thousands of dollars sometimes offered to the host who can achieve the highest sales (pripix).

The typical host look, made up of an appropriately dishevelled dark suit and collared shirt, bleached hair, and expensive silver jewellery, is paired with a stage name often taken from a favourite film, manga, or historical figure that may describe their persona. The overall effect is usually one of an anime bishounen made flesh and blood. The Great Happiness Space: Tale of an Osaka Love Thief (Jake Clennell, 2006), a documentary-film interviewing several hosts and their customers in a popular club in Osaka, paints a very pragmatic picture of the host club industry – one which survives by seducing customers without having to depend on the more overt sexual appeal of strip clubs or brothels. ‘For girls, we are products’, states one such worker. ‘If she wants a humble, cool guy, I can be like that. If she wants a funny guy, I can be like that too.’ ‘Let’s say I do fuck her. That girl will probably never come back’, another points out. ‘At that point, there won’t be anything else I can give her … Knowing how to give them satisfaction without sex – that’s the point.’ While hosts can and sometimes do have sex with their clients, this is clearly not the purpose of this institution’s existence.

Tajima Yoko, a professor of women’s studies at Hosei University in Tokyo, attributes the host club phenomenon to the conventional Japanese male’s failure to listen properly to the everyday concerns of his partner. ‘Men, married or not, in our culture do not listen to their female partners’ problems carefully … They only tell women what they want them to hear. Men don’t consider women equal partners’ (New York Times). Although there is no official count of the number of individual clubs, the host club industry employed an estimated twenty thousand men in total as of 2005 (Japan – Facts and Details). Some of the larger and more commercial clubs, such as New Ai (New Love) in Shinjuku, Tokyo, employ approximately eighty workers whose sole job it is to fulfil the emotional needs of the women who frequent the club, in part by existing as beautiful objects of fantasy (New York Times).

Such a concept is alien to most of the rest of the world, suggesting that outside of Japan, the majority of countries do not have the numbers of the right type of customer – that is, one willing to spend several hundred or thousand dollars per visit purely to be kept company by a score of pretty men – to support such an industry. While the idea of paying for sex is universally understood, the thought of paying an equal amount or more for the pleasure of someone’s company is simply baffling to many people. In turn, although some host workers are foreigners, host clubs are generally not known about, or else poorly understood, by overseas visitors, and very few customers are non-Japanese. As with the Japanese sex industry, there is a very distinct preference for both the customer and the host to be Japanese, and it is not uncommon for many bars and clubs to have signs outside saying “No foreigners admitted.”

Tellingly, a great deal of the reactions by foreigners to the business of host clubs has tended to be negative. ‘Even if they had equivalent in the UK I don’t think I’d go’, reads one response to an online article. ‘British guys (sorry to say this) don’t really seem to maintain their looks or interest in a womans [sic] needs for long enough.’ ‘I don’t find them attractive in any way, and I don’t want to pay for the “companionship”’, states another (UK Fashion, Lifestyle & Beauty Blog). Keywords commonly associated with host fashion outside of Japan include “tacky”, “fake”, “creepy”, and “sleazy”.

However, in other entertainment industries, the cultural crossover in terms of what women find attractive in a male is more evident. The music business is one such industry, and in Japan, the most extreme form of the bishounen can be found here. Visual kei – literally visual style – is a uniquely Japanese aesthetic music movement inspired by Western glam metal bands such as Kiss and Twisted Sister (High Music XRD). What these bands inspired in visual kei was, as the name of the movement implies, the importance of appearance as an essential part of the musical style, sometimes valued even above the music itself.

Bands including The Gazette, Versailles, and Alice Nine are today known less for their music and more for their eye-catching make-up and wardrobe in some circles. The visual kei look is ethereally dark, glamorously androgynous, and elaborately punk, often all at once. Many artists are particularly effeminate in appearance, and it is not uncommon for some to pose explicitly as females, wearing dresses reminiscent of Regency, Rococo, and Victorian fashion. Their ‘maleness’ as we might understand it comes across strongly in their vocals, which are usually anything but gender ambiguous.

Figure 3: Visual kei band Versailles

First emerging in the late 1980s, the visual kei movement was pioneered by acts such as X Japan, Buck-Tick, and D’erlanger. By the mid-1990s, a boost in popularity throughout Japan meant that the most notable of these bands were achieving high commercial success, with the likes of X Japan, Luna Sea, Glay, and Malice Mizer receiving large amounts of media attention. This last group became especially famous for their live performances, which featured lavish historical costumes and stage sets. Mana, co-founder of Malice Mizer, would go on to create his own clothing label, Moi-même-Moitié, in 1999, coining the terms “Elegant Gothic Lolita” and “Elegant Gothic Aristocrat” (Steele and Park, 2008, p. 54). He is regularly featured modelling his own designs in the quarterly Gothic & Lolita Bible, the top publication of the Lolita fashion scene, yet retains his mysterious persona by rarely speaking in public. In most interviews past and present, Mana is known for whispering his answers into the ear of a band member or confidante, using Yes/No cards, or expressing himself in mime.

Gackt, who abruptly left Malice Mizer at the height of the band’s success in 1999, began pursuing a career as an actor and solo artist, and is currently one of Japan’s best known pop idols. Since his time apart from Malice Mizer, Gackt has been making regular alterations to his style: his hair has morphed from straight, long, and jet-black to blonde and spiky in the blink of an eye, and he has experimented with nearly every shade of red and brown in between. His naturally brown eyes frequently change colour thanks to habitual use of green or blue contact lenses. Yet whether he plays a gang leader (Moon Child), samurai warrior (Bunraku), or feudal warlord (Fūrin Kazan), the main trademarks of Gackt’s appearance have remained the same – pale, slender, and virtually ageless.

It is therefore no surprise that Gackt has styled himself on, and even provided a model for, many bishounen of the manga, anime, and video game industries. Characters from The Rose of Versailles, Rurouni Kenshin, and Final Fantasy VII, among other titles, have all been incorporated into his look at one stage or another, to Gackt’s rising popularity. Sentiments like those referenced by anthropologist Laura Miller in Beauty Up towards stars such as Gackt (‘He has a body so beautiful it’s like an art object … I’m filled with fantasies of the excitement that would happen if we were in bed’) are far from unusual among fans (Miller, 2006, p. 156). Strictly speaking, the cut-off age for bishounen-hood is eighteen, at which point one becomes a biseinen instead – a beautiful man, usually described as more handsome than pretty. Now in his forties, Gackt, still flaunting the cool delicacy of his features, is living proof that age is not necessarily a barrier to adhering to the bishounen style.

Neither are Gackt’s charms restricted to a female fanbase. In 2010, Gackt announced that his live performance at Club Citta in Kawasaki, Kanagawa, would be for men only, reportedly in an attempt to reverse the recent trend among Japanese males of shunning traditional male stereotypes to get in touch with their feminine side, and instead celebrate ‘the way of the man’ (Japan Today: Japan News and Discussion). Over one thousand men attended the sold-out show, while sixty women listened from the lobby and countless others from outside, cheering the men on as they entered (Ningin).

Other Japanese musical stars have found their fame through group collaboration. While the boy band fad in the West has died down somewhat since the 1990s, pop boy bands in Japan are among the most successful of all genres of Japanese music. J-pop found its way into major mainstream success during the same decade, reaching a commercial peak with individual female artists such as Hamasaki Ayumi and Utada Hikaru as well as with idol units (popular singing and dancing groups), many of them all-male. In particular, the talent agency Johnny & Associates, which exists exclusively to train and promote male idol groups, produced several extremely high-profile groups during this period such as SMAP, Tokio, and Arashi. American boy bands such as The Backstreet Boys and ‘N Sync likewise debuted in the 1990s, only to peak and then either disband or sharply decline in popularity in the first years of the new millennium. In comparison, Japanese boy bands continued to grow in number and reputation, with huge acts like EXILE, NEWS, KAT-TUN, and Hey! Say! JUMP joining the idol unit craze.

As with the Western boy bands of the 1990s, aesthetic appeal continues to be a significant factor in the popularity and marketing of these all-boy Japanese groups, and it is easy to see the similarities between The Backstreet Boys and Hey! Say! JUMP, Westlife and KAT-TUN, or ‘N Sync and Arashi not only in terms of sound, but also in general style. Posters, album covers, and promotional photos depict these bands casually standing or lounging about dressed all in white, for example, as they gaze coolly at the camera. Other images show the band members in jeans and black leather jackets, long coats with scarves draped nonchalantly about their necks, or with the slightly ruffled suit-and-tie look.

Figure 4: Boy bands ‘N Sync and Arashi

However, looking past some of these blinding similarities, there are also some significant differences. For instance, it is difficult to find members of any of these household-name Japanese boy bands with facial hair, while there usually seems to be at least one, and sometimes two or three Western boy band members sporting a well-groomed beard or goatee. The same contrast can be seen with regards to boy band members with piercings or tattoos; the resident ‘bad boy’ of the group is usually evident in a Western boy band, while that figure is conspicuous only by his absence in the Japanese version. Hair tends to be a little longer in Japanese male idol groups, with a particular emphasis on eye-covering fringes and painstakingly placed wisps, whereas only one or two boy band members out of any given group in America might be known for their longer locks.

Overall, Japan’s male groups are typically gentle in appearance, perhaps a little more friendly and accessible. Whilst not precisely androgynous, they are a less extreme version of the bishounen of the visual kei scene. Where The Backstreet Boys and ‘N Sync took pains to keep from appearing too pretty by balancing out the more delicate-looking members of the group with a couple of tougher, more traditionally masculine individuals (and presumably thereby avoiding any gay slurs), the members of KAT-TUN and Hey! Say! JUMP make their living off being beautiful.

Furthermore, groups like this earn their idol status not only by singing, but also by acting in television dramas, appearing on variety shows, hosting charity events, and endorsing products such as Coca-Cola, KDDI Corporation mobile phones, Wii video games, and the Japan Tourism Agency. They are a constant, inescapable presence in nearly all aspects of Japanese daily life, and they take pains to form an image based on their individual talents or personality traits as being a part of a cohesive unit. Predictably, their female fans are both numerous and extremely passionate. In 2010, a Tokyo-based freelance journalist wrote:

‘Then there is Arashi who celebrate each other’s birthdays and vacation together. It seems incredible that Arashi is popular worldwide for simply being good buddies but this kind of interaction is so rarely seen in celebrities. In Japan, the interaction is rehearsed and simulated. Overseas, variety shows specialize in people openly feuding. With these types of entertainment, it is no wonder that the good-natured humor of Arashi, along with their sappy sweet pop songs, is healing the world’ (The Asahi Shimbun Digital).

The young men of Arashi may be a little too conventional to indulge in cosplay or model their looks after specific anime characters, but their style cannot help but be at least indirectly influenced by the bishounen aesthetic. Very few anime bishounen have any trace of facial hair, and as has been previously discussed, these types of characters are well-known for their slender frames, unblemished skin, glossy hair that falls just so over the eyes, and a coolly tantalizing aura. The similarities are hard to overlook.

It is apparent that there has been some crossover of the appeal of the bishounen in today’s Western entertainment industries with newer boy bands such as British group One Direction, although the most notable increase with regards to the popularity of pretty boys has been seen in the film industry thanks to the widespread popularity of franchises like Twilight. Although most non-Japanese teenage girls may not know the meaning of the word bishounen or have any understanding of what anime or manga is, the traditional sex appeal of the rough, tough, rugby-player style body currently competes against the slim, milky-white skinned young male as so obviously embodied in the character of Edward Cullen.

The film versions of the Twilight novels chose to amplify the tension already seen in book format, where two young men compete for one girl’s romantic interests and the heroine constantly bounces between the two, who are physically complete opposites. Edward is exceptionally slim, pale to the point of being sickly-looking, and has an aura of cold intensity about him even after becoming romantically involved with Bella. The very name Edward, which roughly translates to ‘wealthy guard’, conjures images of English nobility and old world romanticism. Conversely, Jacob Black is of Native American descent, and has dark hair and eyes and russet skin. A tribal tattoo on his right arm completes the slightly roguish look. Although he is originally described as tall and lanky in the first book, the films portray him as relatively muscular; a fact that is only accentuated by his usual style of clothing – or lack thereof. Where Edward is cool, Jacob is passionate and adventurous, and where Edward turns into a sparkly, ostensibly prettier version of himself, Jacob quite literally transforms into a wild animal.

As was no doubt the intention, the competition between Edward and Jacob transcended the screen and became embedded in popular culture. Did main protagonist Bella – and by extension, the audience – lust after the beautiful, sharp-edged Edward, or did she prefer the brawny, more earthy charms of Jacob? Did fans desire pretty, or lust over handsome? Posters, shirts, and an array of other types of merchandise proudly display an allegiance of either Team Jacob or Team Edward, and have been snapped up by teens and tweens in their thousands.

Ultimately, however, it was Team Pretty who won the race, winning not only the girl but also, in overwhelming numbers, the most fans. In a poll carried out in 2008 by Novel Novice Twilight, a website dedicated to exploring the relationship between the Twilight series and its fans, Team Edward won by nearly double the score, earning over five thousand votes (Twilight Novel Novice). In 2009, Robert Pattinson was chased into traffic on a New York City street by a mob of frenzied fans, and the following year, People magazine listed Pattinson in their “World’s Most Beautiful” issue because of his ‘pale, otherworldly complexion’ (New York Daily News, People).

Unsurprisingly, the anime bishounen who also fit the unearthly beautiful vampire mould are numerous: Zero, Kaname, and just about every other male vampire from Vampire Knight, Solomon Goldsmith and Hagi from Blood+, Shido and Cain from Nightwalker: Midnight Detective, and Trinity’s Blood’s Abel and Cain Nightroad, to name just a few. This is not to suggest that Stephenie Meyer was directly influenced by anime or the figure of the Japanese bishounen, but rather that due to the current influence of anime in international popular culture, non-Japanese audiences are becoming more receptive to the pretty boy as one ideal of male beauty.

Figure 5: Trunks from popular male-orientated anime Dragon Ball Z

However, although the sheer popularity of characters such as Edward Cullen would appear to indicate that the West is becoming more open to pretty young men being an acceptable form of heterosexual attractiveness, it also illustrates that we are far from being able to think of prettiness as a form of real masculinity. The hate that has been directed towards Pattinson/Edward Cullen suggests that traditional notions of masculinity have not been eclipsed, and there is a substantial reaction against bishounen-type characters in the West despite the undoubted popularity of the figure amongst teenage girls. While we usually insist on polarising prettiness and masculinity, in Japan this does not seem to be an issue. Japanese studies Professor Kenneth G. Henshall points out that ‘deliberately enhanced “effeminate”, flower-like, graceful beauty has rarely been considered the antithesis of manliness in Japan, either by women or men themselves’ (Henshall, 1999, p. 4). Mainstream notions in Anglo-American society that correlate being pretty and being homosexual are gradually changing, yet slurs such as “pretty boy”, “queen”, and “fairy” are still commonly applied to men who are perceived as being too feminine in appearance, or who are especially fastidious about their physical presentation – regardless of whether this has any kind of connection to sexual orientation. In contrast, Miller has written, ‘I do not see current male beauty practices [in Japan] as a type of “feminization” of men … but rather as a shift to beautification as a component of masculinity’ (Miller, 2006, p. 126).

Given that manga, while rising in awareness and popularity in America and elsewhere, is nowhere near approaching the kinds of sales figures it achieves in Japan, this should not come as a surprise. Manga makes up nearly forty percent of total book sales in Japan, and to a large extent is responsible for normalising the bishounen aesthetic (Craig, 2000, p. 110). The bishounen has been a central figure for much of manga and anime’s modern history and is not limited to genre or demographic, even appearing in titles aimed at a primarily male audience (Trunks from Dragon Ball Z, Sesshomaru from InuYasha, and Sasuke and Itachi from Naruto, for example) as well as in the conventional female-orientated fare. However, the influence of the young female consumer in Japan should not be underestimated, and much of the entertainment industry caters to her tastes and desires. Whilst the stories depicted in anime and manga may not be a direct reflection of Japanese society, the prevalence of the bishounen has undeniably gone a long way in giving society the okay to emulate the look without being frowned upon or ridiculed for it.

Perceptions are gradually shifting in the West as well, and in many cases it appears that pretty is becoming the new brand of sexy for men. However, without the same sort of normalisation that Japan enjoys, it is doubtful whether the gap between male beauty and stereotypes of weakness and femininity will be bridged to the same extent in the near future. The bishounen outside of Asia is beginning to gain some currency, but ‘safe’ Western notions of duality – masculine and feminine, heterosexual and homosexual – may be too ingrained to ever be disregarded completely.

Notes


i Both yaoi and boys love are popular terms used to describe fictional media (usually referring specifically to anime and manga) that focus on homoerotic male relationships. The genre is largely created by and for a heterosexual female audience, and is distinguishable from what is commonly known as gei comi, bara, or men’s love, which caters to a gay male audience and tends to be created primarily by homosexual male artists.

ii Eventually, still finding similar problems, all male actors became required by the authorities to shave their hair in the style of mature men so that they would be less attractive to their audience.

 

References

Buckley, Sandra (2002). Encyclopedia of Contemporary Japanese Culture, Taylor & Francis, London.

Bullough, Bonnie (1993). Cross Dressing, Sex, and Gender, University of Pennsylvania Press, Pennsylvania.

Craig, Timothy J. (2000). Japan Pop!: Inside the World of Japanese Popular Culture, M. E. Sharpe, New York.

Henshall, Kenneth G. (1999). Dimensions of Japanese Society: Gender, Margins and Mainstream, Palgrave Macmillan, New York.

Miller, Laura (2006). Beauty Up: Exploring Contemporary Japanese Body Aesthetics, University of California Press, California.

Pflugfelder, Gregory M. (1999). Cartographies of Desire: Male-Male Sexuality in Japanese Discourse, 1600-1950, University of California Press, California.

Sansom, George Bailey (1958). A History of Japan to 1334, Stanford University Press, California.

Scott, Adolphe Clarence (1999). The Kabuki Theatre of Japan, Courier Dover Publications, New York.

Steele, Valerie and Park, Jennifer (2008). Gothic: Dark Glamour, Yale University Press, Connecticut.

The Great Happiness Space: Tale of an Osaka Love Thief. 2006. Directed by Jake Clennell. United Kingdom. Jake Clennell Productions.

High Music XRD. History of Visual. http://www.xrdnet.com/xjapan/x_visual_e.php (accessed 26 March 2012).

Japan – Facts and Details. HOSTESSES, HOSTS AND STRIPTEASE IN JAPAN. http://factsanddetails.com/japan.php?itemid=673&catid=19&subcatid=127 (accessed 26 March 2012).

Japan Today: Japan News and Discussion. Gackt to hold concert just for men. http://www.japantoday.com/category/entertainment/view/gackt-to-hold-concert-just-for-men (accessed 26 March 2012).

New York Daily News. Robert Pattinson hit by a taxi while running away from fans. http://articles.nydailynews.com/2009-06-18/gossip/17925942_1_robert-pattinson-rob-pattinson-filming (accessed 26 March 2012).

New York Times. Clubs Where, for a Price, Japanese Men Are Nice to Women. http://www.nytimes.com/1996/09/08/style/clubs-where-for-a-price-japanese-men-are-nice-to-women.html (accessed 26 March 2012).

Ningin. Gackt’s men-only show SELLS OUT! http://blog.ningin.com/2010/03/22/gackts-men-only-show-sells-out/ (accessed 26 March 2012).

People. World’s Most Beautiful 2010! – Robert Pattinson. http://www.people.com/people/package/gallery/0,,20360857_20364406,00.html#20776540 (accessed 26 March 2012).

Pripix. Tokyo Hosts. http://www.pripix.com/features/hosts.htm (accessed 26 March 2012).

The Asahi Shimbun Digital. Arashi are more than just pretty boys. http://www.asahi.com/english/TKY201010260400.html (accessed 26 March 2012).

Topics in Japanese Cultural History. The Heian Period Aristocrats. http://www.personal.psu.edu/faculty/g/j/gjs4/textbooks/172/ch3.htm (accessed 26 March 2012).

Twilight Novel Novice. Team Edward VS Team Jacob. http://twilightnovelnovice.com/specials/contests-projects/novelnoviceprojects/team-edward-v-team-jacob/ (accessed 26 March 2012).

UK Fashion, Lifestyle & Beauty Blog. A look into Host Clubs. http://blooomzy.blogspot.com/2010/04/look-into-host-clubs.html (accessed 26 March 2012).

Bio:

Christy Gibbs is a graduate of the University of Waikato in New Zealand, and has recently completed her doctoral thesis, which explores representations of sexuality in contemporary Japanese animation. She is currently working in rural Japan as an Assistant Language Teacher and is also a regular columnist for Forces of Geek, a blog focusing on a variety of pop and geek culture worldwide.

 

The Single Female Intruder – David Surman

Abstract: This essay examines a contemporary cultural icon that operates across distinct media boundaries, as a kind of transmedia archetype. Of interest is the visuality of what I call the ‘single female intruder’, which emerges as the intersection of a variety of low cultural forms, and has its origins in the Japanese visual and literary culture of the nineteenth century. What are the characteristics of the single female intruder? She wears closely fitted clothing, which describes the shape of her body, though she is tall, willowy and androgynous. She comes equipped with a variety of powerful weapons and technologies that she keeps secreted away on her person, and combines this armoury with expert knowledge of a variety of relevant disciplines. She is always proficient in martial arts, though her willingness to fight is measured against the dramas of her past, tempering the speed of her sword-hand. Her movement is characterised by an impossible elegance, and she seems preternaturally adapted to exploit any space that she comes to occupy. The technologies she deploys are an extension of the physical body, and never encumber her.

Figure 1: Vanessa Z. Schneider in the videogame P.N.03 (2003)

Introduction

Within the generic realities of film, animation, games and comic books, there are many varied female archetypes. Indeed, the representation of women in the media inevitably segues into the active discussion of typologies. The distribution of such types falls within the predefined boundaries of high and low, popular and peripheral, men’s and women’s culture. The effect and ideology of certain types have been actively debated in the humanities, and in particular in feminist criticism. Tanya Krzywinska has outlined the way in which cultural analyses of action heroines have orientated toward the critique of such icons as role models, within the frame of identity politics (Krzywinska, 2005, p. 3). In her critique of action heroines within videogames, she suggests that the critique of representation is limited insofar as it fails to describe the dimensions of play and control that underpin the videogame experience.

This essay examines a contemporary cultural icon that operates across distinct media boundaries, as a kind of transmedia archetype. Of interest is the visuality of what I call the ‘single female intruder’, which emerges as the intersection of a variety of low cultural forms, and has its origins in the Japanese visual and literary culture of the nineteenth century. With the ‘recentering’ of globalised media from its traditional North American power-base toward new Asian counterparts (that has come as a consequence of sustained growth in Japan’s media and cultural industries), such icons have been disseminated to receptive western audiences. The characteristics of the single female intruder are defined as a consequence of the media that converge to form the transmedia space of contemporary popular culture. Their positioning as low cultural forms unifies the constituent fields that converge in the figure of the ‘single female intruder’.

What are the characteristics of the single female intruder? She wears closely fitted clothing, which describes the shape of her body, though she is tall, willowy and androgynous. She comes equipped with a variety of powerful weapons and technologies that she keeps secreted away on her person, and combines this armoury with expert knowledge of a variety of relevant disciplines. These will usually include computer programming, reconnaissance, research and investigation. She is always proficient in martial arts, though her willingness to fight is measured against the dramas of her past, tempering the speed of her sword-hand. Her movement is characterised by an impossible elegance, and she seems preternaturally adapted to exploit any space that she comes to occupy. The technologies she deploys are an extension of the physical body, and never encumber her.

She is an amalgam of high trash clichés and narrative conceits; often orphaned, wracked by bereavement, seeking vengeance, driven by the urgency of an incurable illness. Such melodramatic tropes are buried beneath the sobriety and perfection of grey-white skin, expressionless and captivating. She is two people in one body; the face of an angel, the heart of a demon; but never duplicitous, her expressions of emotion are sincere and forthright, often taking place in secluded confessionals away from the song of carnage. She is never the homemaker, though the riddle of such happiness might emerge in moments of reprieve. She is a nomad, constantly on the move, often moving out of the frying pan and into the fire. She is more a heroine of generic reality than everyday life, a celebration of the seductive tropes of contemporary fiction and the intermingling of technology, imagination and desire.

The single female intruder is so ubiquitous in contemporary popular culture that an examination of her sophisticated rhetoric is necessary. In the course of this article, I want to show how such an internationalised, post-modern archetype, which seemingly operates outside of any clearly defined cultural boundaries, has origins in pre-modern Japanese culture. I shall argue that the history of this archetype can be seen as metonymic of the changing post-war relationship between American hegemony and the rise of Japanese popular culture as a new global centre. The proliferation of this archetype follows a very particular path, and its movement can be traced from aesthetic reforms in Japanese antiquity, subsequently retrieved in the 1970s by filmmakers and mangaka eager to revisit the culture of the Edo period. Hiroki Azuma has described how this internal appropriation of Edo period aesthetic and cultural values comes as a consequence of the cultural anxieties arising as a response to wartime defeat and American occupation. He writes,

Their preference toward the association between the 80s postmodern society and the premodern Edo can be easily explained once you recognize the abovementioned process of “domestication” of the postwar American culture. In the mid 80s, many Japanese were fascinated with their economical success and tried to erase or forget their traumatic memory of the defeat in World War Two. The re-evaluation of Edo culture is socially required in such an atmosphere (Azuma, 2001, np).

As I will explain, the tropes of ‘rikyu grey aesthetics’ and ‘the poison woman’ are retrieved and then celebrated within the generic reality of Japanese popular culture from the 1970s onwards. The ambiguous, seductive and controversial qualities of this historical figure consequently circulate within the growing international fandom for Japanese popular culture. From there, contemporary influences imbue this peculiarly Japanese anti-heroine with a new agency, to embody principles of control and beauty in an age of technological anonymity and information terrorism. Influences that immediately spring to mind include videogames, action cinema, exploitation cinema, science fiction literature, in particular cyberpunk, fetish clothing and the goth, techno and electronic music scenes. Contemporary single female intruders reveal the traces of their Japanese antecedents in their sober demeanour, snow-white skin and mobile technologies. Like the massively successful franchises Pokemon and Yu-Gi-Oh!, the single female intruder is an ambassador for an alternative set of generic parameters in popular culture that assert the Japanese aesthetic, and is resolved in the interaction of multiple cultural centres.

In the first section of this paper, I will explore the Japanese antecedents to the single female intruder, with an emphasis on the relationship between simultaneous reforms in attitude to both colour and femininity. From there, I examine how Japanese film and literature of the mid-to-late twentieth century transformed this figure into a modern heroine first through exploitation, and then science fiction. I then want to examine briefly the transformation of this figure in the science fiction film and literature of 1980s America and Europe. The representations and descriptions generated by the likes of Ridley Scott and William Gibson play a central role in Japan’s imagining of itself and its iconography. To conclude, I examine how digital culture and convergence have effected the transformation of the single female intruder, and how her sophisticated rhetoric has been transformed to speak to our contemporary environment.

Poison Woman Dressed in Rikyu Grey

Figure 2: Hishikawa Moronobu “A Standing Woman”, c.1690.

The prehistory of the single female intruder archetype is much more culturally specific than it might first seem, since such characters nowadays enjoy an international audience. The archetype emerges from the changes in the construction of cultural attitudes to beauty and femininity around the time of the Meiji reformation of Japan. Single female intruders are invariably rebels, whether they are escaping societal reforms, in the case of Trinity in The Matrix trilogy (1999; 2003; 2003) or the eponymous Aeon Flux (2005), complex mercenaries like Vanessa Z. Schneider (fig.1) in the videogame P.N.03 (2003), or living technologies driven by existential angst like Major Motoko Kusanagi of Ghost in the Shell (1995).

Christine L. Marran has described the origins of what she has coined the ‘Poison Woman’, in stories made popular during the Meiji reformation (1868–1912) of the nineteenth century. They profile the lives of sensational women who had caused some sort of scandal, more often than not through the murder of a spouse, and who were perhaps guilty of involvement in other high-profile vices. She writes,

The long and changing tradition of writing about female criminals began with the rise of the newspaper serial. With such colourful nicknames as Demon Oden, Night Storm Okinu, Viper Omasa, and Lightning Oshin, to name only a few, the first poison women appeared as anti-heroes in Japan’s earliest serialized newspaper stories. These serials were based on the lives and crimes of real women. (Marran, 2007, p. xv)

The media furor around the activities of female criminals far exceeded the number and frequency of their activities, such was the public appetite for this new sensational fiction. Fiction and reality intermingled from the outset. As Marran asks, ‘What national obsessions are articulated through this interest in the female convicts?’ (Ibid.). The rise of the poison woman archetype in Meiji period culture coincides with substantial changes in the representation of women in the woodblock prints of ukiyo-e artists. These changes would complicate the rhetoric surrounding such controversial women. In the Genroku era (1688–1704) the artist Hishikawa Moronobu (1618–1694) was one of the pioneers of the ukiyo-e printmaking craft, and was known for his portraits of women and lifestyle scenes. In his imagery the women are voluptuous and feminine, shown in brightly coloured, voluminous robes (fig.2). In the later An’ei-Tenmei era (1772–1781; 1781–1789) the work of artist Suzuki Harunobu (1724–1770) departs from this archetypal, highly feminised aesthetic, and instead portrays women with long, slender bodies, demure faces and a spiritual intensity (fig.3). Kisho Kurokawa writes that,

This trend is of particular interest because it suggests the progressive denial of the generous voluptuousness that symbolized the prosperity and material abundance of pre-modern Japan up until Genroku. The An’ei/Tenmei aesthetic, on the other hand, was characterised by a nonsensual, eccentric, and non-physical beauty, expressing the spirit of an age of more refined ambiguity and a sophisticated rhetoric. (Kurokawa, 1997, p. 161)

Figure 3: Suzuki Harunobu “Crow and Heron, or Young Lovers Walking Together under an Umbrella in a Snowstorm”, c. 1769

 

This new aesthetic of ambiguity, which pervades Harunobu’s prints, becomes the face of the poison woman. Her crimes and misdemeanours are complicated and intensified by the aesthetic coding of this new feminine rhetoric. Marius B. Jansen writes of these ukiyo-e prints that, ‘The ladies they portray are not full faced, something the carver could not provide, but minimalist sketches; they return our stares unblinking and uninvolved. We admire them but do not relate to them, somewhat the way Saikaku’s readers regarded his characters’ (Jansen, 2002, p. 180). Earlier trends in popular aesthetics inform the recurrent representation of the poison woman in ukiyo-e printworks and in newspaper stories of the period. In the period preceding the Genroku era, a sudden fashion for the colour grey emerged in Japanese society, as a result of the cultural reforms to the tea ceremony introduced by Sen no Rikyu (1522–1591). Jansen writes, ‘Sen no Rikyu, who served as chief tea master to both Nobunaga and Hideyoshi […] was a figure who combined considerable personal wealth with a cult of simplicity and modesty that he codified in the tea ceremony of his day’ (Jansen, 2002, p. 117). Part of this revision of the ceremony was the advocacy of the colour grey in clothing and décor. Kurokawa confirms the connection between tea ceremony reforms and the emerging taste for minimalism and grey,

Whereas until this time grey had been considered a vile colour conjuring up the image of rats and ashes, upon becoming known as Rikyu grey it was better appreciated. In the mid-Edo era it gained tremendous popularity—along with brown and indigo—as the embodiment of the aesthetic ideal of iki. Iki in this period is a complex concept but may be conveniently described as “richness in sobriety.” As the cult of tea spread beyond the upper classes to be practiced in the homes of ordinary people, so did the taste for grey. (Kurokawa, 1997, p. 160)

In his rehabilitation of Rikyu grey as an aesthetic category in its own right, Kurokawa emphasises the colour’s essential ambiguity, at times sinister, charming and charismatic. He describes how, ‘In contrast to the grey in the West, which is a combination of black and white, Rikyu grey was a combination of four opposing colours: red, blue, yellow and white’ (Kurokawa, 1991, p. 70). And so, the construction of the ‘poison woman’ in Meiji period mass culture intersects with two crucial aesthetic reforms: the adoption of Harunobu’s slender, ambiguous figure in the representation of women, and the rise of the widespread fashion for Rikyu grey, which emerged from reforms to the tea ceremony that emphasised simplicity, austerity and sobriety.

The Blizzard from the Netherworld

Figure 4: Yuki in Lady Snowblood (Shurayukihime, Toshiya Fujita, 1973)

I want to make a leap now to postwar Japan, where the domestic influence of American occupation was having an effect on popular culture. Tensions arising from wartime defeat, aggressive industrialisation and urbanisation and a sense of cultural dissipation motivated media producers to rehabilitate narratives and character archetypes from the Edo period, as a means of cultural recovery and national reflection. The three tropes (the poison woman archetype, Harunobu’s willowy bodies, and the aesthetic sobriety of Rikyu grey) are consolidated in Yuki Kashima (fig.4), heroine of the Japanese exploitation film Lady Snowblood (Shurayukihime, Toshiya Fujita, 1973). Fujita’s film, based on the manga by Kazuo Koike, follows the journey of Yuki, played by Meiko Kaji, who seeks bloody vengeance for the rape and murder of her mother and father at the hands of a gang of bandits. She is the quintessential poison woman, and her exploits are publicised in the course of the film by newspaper reporter Ashio Ryuhei. The sophisticated and ambivalent quality of Yuki, and also the actress Meiko Kaji, is captured by Rikke Schubart, who writes,

The star persona of Meiko Kaji is located between the extraordinary powers of a castrating gaze and the existential malaise of a female killer. Kaji’s characters are haunted, if not by the past, but by a sense of not belonging, of being out of place and out of time. In this, they resemble the mythic hero. They are exceptionally beautiful, yet out of reach emotionally. Their weapon skills are at the expense of inner balance. They move faster than any opponent but lose track of life. (Schubart, 2007, p. 119)

The cult appeal of Asian exploitation heroines such as Yuki had the effect of reenergizing the antiquated archetype of the poison woman, along with the sensibility of Rikyu and the aesthetics of Harunobu. Poison women exist in every age, but the sword-wielding she-demon of the Edo period had a romantic appeal all of its own. The unsettling and arresting beauty of her skin, and the ghostly perfection of Yuki’s ‘whitewashed-wall weave’ kabe shijira kimono, dominate the mise-en-scène. Suddenly, she breaks her repose to flip into action and attack; fountains of blood arc across the frame, her kimono drips wet, marking her as victorious in auspicious red and white.

Lady Snowblood marks the overlap between the icon of poison woman and what I call the ‘single female intruder’. Concealed within her umbrella, her secret sword is idiosyncratic, and operates within a sophisticated rhetoric that emphasises not only martial power, but also skills in deception, persuasion and elegance. The attraction of the character arises from repeated emphases on sharp contrasts, and this is continuous with the expanded principle of Rikyu offered by Kurokawa. Her subordinate shuffle is broken by sudden and supernatural agility; her sword strikes are unwavering, and land with the spirit of hissho (absolute victory). The vacillation between opposites characterises the single female intruder; she has brutality and elegance, bloodlust and sobriety, movement and stillness in equal measure. Kurokawa connects this principle to the baroque; he writes, ‘In his book on the baroque, Eugenio D’ors states that when conflicting intentions are bound together in a single motion, the resulting style is by definition baroque’ (Kurokawa, 1997, p. 170). Later he adds that, ‘The “baroque” essence to which I refer is represented by the mutual resistance and harmony of weight and drift, stillness and movement, straight and curves lines’ (p. 175).

American Idols

Post-war industrialisation and the rise of commodity culture have placed technology at the centre of the Japanese popular imagination. At the same time as filmmakers like Fujita withdrew into the images of Edo Japan to draw sustenance, others, like manga and anime artist Osamu Tezuka, were thinking forward into imaginary futures, populated by the dream of robot, cyborg and alien life. The ‘single female intruder’ is the recombination of these two sensibilities, at once strongly reminiscent of her Edo counterparts, and also situated within film or gameworlds that are nonetheless ostensibly works of science fiction. She emerges as a coherent iconic figure in the 1980s. The transformation of the poison woman into the single female intruder takes place in the figure of Molly Millions in William Gibson’s short story Johnny Mnemonic (1981), and in the character of Pris in Ridley Scott’s Blade Runner (1982). Gibson’s lifelong obsession with Japanese culture is evident throughout his literature to date, and it carries traces of the multifaceted concept of the poison woman. Taken for granted, moreover, is the place of Rikyu grey, both literally as a colour sense, and as a philosophy of ambiguity and contrasts, and the idealism of Harunobu’s slender courtesans. The entrance of Molly Millions echoes that of Yuki in Lady Snowblood. The same emphasis on concealed technology, and a lethal capability, shrouds the character in a mist of ambiguity and tightly wound sexuality.

‘Hey,’ said a low voice, feminine, from somewhere behind my right shoulder, ‘you cowboys sure aren’t having too lively a time.’

‘Pack it, bitch,’ Lewis said, his tanned face very still.

Ralfi looked blank.

‘Lighten up. You want to buy some good free base?’

She pulled up a chair and quickly sat before either of them could stop her. She was barely inside my fixed field of vision, a thin girl with mirrored glasses, her dark hair cut in a rough shag. She wore black leather, open over a T-shirt slashed diagonally with stripes of red and black.

‘Eight thou a gram weight.’

Lewis snorted his exasperation and tried to slap her out of the chair. Somehow he didn’t quite connect, and her hand came up and seemed to brush his wrist as it passed. Bright blood sprayed the table. He was clutching his wrist white-knuckle tight, blood trickling from between his fingers.

But hadn’t her hand been empty? (Gibson, 1981, p. 18)

The description of Molly emphasises her stature and costume, and the scene is characterised by an anxious stillness, which breaks into sudden action. Like Yuki’s hidden sword, Molly’s ‘weapons’ aren’t disclosed, but their effect enjoys a glorious description, again reminiscent of the exploitation film aesthetic of bloody carnage found in Lady Snowblood. Later, the secrets of Molly’s fatal frame are laid bare:

‘Chiba. Yeah. See, Molly’s been Chiba, too.’ And she showed me her hands, fingers slightly spread. Her fingers were slender, tapered, very white against the polished burgundy nails. Ten blades snicked straight out from their recesses beneath her nails, each one a narrow, double-edged scalpel in pale blue steel. (p. 21)

Molly’s finger blades are like Yuki’s concealed sword, in that they form a highly personalised accessory crucial to their survival in a world that is largely hostile to them. Through them their bodies become ‘trick machines’ designed to entrap, confuse, and terrorise their opponents. The complex rhetoric of hidden capability runs through the single female intruder, and is most apparent in the gynoid half-machine characters that have appeared since Molly first took to the streets of Chiba.

Transnational Assassins

Figure 5: Beatrix in Kill Bill (Quentin Tarantino, 2003)

Within the generic reality of convergent media culture, the tropes of the single female intruder have folded in on themselves, and, while the poison woman was penned in direct relation to the changes in society, the single female intruder of recent film and game texts is not so motivated to comment on changes in culture. She operates, like Beatrix in Tarantino’s Kill Bill films, within the “movie-world”, that is, within the circular distribution of generic styles, codes and conventions.

While the single female intruder certainly develops, in contemporary digital culture, the aesthetic, form and rhetoric of the femme fatale and other types of female killer (see Schubart, 2007), my interest lies with the long history that underpins her making, and the politics of globalisation she traverses. Her seductive deadly methods evoke fear outside of the textual worlds she inhabits, since she, like the ninja kids of Naruto, is an iconic player in the global media game, and is metonymic of the massive changes taking place in the landscape of media power. Koichi Iwabuchi writes that,

Japan’s hitherto odourless cultural presence in the world has become more recognizably “Japanese” as computer games and animation from Japan have grabbed large shares of overseas markets. Japan’s success in exporting cultural products that are unmistakably perceived as “Japanese” have evoked a sense of yearning and threat overseas, including fear of cultural invasion (Iwabuchi, 2004, p. 59).

The single female intruder has emerged as the most prominent action heroine type in recent years, with films released that seek to comment on our technologically driven, information culture. Her independent agency, computer expertise and athletic finesse position the single female intruder as a dominant fantasy of control for our time. Connecting body politics, privacy issues, technology and gender relations in the actions of this subtly orientalized superhero, contemporary media producers have created a figure as pertinent to our time as the muscle-bound action hero was to the 1980s. While the ‘high trash’ of summer blockbusters, videogames and exploitation films might suggest that the single female intruder is nothing more than techno-fetish and titillation, I hope to have shown, through an emphasis on her origins in Japanese aesthetics, that such characters are playing an instrumental role in the reorganisation of gendered heroism within transmedial representation.

 

Games

Bullet Witch (Cavia, Inc./Atari, AQ Interactive, 2007)

Final Fantasy 12 (SquareEnix, 2006)

Ghost in the Shell (Exact/THQ, 1998)

Gun Valkyrie (Smilebit/BigBen Interactive, 2002)

Ico (Team Ico/SCE, 2002)

Oni (Bungie Studios/Rockstar Games, 2001)

P.N.03 [Product Number Three] (Capcom Production Studio 4/Capcom, 2003)

Panzer Dragoon Orta (Smilebit/Sega, 2003)

Panzer Dragoon Saga (Team Andromeda/Sega, 1998)

Perfect Dark (Rare/Rare, 2000)

Perfect Dark Zero (Rare/Rare, 2005)

Rez (United Game Artists/Sega, 2001)

Space Channel 5 (United Game Artists/Sega, 2000)

Space Channel 5: Part 2 (United Game Artists/Sega, 2003)

Tenchu: Fatal Shadows [Tenchu: Kurenai] (K2 LLC/Sega, 2005)

Tomb Raider (Core Design/EIDOS, 1996)

Films and Anime

Aeon Flux (Karyn Kusama, 2005)

Aeon Flux [Animated Series] (Peter Chung, 1995)

Blade Runner (Ridley Scott, 1982)

Ghost in the Shell (Mamoru Oshii, 1995)

Ghost in the Shell 2: Innocence (Mamoru Oshii, 2004)

Ghost in the Shell: Stand Alone Complex (Kenji Kamiyama, 2002-2003)

Ghost in the Shell: Stand Alone Complex 2nd Gig (Kenji Kamiyama, 2004-2005)

Shurayukihime [Lady Snowblood: Blizzard from the Netherworld] (Toshiya Fujita, 1973)

Shurayukihime: Urami Renga [Lady Snowblood 2: Love Song of Vengeance] (Toshiya Fujita, 1974)

Sympathy for Lady Vengeance [Chinjeolhan Geumjassi] (Chan-wook Park, 2005)

The Matrix (The Wachowski Brothers, 1999)

The Matrix: Reloaded (The Wachowski Brothers, 2003)

The Matrix: Revolutions (The Wachowski Brothers, 2003)

Manga

Kurata, H. Yamada, S. (2000 – present) Read or Die. Tokyo: Shueisha.

Shirow, M. (1989 – 1991) Ghost in the Shell. Tokyo: Kodansha.

References

Azuma, H. (2001). Superflat Japanese modernity. Retrieved [August, 01, 2007] from <http://www.hirokiazuma.com/en/texts/superflat_en1.html>

Gibson, W. (1981) Burning Chrome. London: Voyager.

Iwabuchi, K. (2002). Recentring globalisation: Popular culture and Japanese transnationalism. London: Duke University Press.

Iwabuchi, K. (2004). How Japanese is Pokémon?. In J. Tobin (Ed.), Pikachu’s global adventure: The rise and fall of Pokemon. London: Duke University Press. pp. 53-79.

Jansen, M. B. (2000) The Making of Modern Japan. London: Harvard.

Krzywinska, T. (2005) ‘Demon Girl Power: Regimes of Form and Force in videogames Primal and Buffy the Vampire Slayer’, New Femininities Seminar Series, London, 9th December.

Kurokawa, K. (1991) Intercultural Architecture: The Philosophy of Symbiosis. Aia Press.

Kurokawa, K. (1997) Each One A Hero: The Philosophy of Symbiosis. London: Kodansha International.

Schubart, R. (2007) Super Bitches and Action Babes. London: McFarland & Company, Inc.

 

Bio:

David Surman is an artist and designer, based in Melbourne, Australia after migrating from the UK. Over the past 10 years he has worked in many different creative environments, and he is currently creative director and co-founder of Pachinko Pictures, an award-winning boutique design studio based in Melbourne. David has also pursued a career as a scholar and teacher, which has given him many more opportunities and challenges. He developed a pioneering degree programme in games design at Newport School of Art (University of Wales), which focused on the principles and processes of art and design for games; and was Lecturer in Multimedia Design at Swinburne University of Technology. David is currently completing a PhD in videogame aesthetics at Brunel University, and holds a Masters in Film and Television from Warwick University and a Bachelors in Animation from the Newport School of Art, Media and Design.

Digital Intervention: Remixes, Mash Ups and Pixel Pirates – Amanda Trevisanut

The art of remix and mash-ups is a contemporary cultural phenomenon that has been facilitated by the mass availability of digital software. Remix effectively describes the process of taking samples of existing media – for example audio tracks, film and television images – and knitting these samples into a new text. The active and creative use of cultural products by individuals challenges the paradigm of the passive spectator that is the corner-stone of traditional film theory. For instance, in the psychoanalytically based theories of Jean-Louis Baudry (1975), Laura Mulvey (1975) and Christian Metz (1983), the cinematic apparatus has been conceptualized as a hegemonic instrument of ideology that interpellates the viewer into the world of the diegesis. The characterization of the spectator as a passive site of cultural and ideological reproduction is mirrored by the legalities of copyright that seek to indemnify the economic rights of the authors and producers of audio-visual media. In Digital Copyright and the Consumer Revolution: Hands off My iPod, legal scholar Matthew Rimmer asserts that, in copyright jurisprudence, the users of audio-visual media are decidedly absent, and that with the advent of digital technology there is an imperative to recognize:

consumers are not just mere ‘culture vultures’, engaged in the mindless, passive, bovine consumption of new artistic forms and technologies. The users of copyright works are engaged in a multitude of activities, including political expression, cultural transformation and technological tinkering. Moreover, the relationship of consumers to the dictates of copyright law is also a complex one, ranging from obedience to resistance and opposition to indifference and ignorance (2007, 13).

Consumers’/users’/spectators’ use of digital software to remediate – meaning that they “adopt aspects of prior, established media” (Ruston 2006) – copyright works draws attention to the failure of traditional theoretical and legal paradigms to recognize spectatorship and/or consumption as a dynamic site of cultural (re)production. The use of digital technology to remix, remediate, re-master, re-imagine and re-member media artifacts into alternative configurations testifies to the interactive engagement of individuals with cultural artifacts by “blurring the boundaries between the real world of the reader/participant and the crafted world of the narrative” (Ruston 2006). The operations and aesthetics of digital technology, of “archives and databases”, ultimately “offer artists a vehicle for commenting on cultural and institutional practices through direct intervention” (Vesna 2000, 155). This essay does not presuppose that the advent of digital technologies has fundamentally altered the ways in which individuals engage with media. Rather, through an examination of Soda_Jerk and Sam Smith’s 2002-2006 film Pixel Pirate II: Attack of the Astro Elvis Video Clone, this essay will aim to show how the specific use of digital software to sample and remix audio-visual images testifies to an existing (if largely theoretically neglected) dynamic relationship between individuals, society and media artifacts.

Between 2002 and 2006, Sydney artists Soda_Jerk – aka Dominique and Dan Angeloro – collaborated with video, sound and installation artist Sam Smith to produce Pixel Pirate II: Attack of the Astro Elvis Video Clone. This sixty-minute “sci-fi / biblical epic/ action movie with a subplot of troubled romance” (sodajerk.com.au/sj/ppii.html) is entirely – and illegally – constructed of samples from Hollywood film, television, popular music, audio tracks, studio trademarks, DVD menus, copyright advertisements, games and online software. Using widely available digital software such as After Effects and Photoshop, Soda_Jerk together with Smith have “remixed” these samples into a narrative that challenges the economic and theoretical paradigm of the passive spectator. The film is set in the year 3001, where a team of Pixel Pirates formulate a plan to combat the evil tyrant Moses and his oppressive Copyright Commandments. In order to continue practicing the ancient art of remix they abduct Elvis Presley from 1955 and create his video clone, who is then sent back to the year 2015 to assassinate Moses. By transforming into the Incredible Hulk, and later into the resurrected Jesus Christ, Elvis completes his mission, but only after he has overcome the Copyright Cops, and an assortment of action heroes including Indiana Jones from Indiana Jones and the Last Crusade (1989), the Ghostbusters from the 1984 film of the same name and its 1989 sequel, Daniel-san of The Karate Kid (1984), Luke Skywalker of Star Wars (1977) and Lara Croft of Lara Croft: Tomb Raider (2001).


Figure 1: Courtesy of the Artists

The process of remix or “mash-ups” is thematically rendered as well as formally employed in Pixel Pirate to narrativise the ways in which digital technologies are being utilised by “consumers” to “engage in self-expression and creative play” (Rimmer 2007, 8). The form and content of the film ultimately challenges the delineation of cultural production and consumption by highlighting the dynamic nature of media, and situates the spectator/consumer/citizen as an agent of narrative meaning.

Soda_Jerk’s sample and remix of filmic icons into an anti-establishment narrative in Pixel Pirate is indicative of how the relationship between cultural production and consumption is being affected by widely available digital technologies. In The Language of New Media, Lev Manovich asserts that: “As we work with software and use the operations embedded in it, these operations become part of how we understand ourselves, and others, in the world. Strategies of working with computer data become our general cognitive strategies” (2001, 118). Manovich uses the term selection instead of sample to indicate how “in computer culture, authentic creation has been replaced by the selection from a menu” or a database of ready-made parts (2001, 124). He uses the term compositing, whereby the selections made are blended to “create continuous spaces out of disparate elements”, to show how remix is influenced by the advent of digital culture (Manovich 2001, 155). This process of selection and compositing is explicated in the companion booklet to the Pixel Pirate DVD. In the chapter entitled “Shot Breakdowns: #2 The Final Showdown”, a single frame from the film’s sequence in which Elvis as the Incredible Hulk is being vanquished by the Ghostbusters is shown to be a composite of six images – or parts thereof (see figure 1).


Figure 2: Courtesy of the Artists

The setting is a mash-up of the Paramount Studios logo, the ominous skyline from the conclusion of Donnie Darko (2001) and the desert from The Ten Commandments (1956). The crowd of debauched spectators and Moses are also from The Ten Commandments, whilst the Ghostbusters are taken from the 1984 film Ghostbusters, and the Incredible Hulk from the Hollywood incarnation of the comic book character in the 2003 film Hulk. The process of selection and compositing inherent to remix is shown by Soda_Jerk to be “transformative”: it remediates artistic forms authored by others in order to create a new product with a different – though related – set of cultural meanings (Rimmer 2007, 140). For instance, by including the Paramount logo in the composition of the film’s final showdown between champions of copyright law and its adversaries, Soda_Jerk manufacture a meta-narrative space (Manovich 2005) that articulates how Hollywood studios are a site of cultural production inhabited by their creations as well as spectators. Consequently, digital media “become simultaneously technical analogs and social expressions of our identity, we become simultaneously both the subject and object of contemporary media” (Bolter and Grusin 2000, 243). The Paramount logo ordinarily appears as an extra-diegetic element at the commencement of a given film to signify authorship and ownership; in Pixel Pirate, however, Paramount is shown to be only one component of the cultural landscape. Soda_Jerk utilise the operations of digital culture to understand the legacy of copyright law, whom it protects, and how this affects the ability of individuals to engage with cultural artifacts.

Although the operations specific to digital software offer new methods and techniques for engaging with and producing filmic narratives, terms such as selection and compositing are not dissimilar to the techniques of postmodernism such as bricolage and parody. Manovich’s statement that “authentic creation has been replaced by the selection from a menu” echoes the argument advanced by Fredric Jameson in his 1985 essay “Postmodernism and Consumer Society”. Here Jameson argues that the remediation of popular images annihilates the original referent, which both jeopardizes historicity through over-mediation and retards the development of an aesthetic that is able to represent “our own current experience” (1985, 117). Following Jameson, Manovich posits that the process of selection naturalises “the flow of a different logic” which displaces the practice of “creating from scratch” (2001, 129). Although I agree with Manovich’s argument that the operations of selection and compositing have become a part of how we understand ourselves and others in the world, his assertion that an “authentic” form of authorship has been displaced is ultimately a utopian myth that he has inherited from the postmodern theory of Jameson. In “Hot Spots, Avatars, and Narrative Fields Forever – Bunuel’s Legacy for New Digital Media and Interactive Database Narrative”, Marsha Kinder refers to the operations of digital software as a “database aesthetic”, and argues that this aesthetic does not alter communicative practices in any fundamental way, but rather “exposes or thematises the dual processes of selection and combination that lie at the heart of all stories and that are crucial to language” (2002, 6). In other words, selection and combination/compositing/remix is an inherent component of both language and authorship.
However, what has changed is how “new digital media and their critical discourse encourage us to rethink the distinctive interactive potential of earlier narrative forms” (Kinder 2002, 6). The ability to replicate, fragment and dismember cultural artifacts, and then remix, re-master, re-imagine, remediate and re-member that media in multifarious combinations not only generates alternative narratives, histories and memories, but also indicates the dynamic quality of media that has already entered the public consciousness.

The process of remix, particularly the operation of selection, used to construct Pixel Pirate elucidates the interactive and experiential nature of film spectatorship. That is to say that the remix reflects the ways that “film [already] circulates in fragmented form throughout not only the exterior landscape of popular culture, but also the interior landscape of the mind” (Columpar 2006). In order to vanquish Moses, Elvis is transformed into the Incredible Hulk.


Figure 3: Courtesy of the Artists

However, before he is able to complete his mandate, he is annihilated by the Ghostbusters. Here Soda_Jerk attribute fragments of disparate films to a single body, collapsing the distinction between screen and spectator, product and consumer. Having foreseen this sticky end, the Pixel Pirates have programmed the Elvis clone so that he will resurrect in three days in the guise of Jesus Christ. The resurrection of Elvis plays upon the cultural myths and conspiracy theories that claim that Elvis did not die on 16 August 1977. The manifestation of Elvis as a Christ figure parodies his mythical status as “The King”, and the religious dedication of his fans, which has kept his image alive for the thirty-one years since his death. In the DVD booklet, Soda_Jerk explain that “[o]ur hero is not the ‘original’ Elvis; it is the Elvis phenomenon – the figure multiplied, mashed and endlessly imitated.” Soda_Jerk utilise the image of Elvis as a symbol of the “ancient art of remix”, which illustrates Kinder’s assertion that the process of selection and combination precedes digital technologies. Although digital technology enables the reproduction, selection and compositing of canonic images and texts, the selection of Elvis as the protagonist of Pixel Pirate signals these operations as a legacy of pre-existing forms of parody, fandom and spectatorship: interactive practices that belie the seemingly hermetic narrative structure of traditional cinema.

Pixel Pirate exemplifies the ways in which artists and consumers challenge the binary relationship of authorship and spectatorship by drawing attention to the character, function and possibilities of imaging and audio technologies in the digital age. In the DVD booklet, Soda_Jerk define remixing as a “conceptual frontier that collapses the archaeology of contemporary commodity culture with the science of time travel”, one which reassembles the fragments of a bygone era to recognise “the hidden forces contained within the outmoded artifacts and myth-systems of the recent past”. Soda_Jerk echo archaeologist Juan Antonio Barcelo’s assertion that archaeologists and historians are “not looking for objects, but actions which produced objects with special features” (2007, 437). Like archaeologists of a more traditional ilk who use archaeological data “to understand the dynamic nature of present society” (Barcelo 2007, 437), Soda_Jerk understand that the legacy of film history bears upon the ideological conditions and embodied experience of individuals in the present. As Paul Arthur asserts, discussion of history in relation to digital technology is “generally dominated by the very practical aspects of information preservation and retrieval” (2006). Soda_Jerk’s narrativisation and act of copyright infringement treats media samples as found cultural artifacts and reassembles them to illustrate the tension that exists between practices of production and consumption, and history and memory in the digital era. In the DVD booklet, Soda_Jerk qualify their practice of remix:

To clear the vast number of samples involved in this project would not only have been astronomically time consuming but also financially impossible. The present cost of sample licensing is notoriously prohibitive…This situation places the art of remix squarely in the hands of those with money – branded artists and corporate advertising. A depressing fate which owes its evolution to fan communities, the avant-garde and Afro-diasporic audio cultures…copyright is not just about cash, it’s also about control. Money doesn’t buy you sample rights unless you’re using those samples in a way that is pleasing to the proprietor (i.e. not mashing Elvis with Jesus). The battle over copyright then is also the battle over history – what is at stake is the very relationship of the past to the present.

Soda_Jerk’s characterisation of copyright as a battle over history reflects the positions of cultural theorists Alison Landsberg (2004) and Marita Sturken (1997), who characterize the immediacy of the moving and photographic image in contemporary culture as inextricable from personal memory, cultural memory and official history. Sturken offers the example of veterans of World War II whose experiences of battle have been subsumed “into a more general script” as a result of watching Hollywood movies that dramatise the war (1997, 6). This example shows how personal experience of media is inextricable from lived experience, and how a relationship to personal history is compromised by laws that prohibit an active engagement with and use of culturally produced audio-visual technologies. By remixing samples from discrete and disparate media texts into the body of a single text, Soda_Jerk illustrate how “texts decreasingly take the material form of durable marks inscribed on paper and increasingly manifest themselves as electronic polarities, the bodies within (and without) electronic documents undergo correlated transformations in embodiment” (Hayles 2004, 257). Like bodies that remember the disparate temporalities of viewing this or that film – memories which are formative of individual experience and identity – Pixel Pirate, like other remixes and mash-ups, comes to represent this postmodern experience of being in a world mediated by audio-visual technology.

Despite this philosophical affinity with archaeological practices, Soda_Jerk exceed the archaeological mandate and employ digital technology to creatively fragment and reassemble popular cultural media and propel the past and present “into a new constellation”, a process that they describe as “retro-futurism”. This new constellation reveals how the new technological frontiers of cinema depend upon the “reflexivity of embodied spectatorship” and not “fantasies of disembodiment and absorption into virtual worlds” (Rabinovitz 2004, 100). Landsberg contends that the affective traces left by experiences of spectatorship facilitate the “conditions for ethical thinking precisely by encouraging people to feel connected to, while recognizing the alterity of the ‘other’” (2004, 9). Landsberg here situates herself in opposition to Jameson by arguing that it is the age of consumerism, of technological reproducibility, that enables the cinema to facilitate a political action because the experiential nature of spectatorship dissolves the differences between authentic and mass-mediated memories (2004, 15). Although Landsberg’s own focus is the potential of cinema to form political alliances between marginalized communities, her recourse to embodied experience to argue that mass reproduction in late-capitalist culture is precisely what enables a political cinema is coextensive with the position articulated by Soda_Jerk. However, Soda_Jerk claim mass-produced visual and aural images as a personal and cultural history, and utilise these images to render a database narrative that subverts the dominant narrative of the passive spectator. By remediating cultural images, Soda_Jerk adhere to Walter Benjamin’s characterization of history which states that to “articulate the past historically… means to seize hold of a memory as it flashes up in a moment of danger…which unexpectedly appears to a man singled out by history” (1968, 255). 
Benjamin regards the subjective and inter-subjective nature of memory as a potent political weapon for affecting “both the content of tradition and its receivers” (1968, 255). The database aesthetic and digital software are utilised in Pixel Pirate to open up narrative possibilities: the act of remix triggers personal memory, cultural memory and official film histories to claim media as a dynamic cultural experience.

As illustrated by Kinder, the rhetoric of digital software operations has offered a new language of interactivity that is able to re-imagine the spectator as a site of cultural production. Furthermore, as was elucidated through an analysis of Pixel Pirate, digital software has offered a new means of expressing the interactive relationship shared between individuals, society and various media. By illegally sampling copyright works using widely available digital software, Soda_Jerk and Smith also exemplify the political potential of contemporary media, directly challenging the status quo. In the DVD booklet, Soda_Jerk conclude by stating:
“The remix is nothing less than a politics of time, and one worth the battle. We believe that we have used each of the samples fairly. But whether our sampling constitutes an act of ‘fair use’ is a matter we can discuss with your lawyers”. What emerges in the stated politics of Soda_Jerk is a tension between the individual and cultural experience of media, and economic and histrionic power structures that rely upon a strict delineation of production and consumption. Pixel Pirate illustrates how access to, and expression through, cultural artifacts is an essential means of understanding contemporary conditions of existence. This is due to the immediacy of audio-visual media in consumer culture, and its affective nature. Remixes and mash-ups utilise digital technologies in a manner that elucidates the ways that bodies are transformed by, and in turn transform, media.

Bibliography

Arthur, P. 2006. “Multimedia and the Narrative Frame: Navigating Digital Histories”. Refractory: A Journal of Entertainment Media 9 (July), http://blogs.arts.unimelb.edu.au/refractory/2006/07/04/multimedia-and-the-narrative-frame-navigating-digital-histories-paul-arthur/

Barcelo, J. A. 2007. “Automatic Archaeology: Bridging the Gap Between Virtual Reality, Artificial Intelligence, and Archaeology.” Theorizing Digital Culture: A Critical Discourse, edited by Fiona Cameron and Sarah Kenderdine, 438-454. Cambridge, MA, USA: Massachusetts Institute of Technology.

Baudry, J. 1986. “The Apparatus: Metapsychological Approaches to the Impression of Reality”. In Narrative, Apparatus, Ideology: A Film Theory Reader, edited by Phillip Rosen, 299-318. Originally published in 1975 in Communications (23) and translated in 1976 in Camera Obscura (Fall), (1): 104-28.

Benjamin, W. 1968. “Theses on the Philosophy of History”. In Illuminations, 253-264. New York: Harcourt Brace Jovanovich Inc.

Bolter, J. D. and R. Grusin. 1999. “The Remediated Self”. In Remediation, 230-241. MIT Press.

Columpar, C. 2006. “Re-Membering the Time-Travel Film: From La Jetee to Primer”. Refractory: A Journal of Entertainment Media 9 (July), http://blogs.arts.unimelb.edu.au/refractory/2006/07/04/re-membering-the-time-travel-film-from-la-jetee-to-primer-corinn-columpar/

Hayles, N. K. 2004. “Bodies of Texts, Bodies of Subjects: Metaphoric Networks in New Media”. Memory Bytes: History, Technology, and Digital Culture, edited by L. Rabinovitz and A. Geil, 257-282. Duke University Press.

Jameson, F. 1985. “Postmodernism and Consumer Society”. Post-Modern Culture, edited by Hal Foster, 111-125. Pluto Press.

Kinder, M. 2002. “Hot Spots, Avatars, and Narrative Fields Forever – Bunuel’s Legacy for New Digital Media and Interactive Database Narrative”. Film Quarterly (55): 2-15.

Landsberg, A. 2004. “Introduction: Memory, Modernity, Mass Culture”. Prosthetic Memory: The Transformation of American Remembrance in the Age of Mass Culture, 1-24. New York: Columbia University Press.

Manovich, L. 2001. “The Operations”. The Language of New Media, 116-175. Boston & New York: Massachusetts Institute of Technology.

Manovich, L. 2005. “Understanding Meta-Media”. 1000 Days of Theory (October), www.ctheory.net/articles.aspx?id=493

Metz, C. 1983. “The Imaginary Signifier”. Psychoanalysis and Cinema: The Imaginary Signifier. The MacMillan Press: London.

Mulvey, L. 1977 [1975]. “Visual Pleasure and Narrative Cinema”. In Women and the Cinema: A Critical Anthology, edited by K. Kay and G. Peary, 412-428. New York: Dutton.

Rabinovitz, L. 2004. “More Than Movies: A History of Somatic Visual Culture through Hale’s Tours, Imax, and Motion Simulation Rides”. Memory Bytes: History, Technology, and Digital Culture, edited by L. Rabinovitz and A. Geil, 99-125. Duke University Press.

Ruston, S. 2006. “Blending the Virtual and the Physical: Narrative’s Mobile Future?” Refractory: A Journal of Entertainment Media 9 (July), http://blogs.arts.unimelb.edu.au/refractory/2006/07/04/blending-the-virtual-and-physicalnarratives-mobile-future-scott-ruston/

Rimmer, M. 2007. Digital Copyright and the Consumer Revolution: Hands Off My iPod. Cheltenham, UK, Northampton, MA, USA: Edward Elgar.

Sturken, M. 1997. “Introduction”. Tangled Memories: The Vietnam War, the AIDS Epidemic, and the Politics of Remembering, 1-18. Berkeley, Los Angeles, London: University of California Press.

Vesna, V. 2000. “Database Aesthetics”. AI & Society (14):155-156.

Filmography

Donnie Darko. Directed by Richard Kelly. 2001.
Ghostbusters. Directed by Ivan Reitman. 1984.
Hulk. Directed by Ang Lee. 2003.
Indiana Jones and the Last Crusade. Directed by Steven Spielberg. 1989.
Karate Kid, The. Directed by John G. Avildsen. 1984.
Lara Croft: Tomb Raider. Directed by Simon West. 2001.
Pixel Pirate II: Attack of the Astro Elvis Video Clone, Soda_Jerk and Sam Smith, 2002-2006.
Soda_Jerk. (Cited 7 November 2008). Available from http://sodajerk.com.au
Star Wars: Episode IV – A New Hope. Directed by George Lucas. 1977.
Ten Commandments, The. Directed by Cecil B. DeMille. 1956.

Notes

[1] Copyright is a pertinent issue in relation to new digital technologies; however, it is a concern that is tangential to the focus of this essay. For a detailed analysis of how copyright laws in Australia and the United States impact upon remix culture, see Rimmer (2007).

[2] “Fair use” is a grey area in copyright law in both Australia and the United States. At present it covers transformative uses such as parody; however, its extension to cover mash-ups is still a largely contested area. See Rimmer (2007).

Author Bio

Amanda Trevisanut is a PhD candidate in the Department of Culture and Communication at The University of Melbourne. She is currently working on her thesis entitled ‘Multi-Cultural Identity and SBS Commissioned Content’.

Contact Email: a.trevisanut@pgrad.unimelb.edu.au