‘They’re Here’: The Rhythmic Accent, the Single Beat and Rhythmic Silence – Annie Turner

Abstract: This article examines the significant role of diegetic aural rhythm in cinema as a tool for structuring narrative time. It explores the ways in which diegetic aural rhythm delineates narrative time by establishing patterns of expectation and resolution for the cinema audience. Using Peter Jackson’s The Two Towers as the primary text, it combines musicological concepts of rhythmic time with film sound theory to produce a framework for analysing key rhythmic sequences.

Lord of the Rings: The Two Towers is a richly layered film that exhibits a masterful use of diegetic aural rhythm. With a narrative that fluctuates broadly in pace, shape, and momentum, the audience is immersed in a dynamic, shifting temporal world governed by the force of rhythmic sound. Jackson recognises the structuring power of aural rhythm and places it in a prominent position in the filmic world so that its impact is noticed. A fundamental musical principle, rhythm moves us emotionally because it appeals to our subconscious desire for pattern and the security of repetition. Diegetic aural rhythm also has the potential to frighten us by toying with our expectations of narrative resolution. A critical tool in the creation of cinematic suspense and climax, diegetic aural rhythm is integral to the successful dramatisation of key scenes in The Two Towers.

David Epstein claims that the accented beat defines a significant temporal point in the cinematic narrative. The accented beat is a ‘stimulus’ perceived as a ‘focal point’ in a rhythm, ‘stronger’ and ‘more prominent’ than surrounding beats (1995: 24). Michel Chion recognises that in cinema, the accented beat is a point of audiovisual synchronisation, a ‘salient moment’ in the cinematic sequence (1994: 58). According to Brophy, diegetic film sound synchronises aural and visual action onscreen to ‘generate a sensation of movement’ for the audience and propel narrative (2002). In cinema, the sound of impact is essential to the creation of narrative action. Chion explains that the accented beat is a ‘momentary, abrupt coincidence’ of aural and visual impact, punctuating the cinematic sequence. In a rapid, intense action sequence, the moment of impact is the central point of narrative construction, and Chion posits that the audience anticipates the point of impact before it occurs, and reels from the ‘shock’ of its meaning and repercussions after it occurs. If it were not for the accented beat, the fleeting image of violent action could not be remembered or even identified. Chion argues that the accented beat defines sound that implants itself in the audience’s consciousness, where its echoes are felt. The strong beat creates aural and visual continuity, and enables surrounding narrative time to fluctuate and develop (1994: 60-62).

Cinematic battle sequences demonstrate some of the most intense examples of the accented beat as audio-visual synch point. Spectacular war films exhibit lengthy sequences of violent action, which, as Alan Williams argues, are given a temporal position through accented sound effects such as punches, gunshots, explosions, arrow shots, axe blows and other impact sounds (1999: 239). According to Chion, the ‘physical blow’ is the ‘most immediate and brief meeting’ of two represented bodies, and the ‘most immediate audiovisual relationship’ is the synchronisation of an aural impact and a visual impact. The accented beat allows the audience to ‘hear what we haven’t had time to see’, so that ‘we believe we have seen’ the instantaneous impact. He notes that action sequences demonstrate ‘strong points of synchronisation’, as ‘blows, collisions, and explosions’ mark significant time points in the narrative. The accented beat tightly binds the array of otherwise disconnected visual moments in a coherent succession, and if it were not for the ‘rapid auditory punctuation’ of the accented beat, the many ‘rapid visual movements’ in the action sequence would disorient the audience (1994: 11-61). This widely used cinematic technique is parodied in Gore Verbinski’s Pirates of the Caribbean (2003). In the scene in which pirate Captain Jack Sparrow duels with William Turner in the blacksmith’s shop, each of their deft blows is matched by a subtle accented beat in the musical score. Each narrative impact corresponds to an extradiegetic musical accent, a self-reflexive technique that strengthens the bond between violent narrative action and scored music. The music works in tandem with the diegetic aural rhythm of the sword fight to structure narrative in temporal bursts, according to the movements of the main protagonists.

G. W. Cooper and L. B. Meyer explain that the accented beat stands apart from other beats as a ‘focal point’, or ‘nucleus’ of rhythm (1960: 8). Epstein demonstrates how accent segments time by emphasising regular points in rhythm (1995: 24), and Richard Wagner argues that rhythmic accent communicates emotively to the auditor, appealing to the listener’s instinctive rhythm perception (1995: 264). Chion claims that the accented beat creates temporal structure for the action sequence, because the images alone do not always clearly demonstrate narrative succession. We may interpret these images as either ‘simultaneous or successive’; however, the accented beat provides the sequence with definite ‘linear and sequential’ time. Chion explains that the accented beat embodies and propels time: it is precisely located and irreversibly engaged, with each beat a contained narrative that consists of an attack and fade. He argues that diegetic sounds punctuate narrative time and contribute to the creation of a film’s defined temporal dimensions, driving the narrative rhythm. Chion states that when an accented beat is matched with the image, it directs our attention along a narrative trajectory, and can deceive us into believing we have seen a visual movement that does not exist (1994: 11-55).

The epic battle scene at Helm’s Deep consists of a series of violent images. Each brief image contains a central blow, shot, slice or stab – an impact that forms the core of a plethora of background collisions. With each visually presented point of conflict, the diegetic soundtrack provides a strong beat that accents the temporal flow and draws attention to the narrative significance of the accented moment. The synchronised sounds and images in the battle sequence form a rhythm of regularly recurring beats, and every prominent beat drives narrative time throughout the course of the battle, yet does not indicate or anticipate a point of narrative resolution. Like the mechanical pulse, the rhythm of accented beats in the battle sequence is highly cyclical. It does not demonstrate rhythmic development because it consists of the constant repetition of the same accented beat with no significant fluctuation of meter or tempo. However, unlike the mechanical pulse, the rhythm of the battle sequence is not rigid or precise, and consequently creates an unpredictable narrative. In the battle sequence at Helm’s Deep, the audience is presented with an array of visual and aural stimuli that illustrates the intensely chaotic activity of the scene. In order to draw a narrative temporal thread through the sequence, Jackson highlights central actions within each shot with a single beat. For example, we are encouraged to focus on Aragorn’s sword strikes, Gimli’s axe blows, and Legolas’ arrow shots because these are the actions of the main protagonists. Their experiential events are pinpointed and made salient by the dominant, accented beat.

The dominant, accented, single beat marks a significant timepoint within the cinematic sequence that separates it as a focal point. The synchronisation of visual and narrative action with a single aural event places that action in time. The visual image of the action provides us with evidence of its occurrence, but its aural match fixes the event’s location at a temporal point, because sound is a temporal phenomenon and validates the motion of time. The single beat places an emphasis on the timepoint it represents and highlights the narrative action occurring at that moment. The scene in which the hobbits arrive at the Dead Marshes is centred on the point of action at which Frodo sees the eyes of a ghoul open suddenly and, transfixed, topples into the pond of bodies, going ‘down to meet the dead’. At the precise moment when the eyes blink open, we simultaneously hear a dull thud. The thud is not located within the diegesis, as it is not a realistic representation of the sound of eyelids opening, nor is it part of the musical score. Perhaps the sound is oneiric (Milicevic 2002), reverberating in Frodo’s psychic soundscape while he is mesmerised by the fallen warriors’ spirit world. The diegetic soundtrack is minimal throughout the sequence. There is no regular rhythmic pulse to establish the pace of this sequence, and instead we hear vocal apparitions echoing like aural memories of past tragedies. The voices are not rhythmically organised, and we become disoriented in the hypnotic scene without clearly defined time points to use as reference. We are unable to perceive solid aural temporal markers, and receive scant information from the lethargic images of the zombified Frodo, and so we now sense an eternal time. When the single beat occurs, it gives the auditor a solid perceptual marker of narrative continuation, and releases the audience from their entrapment in the narrative lull.

The single aural event firmly grounds our auditorial perception in quantified time that exists only through its demarcation. Once the beat has occurred, the auditor relaxes, reassured by their regained grasp of narrative temporality. The single beat gives the auditor a reference point with which to navigate the timeless measure that follows it. When the ghoul’s eyes blink open in a single beat, Frodo topples into the marshes, and enters the watery underworld of living corpses that grasp him and try to claim him. The sequence is matched with aural intensity as the loudness of the ghostly voices increases. The single beat marks the point at which Frodo gives up his fight to stay awake and in control. In one relieving moment, he gives up his agency and responsibility. Like the diegetic soundscape preceding the single beat, the sound following the single beat is not organised rhythmically, an effect that mirrors the original sense of timelessness while framing the single beat. Without the occurrence of the single beat, the auditor would have little to guide her through the disorienting, static temporality of this sequence. The single beat synchronises the aural and the visual worlds of the cinema, allowing the audience to navigate their way through cinematic time.

The single beat culminates narrative flow into a temporal climax. The occurrence of the dominant single beat in a film sequence suggests that the highlighted timepoint is the most prominent and meaningful moment in the sequence. The unmarked time preceding the single beat contributes less to the narrative action, only leading us in anticipation towards the defined, significant event. Similarly, the unmarked time following the single beat contains minimal narrative action; as Brophy attests, it simply exists as a comparatively neutral temporal measure within which the salience of the single beat is allowed to resonate (2002). The single beat resolves the narrative tension that builds to its climax. Following the single beat, narrative flow resumes without tension. The single beat is the peak of a sequence, focussing our attention on the exact moment that it occupies, and is what Kramer describes as a rhythmic ‘tonic’ or ‘cadence’ (1988: 25-26), a moment that is anticipated, expected, and which returns the temporal sequence to its point of resolution.

The climactic single beat is a staple technique of the horror genre, frequently used to mark the point at which the killer attacks their victim. Often the attack is not actually seen by the audience, but its aural equivalent provides them with knowledge of the moment of impact.

The single beat also occurs in the scene at Helm’s Deep. The Uruk-hai have come to a standstill at the foot of the fortress. As they beat a crude pulse as a threat to the fort’s defenders, narrative tension builds in anticipation of the ensuing action. The warriors are mentally preparing for war – the Uruks are hungry for the violence that is their birthright, and the alliance of men and elves is awaiting its order to die defending its freedom. The cinematic spectator knows she will witness a spectacular battle scene, but cannot predict when it will begin. We see a human archer shake and sweat with the unbearable anxiety of suspense, his actions mirroring the audience’s great tension at this moment. Suddenly, the Uruks’ primitive pulse doubles in tempo, accelerating towards the impending climax of the first attack, and the terrified archer breaks under pressure, releasing his arrow. The arrow shoots across the battlefield and impales an Uruk-hai warrior through the slit in his helmet. The violent action of the arrow as it smacks into the brute’s face is matched by a single dynamic sound, one that describes the force of the arrow’s impact, the visceral qualities of the new wound, and the dreadful significance of the archer’s action. The single beat is a narrative climax, both anticipated and clearly remembered. The climax is a temporal peak, as the most important point in the narrative. The single beat determines the nature of the surrounding time, and in turn the surrounding time supports the status of the climactic single beat. The climactic single beat silences the entire Uruk-hai army, and the defenders now realise that this is the moment that begins their great battle. The single beat draws narrative temporality around it; the suspense of extended time precedes it, and the flow of temporal relief follows it in silence.

Another example of the climactic single beat is in Laurent Tuel’s 2001 film Un Jeux D’Enfants (The Children’s Game). The mother, concerned that evil spirits possess her children, stands at her window and contemplates the sinister happenings in her home. The scene is tense, as we come to realise that the evil children are trying to drive their mother insane. The diegetic soundtrack is quiet, except for a barely audible hum. The aural continuum stretches narrative time, until suddenly a dull aural pulsation cuts off the tense moment with a single beat at the exact moment that the lights in the house are switched off. The single beat is climactic, as it is the point at which maximum tension culminates and causes the temporal release of narrative tension.

The single beat marks a distinct temporal shift that divides the film sequence into two different modes of time. The single beat is a point of synchrony that is highly important because it signifies a point of transition in the film sequence. The transitional point is not necessarily a climax, but it changes the pace and emotion of a scene. The temporal hierarchy upheld by the single beat dictates that the unmarked time leading up to the single beat will be suspended, ‘thick with the tension of anticipation’, and the unmarked time following the single beat will exhibit a contemplative retrospection in the wake of the dispelled tension. The single beat pinpoints the most significant point of action within the film sequence, and marks the shift from one mode of time to another, the separation of the before and the after. In the battle to save Osgiliath, Faramir keeps Frodo and Sam hostage, hoping to take possession of the Ring and use its powers to save his people from Sauron’s genocide. In a desperate attempt to make Faramir understand the dangers of possessing the Ring, Sam shouts, ‘You want to know what killed Boromir? You want to know why your brother died? He tried to take the ring from Frodo! The ring drove your brother mad!’ At the moment when Faramir realises the significance of Sam’s words, the man looks skywards and spies the Ringwraiths, mounted on their great reptilian steeds. Understanding all, Faramir cries, ‘Nazgul!’ The earth-shaking crash of a falling boulder punctuates this desperate, furious call. The boulder falls on the second syllable of Faramir’s word of warning, accenting the moment and giving the word a dramatic, poetic existence.

The dominant single beat of the boulder crash marks a point of temporal transition within the narrative. The beat occurs at the point of the Ringwraiths’ arrival at Osgiliath, matching the word ‘Nazgul’ and encouraging the auditor to contemplate the narrative meaning of this moment. The time point marked by the single beat is pregnant with meaning – the Ringwraiths are approaching, and if the ring is lost to Sauron then Middle Earth is lost forever. The moment before the single beat is tense with Sam’s desperation and Frodo’s psychic trance, which creates an unprogressively cyclical time as the sequence alternates between realistic sound and subjective sound. Following the single beat’s significant time point, the battle begins and the narrative flows toward its consequence.

Silence is integral to the temporal and rhythmic placement of the single beat, an aural phenomenon that exists with great effect as a backdrop for a single rhythmic sound. When all other sounds fall away to reveal a single, ‘previously…unnoticed’ sound, the surrounding silence forces the single sound to appear louder until, ‘paradoxically’, we are left with an ‘anxiety-producing impression of silence’ which is strengthened because the single sound is intensely ‘heightened’ by the silent background. The silence creates ‘emptiness in a terrible way’ (Chion 1994: 58) and surrounds the single beat to intensify its salient meaning.

Jennifer Judkins argues that silence can both anticipate an ensuing aural event, and provide a final retrospection after the aural event (Judkins 1987: 54). In traditional musical pieces, silence often separates sections; however, the ‘abundance of silence in contemporary music’ and other aural art forms, like cinema, is ‘characteristic of a culture fascinated with compartmentalizing and manipulating individual, societal, and physical time’. Silence is significant in the aural text because silence is fundamentally related to time, and ‘it is only through time that we can express rhythm, the building and releasing of tension, fulfilment and frustration’ (Kramer 1973 cited in Judkins 1987: 34-35).

Silence is a highly expressive medium for aural communication, with silences often creating ‘gestures more eloquent and expressive than sound, especially at tense, critical moments’ and frequently reflecting the ‘dramatic quality’ of the sounds they frame (Dietz 1981 cited in Judkins 1987: 36). Rhythmic silence is a constructed, artificial silence, imbued with the meaningful, ‘resounding character’ of the rhythmic ‘context’ it inhabits. Silence is infused with meaning, pregnant with the nature of the rhythm it is juxtaposed with (Judkins 1987: 35-36). Deliberate silence in art differs from natural silence because it is ‘characterized’ ‘by the tonal and rhythmic material’ that surrounds it (Judkins 1987: 36). Like rhythm, silence has an emotive effect. It was used in ancient military operations, when ‘the silence of the [military] band was taken as a proof that a battalion had been broken’ and that there was a danger of defeat (Arnold 1993: 5-6). Silence performs different roles in the diegetic soundtrack, according to its placement in the aural sequence. Silence can be ‘retrodictive’, operating ‘in reference to the material preceding’ it, or ‘predictive’, anticipating an ‘oncoming event’ (Judkins 1987: 52).

Retrodictive silences can ‘answer or echo the previous event’, or replace the ‘final event’, maintaining rhythmic momentum and extending time indefinitely, by denying the ‘cadence or note of finality’ of the piece (Judkins 1987: 58). Retrodictive silence can also create ‘tension and suspense’ within a piece when it unexpectedly severs a ‘cadential musical statement’ that had promised to resolve with finality (Judkins 1987: 64). Silence creates time where pieces of silent nothingness allow a piece to ‘retain the form’ of narrative time (Judkins 1987: 33). Silence in a rhythmic piece binds, defines and highlights rhythmic time (Judkins 1987: 37). According to Chion, ‘silence is never a neutral emptiness’, but ‘the negative of sound we’ve heard beforehand or imagined; it is the product of a contrast’ (Chion 1994: 57). Towards the end of the scene at Helm’s Deep, when it seems as though the battle will be lost, Jackson suddenly reduces the volume of the diegetic sound, leaving only the accented impact sounds to echo slightly in the now almost empty soundscape. While this sequence is not entirely silent, the considerable reduction in diegetic sound serves to shift the mode of narrative time. The sparse soundtrack gives the sense that time has slowed, because there are fewer aural events occurring than before. The slowed time removes the audience from the intensely experiential time of the battle sequence, and places them in a more objective position where they observe narrative flow presumably grinding to a tragic and fatal halt. Philip Brophy explains that the absence of an aural marker of time in a cinematic sequence denies the auditor a tool with which to quantify the rate of temporal motion across a series of visual images. ‘Audiovisual normality’ is destroyed when the image track is left to fend for itself alongside a silent sound track, which is why silence tends to accompany extreme violence; the ‘upsetting’ result is an established semiotic effect of morbidity which cinema usually avoids (2002).

There are many more incidents of diegetic aural rhythm in the scene at Helm’s Deep, for example the intricate polyrhythm of galloping horses that creates a multitude of individual time structures to expand the narrative vertically, and the arrhythmic avalanche of desperate voices that disorients the audience in temporal chaos. However, the accent, the single beat, and the rhythmic silence are perhaps the most important for a discussion of narrative and provide a solid elemental platform for examining all kinds of diegetic aural rhythm in the cinema.

The battle at Helm’s Deep, the Dead Marshes, the attack on Osgiliath, and the Southron Army approaching the Black Gate of Mordor are pivotal sequences in Jackson’s epic film, and are all founded on the temporal form created by diegetic aural rhythm. The mechanical pulse and the organic pulse create two contrasting temporal structures that embody the narrative threat of evil in the heroic tale; the accented rhythm provides the array of disconnected images with a coherent stream of narrative; the single beat culminates suspenseful cinematic time into a climactic transitionary point; and the rhythmic silence encourages the audience to subjectively experience the fluid motion of unmarked time.

Jackson’s dynamic film brilliantly exhibits a sophisticated use of diegetic aural rhythm, bringing the rhythmic soundtrack to the forefront of the cinema experience. Previously overlooked by film theorists, diegetic aural rhythm is increasingly employed by filmmakers as a powerful cinematic tool for creating narrative temporality, and it is certainly a part of contemporary cinema that enthrals audiences.



Arnold, Ben. 1993. Music and War: A Research and Information Guide. New York: Garland.

Brophy, Philip. 2002. Soundtrack: Film Scores, Sound Design, Surround Sound. Melbourne: Royal Melbourne Institute of Technology. Media Arts Department. [cited 30 October 2002]. Available from World Wide Web: (http://media-arts.rmit.edu.au/Phil_Brophy/MMAlec/).

Brophy, Philip, ed. 1999. Cinesonic: The World of Sound in Film. Sydney: The Australian Film, Television, and Radio School.

Brophy, Philip, ed. 2000. Cinesonic: Cinema and the Sound of Music. Sydney: The Australian Film, Television, and Radio School.

Brophy, Philip, ed. 2001. Cinesonic: Experiencing the Soundtrack. Sydney: The Australian Film, Television, and Radio School.

Chion, Michel. 1994. Audio-Vision: Sound on Screen. Trans. Claudia Gorbman. New York: Columbia University Press.

Cooper, G. W. and L. B. Meyer. 1960. The Rhythmic Structure of Music. Chicago: University of Chicago Press.

Epstein, David. 1995. Shaping Time: Music, the Brain, and Performance. New York: Schirmer Books.

Fraisse, Paul. 1987. A Historical Approach to Rhythm as Perception. Action and Perception in Rhythm and Music. Ed. Alf Gabrielsson. Stockholm: The Royal Swedish Academy of Music.

Gorbman, Claudia. 1987. Unheard Melodies: Narrative Film Music. Bloomington and Indianapolis: Indiana University Press.

Judkins, Jennifer. 1987. The Aesthetics of Musical Silence: Virtual Time, Virtual Space, and the Role of the Performer. Los Angeles: University of California Press.

Kassabian, Anahid. 2001. Hearing Film: Tracking Identification in Contemporary Hollywood Film Music. London and New York: Routledge.

Kramer, Jonathan D. 1988. The Time of Music. New York: Schirmer Books.

Lander, Dan and Micah Lexier, eds. 1990. Sound By Artists. Toronto: Art Metropole and Walter Phillips Gallery.

Milicevic, Mladen. 2002. Film Sound Beyond Reality: Subjective Sound in Narrative Cinema. [cited 23 September 2002]. Available from World Wide Web: (www.filmsound.org/articles/beyond.html).

Mulvey, Laura. 1975. Visual Pleasure and Narrative Cinema. Screen. v. 16. no. 3. Autumn. pp. 6-18.

Parncutt, Richard. 1987. The Perception of Pulse in Musical Rhythm. Action and Perception in Rhythm and Music. Ed. Alf Gabrielsson. Stockholm: The Royal Swedish Academy of Music. pp. 127-138.

Ricoeur, Paul. 1983. Time and Narrative. Trans. K. McLaughlin and D. Pellauer. Chicago: University of Chicago Press.

Sachs, Curt. 1953. Rhythm and Tempo: A Study in Music History. London: J. M. Dent and Sons.

Wagner, Richard. 1995. Opera and Drama. Trans. William Ashton Ellis. Lincoln: University of Nebraska Press.

Weis, Elisabeth. 1982. The Silent Scream: Alfred Hitchcock’s Soundtrack. New Jersey: Fairleigh Dickinson University Press.

Williams, Alan. 1982. Godard’s Use of Silence. Camera Obscura. Fall. Los Angeles. pp. 193-210.

Wood, Nancy. 1984. Towards a Semiotics of the Transition to Sound: Spatial and Temporal Codes. Screen 25:3:16-25.


Jackson, Peter. 2001. The Lord of the Rings: The Fellowship of the Ring.

Jackson, Peter. 2002. The Lord of the Rings: The Two Towers.

New Line Cinema. 2003. Revealing the Secrets Behind the Production of The Epic Adventure: Special Extended DVD Edition of The Two Towers.

Tuel, Laurent. 2001. Un Jeux D’Enfants (The Children’s Game).

Verbinski, Gore. 2003. Pirates of the Caribbean.


Author Biography

Annie Turner is a postgraduate student in the Cinema Studies Program at the University of Melbourne. Her thesis research focuses on sound theory, and the exploration of sound as a mode of narrative meaning.