Introduction

Beryl de Zoete (1879–1962) left an extensive body of documentation on Balinese dance, comprising photographs, field notes, and films. Yet, her contributions have been largely overlooked in contemporary scholarship on Balinese dance from the 1930s. This paper engages with her notes to examine the interconnections between dance, imagery, and her lived experiences in Bali during the preparation of her publication. It further investigates how her films might contribute to a deeper understanding of her methodological approach to dance.

De Zoete’s legacy is preserved in two primary archives: the visual materials housed at the Horniman Museum in London and her field notes, which form part of the Arthur D. Waley papers at Rutgers University Library in New Jersey. Despite the richness of these collections, the abundance of contemporary sources from the 1930s, and the enduring impact of de Zoete and Spies’ publication, her moving image records remain significantly underexplored in scholarly discourse. These films are mentioned only in passing in the literature and have never been the focus of dedicated analysis—not even by de Zoete herself, as I discovered through a close reading of her field notes (Hitchcock, 1991: 49). This raises critical questions: In a film archive like de Zoete’s—implicitly structured around the concept of dance-as-movement, as suggested by her notes—what kinds of objects warrant study and preservation? Furthermore, what methods and interconnections among these objects are necessary to facilitate meaningful exploration and scholarly engagement?

Movement, both as an embodied process and an archival challenge, lies at the core of this paper. This approach conceptualizes film as a co-host of movement, inherently linked to the dancer’s body. Computer vision (CV), inspired by human cognition, embodies principles of materialism and visual phenomenology, enabling an integrated exploration of dance, film, movement, and inscription. This integration facilitates the analysis and reinterpretation of archival materials, transforming moving images into transmedia phenomena.1

Attention in the Archive – Place and Practice

Beryl de Zoete’s films are not the first images of the Balinese Legong dance we find in archives. The collection Photographed on behalf of science; exotic people between 1860 and 1920, housed in the National Museum of Ethnology in Leiden, The Netherlands, includes an Onnes Kurkdjian photograph titled Bali from around 1900. This photograph represents one of the earliest mechanically produced image records of Legong (Figure 1).

Figure 1

The earliest photographic image record of Legong found in the archives consulted for this paper dates from 1900 (Kurkdjian, 1900).

The metadata accompanying this image does not explicitly reference the word “Legong.” In fact, the metadata makes its identification uncertain; the annotated inscription in Dutch reads “Weg naar Garoet,” which translates to “Road to Garut.” Garut is a town in West Java, known for its leather industry and far from Bali. The metadata identification may therefore be unreliable: the inscription could have been added later by someone who purchased the image in Garut, perhaps at a flea market, or by one of the dancers or musicians, adding a written trace of affect to commemorate an important memory. Examining the image more closely, I search for clues to confirm my assumptions about the dance form. However, photographs are not particularly effective at conveying narrative, and nothing in the record metadata definitively reveals this as the first historical photographic image of Legong. I begin by examining the people depicted: a crowd of seated men behind gamelan instruments, with gongs visible on the right side of the frame. My attention is drawn particularly to two girls seated on chairs in the front row. I recognize their dresses, head crowns, and ornaments, and notice that they hold fans in their hands. Taken together, these elements suggest that the group is indeed a gamelan troupe accompanied by two Legong dancers.

How can we be certain that this image indeed represents Legong? To reverse engineer the image, I focus on identifiable material elements and mentally compare them to other images from various archives and collections, including de Zoete’s films and my own memories of Legong performances in Bali. Even though I miss the presence of a third dancer, the Condong, who always appears in contemporary performances of the Legong, the attire of the sitters and the date of the photograph fit with debates around the Legong form and its origins.2 Still, the dancers in the photograph remain hieratic, their gaze scrutinizing the viewer. The image is staged, posing immobile individuals before the cumbersome photographic technology of the late 19th century. The shallow depth of field and blurred faces, especially among the children and background figures, reveal the technical limitations of the era, including the long exposure times and wide lens apertures required by the low light sensitivity of the photographic materials. Despite these constraints, the scene retains a sense of realism. However, without observing the dancers in motion or consulting additional references, the dance form remains elusive. I wish that the photographic record were accompanied by a set of annotated documents (a blueprint, perhaps, describing Legong’s stylized movement patterns, directional forces, and body orientations) carefully preserved over a century from the ravages of tropical weather and the colonial appropriation of cultural artifacts.

In dance studies and performance studies more broadly, the tensions surrounding the definition of the object of inscription for an archive are epistemological. Dance researcher Rachel Fensham points out that there is “little agreement between scholars about what constitutes the proper object of dance” (Fensham, 2013: 146). Notation, drawings, diagrams, written prose, photographs, and moving images (film and video) have all been employed to create “objects” that become archived documents. Yet, prior to this selection process, the objects must exist or be created. Processes of objectification and classification are responsible for transforming such an object into a visual taxonomy within museum cultures.

Curatorship can be understood as a ritualized practice—one that parallels the rituals of dance or performance. Much like archival authority, curatorship involves the selective acts of inclusion and exclusion, wherein decisions about which objects or films to preserve, categorize, and display enact a form of ritualized power. Authority permeates this process, positioning curators as gatekeepers who determine which materials gain value and visibility within the archive. In the case of de Zoete’s films, the curatorial process shapes not only the narrative of Legong dance but also the ways in which embodied practices are represented and interpreted in archival form. In the following, I will elaborate on the idea of curatorship as ritual.

Performance scholar Diana Taylor offers a broader definition of the archive that encompasses not only the place and objects but also the practices of selection and organization, thereby challenging the boundaries of curatorial power. An archive, according to Taylor, is simultaneously “an authorized place (a physical or digital site that hosts collections), an object (a collection of things—historical records and representative or unique objects selected for inclusion), and a practice (the logic of selection, organization, access, and conservation through time that evaluates certain objects as archivable).”3 Exploring Taylor’s argument around the questions of place, materiality, and practice in relation to a digital archive necessitates considering how “digital variations seriously challenge the dominance and logic of the archive” (Taylor, 2019: 45). However, it is Taylor’s perspective in opposition to Benjamin, particularly her redefinition of aura around the process of selection and inclusion in the archive, that warrants deeper attention. She further expands the concept by detaching it from the status of the object itself—moving the focus from the “object” to the “means.”4

Taylor argues that the aura does not pertain solely to the object but is equally tied to the archiving process: “the aura then has as much to do with the nature of the selection process as with the status of the thing itself… it is not that the work of art is necessarily in the glass, but in the process of selection and valorisation.”5 Even when we abandon the notion of object-as-aura, we remain engaged in an auratic process of constructing the archive. Taylor’s discussion of the aura, however, is not about the archive as a physical place, nor the materiality and presence of the object; instead, it pertains to the archive as practice, the curatorial, the authority to include. The aura becomes reconfigured around the means—the process of selection that grants the power to make some things visible while rendering others obscure. Drawing on Foucault, Taylor defines this interplay of object, place, and practice as an “epistemic system of circular legitimization,”6 wherein these three elements converge to produce authority, which in turn legitimizes what can be known and said. In the digital archive, this system persists but shifts dynamically. The object is no longer confined to its physical form but exists as fluid data, accessible and subject to manipulation from remote locations. Similarly, the concept of place becomes dislocated, as digital repositories challenge the notion of a singular, fixed archive. Despite these transformations, the practice of archiving—what is selected, valued, and made accessible—continues to wield power, determining the visibility of certain materials while consigning others to obscurity.

The relationships between archive-object and archive-place become even more complex in digital environments. For the present discussion, it is crucial to recognize that while the digital archive offers new possibilities for interaction and analysis, it also reproduces many of the power structures inherent in traditional archival practices, manifesting a form of “colonization of knowledge.” The aura, as Taylor suggests, is no longer tied to the uniqueness of the object but instead emerges from the processes through which it is curated and contextualized within the archive.

This critique of curatorial authority invites reconsideration of how archival documents, particularly those created outside institutional constraints, can challenge established hierarchies. In de Zoete’s films, these hierarchies include the anthropological narrative—the discourse that frames how the films are catalogued and labelled—and the treatment of the films as static, unique, and immobile objects. Both of these hierarchies are subject to critique and challenge.

Returning to de Zoete’s archive, the digital copies of her original films facilitate a detailed analysis of the dancers’ movements. This process demands a reconceptualization of the digital object, which exists as a network of numerical data rather than as a singular, unchanging entity. Instead of perceiving the high-definition (HD) archived object as static or inflexible, we should understand it as a fluid and malleable collection of data capable of generating conjunctions and interacting with other datasets. This fluidity is more in line with the ethos of dance or performance, and it contrasts sharply with the fixed nature of physical artifacts: traditional performance-related objects such as costumes, crowns, ornaments, and mise-en-scène elements, as well as archival materials like vintage photographic prints, field notes, and floorplans.

The redefinition of the archival object as fluid and adaptable necessitates a corresponding archival practice that embraces this fluidity. Fensham posits that “a choreographic archive cannot exist without relations between movement images to things, places and practices comprised of novel interactions between the virtual [as digital] and the material” (Fensham, 2013: 159). Similarly, just as an archive composed of material artifacts shapes curatorial practices through the logic of selection and organization, digital objects demand a new interpretative framework for understanding dance as movement within digital archives. It is not merely the transformation of the object that challenges the archive as a concept of place; the digital image itself disrupts the foundational practice of archiving.

Attention in the Camera and the Film – Movement and Frame of Reference

For viewers lacking a background in dance or performing arts, these images do not prompt a visceral need to move their own bodies or mimic the intricate patterns with their limbs.7 The viewing experience does not translate into an embodied understanding of the movements, nor does it activate any muscle memory. Instead, the experience of watching these sequences remains that of an external observer: someone who was not present and has no possibility of comprehending at first hand the embodied sensations of those movements. Nevertheless, the movement is captivating; the imagined gamelan music that plays in the mind can evoke musical memories, especially for those who have experienced live performances in Bali. Thus, the Legong dance is encountered as an embodied reality through its representation.

This act of viewing is indeed a deeply real experience, albeit painful—not in Barthes’ sense of loss and death, although the awareness that the individuals depicted are no longer alive is present. There is, however, no personal sting or poignant moment that pricks the viewer with this awareness. It is also challenging to perceive a “blind field,” where the moving images produce a continuity or illusion of life.8 The recognition that the dancers are likely deceased is not the source of the discomfort. Instead, the painstaking process of observing the sequences, frame by frame, back and forth, becomes the true locus of this discomfort. This pain is genuine and rooted in the labour of observation.

The intense, embodied experience arises from the excruciating repetition, the constant rewatching, and from the perception of countless elements within the frame that function as distractions. The process is painful because it demands unwavering, sustained attention: noticing the intricate details of the image and the physical traces left on the image substrate becomes unavoidable. Viewing turns into an automated observational routine, a solitary endeavour of pattern recognition and speculation. Despite this, there seems to be an inability to look through the surface of the image. In Barthes’ terms, this mode of looking engulfs the observer in a state of “pensiveness” provided by the static frame, as opposed to perceiving the images as fluid moving-image sequences. These details simultaneously conflict with the primary focus of interest; they distract from fully concentrating on the dancer’s movements.

The experience of watching these films is deeply real because it elicits a phenomenal response—through electromagnetic vibrations transformed into visual signals—followed by an emotional reaction. The pain stems not from the content or the realization of loss and death implied by the absence of the dancer here and now, nor from a nostalgic longing for her presence. Instead, the effort to shift the affective response from content to form necessitates focus, requiring alertness and full awareness, discouraging any tendency to take for granted the agency inherent in the elements that constitute the image as a material entity. The films are experienced as real because, in the act of viewing, the observer is engulfed by what the images demand (Figure 2).

Figure 2

A dark dog moves from the top right of the frame to the left, where it joins a white dog, and together they exit at the top left, disappearing into the distance. When viewed frame by frame, these small details draw me into a state of ‘pensiveness,’ momentarily diverting my focus from the dancer’s movements. HD mp4 access file. Frame 3510, ARC_DEZ_FILM_90_496.mp4 Subclip-002A.

Consider shot 004, which lasts for 11 seconds (2:45–2:56). In this sequence (Figure 3, top), the camera acts as the frame of reference, enabling recognition of various interactions among elements. For instance, the dancer’s motion occurs relative to the camera’s horizontal panning, while the static figures in the background appear to shift in response to her downward movement. Even the massive stone seems to move in relation to the camera, and the positioning of the dog relative to the dancer provides a sense of spatial scale. As the camera tilts downward to follow the dancer, the frame of reference shifts to the massive stone, which remains static while other elements continue their movement.

Figure 3

Top, sequence three, shot one in full length. Centre, horizontal camera stabilization, first part of the shot. Bottom, vertical camera stabilization, second part of the shot. JPEG images from the HD mp4 access file. Sequence three, shot one, frames 3408–3654, 247 frames, ARC_DEZ_FILM_90_496, shot 004.

Thus, everything within this scene is in motion, not just the dancer’s body. This assemblage of interacting elements suggests that the dancer’s movements result from her engagement with the surrounding environment rather than existing independently of it. This perspective aligns with de Zoete’s belief, articulated in her writings on Balinese dance, that dance does not happen in isolation. As she notes: “The direction taken by the dance or by the procession is not an arbitrary one… the direction of the offering-processions and dances induces a cosmic harmony. [The dancer’s] whirling … reproduced and incarnated the movements of the stars” (De Zoete, n.d.: 36). The perception of dance is thus embedded in a broader reciprocal interplay, one that operates at both the cosmic scale of “the movements of the stars” and the immediate scale of bodies in motion and at rest around the dancer.9 The movement of the environment is integral to perceiving the dancer’s motion.

If we shift our focus from the camera itself to de Zoete, the operator holding the camera, the ground on which she stands becomes the frame of reference. By adopting this grounded perspective, the camera’s pan and tilt shakes—manifesting as instability along the x and y axes—can be eliminated (Figure 3, centre and bottom). This adjustment yields two significant consequences: first, the integrity of the film’s frame as a consistent reference point disappears, resulting in parts of the image being lost. Second, the dancer appears more stable—she seems to “sit” firmly on the ground—and her movements, though dynamic, seem smoother and less frenetic. Her limb accents become more pronounced in this steadier frame of reference.
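The change of reference frame described here can be sketched in a few lines of Python. The sketch assumes that the camera's frame-to-frame shift in pixels is already known (in practice it would have to be estimated, for instance from a static background element such as the stone). The function names, the sign convention, the 1920 by 1080 frame size, and all numbers are illustrative assumptions, not the procedure used to produce Figure 3.

```python
def camera_path(shifts):
    """Cumulative camera position, starting at (0, 0), from per-frame (dx, dy) shifts."""
    path = [(0, 0)]
    for dx, dy in shifts:
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path


def to_ground_frame(points_in_frame, shifts):
    """Re-express image coordinates relative to the ground instead of the camera.

    Assumed convention: a point at ground position g is seen at p = g - c
    when the camera sits at c, so g = p + c.
    """
    return [
        (px + cx, py + cy)
        for (px, py), (cx, cy) in zip(points_in_frame, camera_path(shifts))
    ]


def shared_window(width, height, shifts):
    """Size of the image area that stays visible in every stabilized frame."""
    path = camera_path(shifts)
    xs = [x for x, _ in path]
    ys = [y for _, y in path]
    return (
        max(0, width - (max(xs) - min(xs))),
        max(0, height - (max(ys) - min(ys))),
    )


# Hypothetical pan to the right followed by a small tilt and correction.
shifts = [(5, 0), (5, 0), (-2, 3)]
# A static object drifts across the camera frame while the camera moves.
stone_in_frame = [(100, 50), (95, 50), (90, 50), (92, 47)]

stone_on_ground = to_ground_frame(stone_in_frame, shifts)
window = shared_window(1920, 1080, shifts)
```

With these toy values the stone stays at (100, 50) in every stabilized frame, while the area common to all frames shrinks to 1910 by 1077 pixels. This is the loss of image at the borders that stabilization entails.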

By sharing the ground with de Zoete, the observer aligns with the dancer’s body. The details of her movements—the tilting head, twisting arms, shifting hips—demand concentrated attention, while earlier distractions, such as the dog and children in the background, become less intrusive. The dancer’s circular and elliptical limb movements appear as spiralling loops, whose directions shift with changes in the frame of reference. What initially appeared as circular may now appear linear, completely transforming the perception and representation of the movement.
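How a change of reference frame can turn an apparently circular movement into a linear one can be illustrated with synthetic numbers. In this minimal sketch, a limb tip travels back and forth along a straight line relative to the ground while the camera sways vertically; both motions, their amplitudes, and their timing are invented for illustration and are not measurements from de Zoete's footage.

```python
import math


def span(values):
    """Distance between the smallest and largest value."""
    return max(values) - min(values)


steps = [2 * math.pi * k / 60 for k in range(60)]

# Relative to the ground: the limb tip moves along a horizontal line.
ground = [(10 * math.cos(t), 0.0) for t in steps]
# Hypothetical operator sway: the camera oscillates vertically.
camera = [(0.0, 10 * math.sin(t)) for t in steps]

# What the film records: positions relative to the moving camera.
in_frame = [(gx - cx, gy - cy) for (gx, gy), (cx, cy) in zip(ground, camera)]
# After compensating for the camera motion: positions relative to the ground.
restored = [(px + cx, py + cy) for (px, py), (cx, cy) in zip(in_frame, camera)]

radii = [math.hypot(x, y) for x, y in in_frame]
vertical_extent = span([y for _, y in restored])
```

In the camera frame every sample lies at the same distance from the centre, so the path reads as a circle. Once the camera motion is removed, the vertical extent collapses to zero and only the straight line remains.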

Attention, first through observation and subsequently through repetition, defines the rehearsal structure, which remains consistent with de Zoete’s film documentation even after nearly a century. The following description is based on first-hand observations of two rehearsals held at the cultural centre LKB Saraswati on July 8th, 2023, at Taman Ismail Marzuki in Central Jakarta, and at the Hindu temple Pura Agung Tirta Bhuana in Bekasi, Indonesia. The rehearsals adhere to a rigid protocol: music precedes movement. This segmented rehearsal structure arises from the imperative to familiarize the dancers with the music and rhythm before engaging in the dance itself. Rehearsals take place in an open area, partially shaded from the tropical sun, where female students, primarily young girls, arrive gradually. Upon arrival, students gather near the gamelan instruments and gongs alongside their elderly male teacher, I Gusti Kompyang Raka.10 The metallic, piercing sounds produced by small mallets striking the gamelan saturate the space. After approximately fifteen minutes, the music ceases abruptly, and the students transition to a nearby open area. In groups of two or three, led by a female instructor, they execute movements slowly and meditatively, without sound. This phase is periodically interrupted for corrections, repetitions, and instructor demonstrations that the students imitate, emphasizing a precise bodily mimicry (‘Lembaga Kesenian Bali SARASWATI (@lkbsaraswati) • Instagram photos and videos’, n.d.). The final rehearsal phase integrates music with movement, although the music is not performed live: it is a recording played from a teacher’s phone through speakers.

“Dancing… is done almost entirely by imitation, the pupil dancing behind an older dancer who has become a teacher, as well as behind or in front of her guru. It seems almost impossible that such intricacy of dance movements and accents should ever be memorized, but it is astonishing to see with what rapidity they feel their way into the long series of complicated movements” (De Zoete, 1938: 29). The corresponding images in de Zoete’s book, along with footage of rehearsals captured in her films, serve as visual evidence of the observational methodology employed by de Zoete and Spies during their fieldwork in Bali. One film from de Zoete’s archive features djoged leko, a dance beginning with a version of Legong.11 This image was taken by Spies, who briefly appears in the film, holding a Rolleiflex camera as he moves among dancers and teachers.12 Another instance captures a Legong rehearsal at Ida Bagoes’ workshop. De Zoete describes the dancers’ learning process: “their small bodies, elastic in every joint and muscle, waved and fluttered with extreme velocity” (De Zoete, 1938: 32). She comments on the significance of attention and memorization in learning the dance, distinguishing students based on their aptitude and engagement. “One was obviously a born dancer and memorized with great rapidity; another was an idle creature who had memorized nothing and only followed where she was led, … (t)he third, though she worked hard and had a great desire to dance, was pronounced no good” (De Zoete, 1938: 32).

Philosopher Manuel De Landa posits that “voluntary attention is both selective… and… limited” (De Landa, 2022: 67). Why, then, is paying attention so acutely painful for the observer of the dance? There are two ways to approach this discomfort, corresponding to the experiences of the rehearsal and of the films. In the first scenario, the discomfort is embodied through the effects of tropical heat, physical endurance, thirst, or the overwhelming volume of the gamelan orchestra. Although I did not attend the rehearsal out of a desire to dance, the perceptual process mirrors the one described earlier for the films: as an observer, I direct my attention towards everything that occurs, whether actual dancers or projected images. In the second scenario, the frames in de Zoete’s films exist as temporal markers experienced “here and now” in real time, persisting in the present. Paying attention necessitates continuous scrutiny of each element perceived in the film at every given frame. The discomfort thus stems from the perceptual act itself.

The dancers’ actions, as captured in de Zoete’s films, impose a bodily toll due to the heightened level of attention demanded, leading to an experience teetering on the edge of physical and perceptual liminality. Attention becomes a scarce resource, requiring a heightened focus that exacts a physiological cost, evident in the increased energy expenditure leading to fatigue and exhaustion. “To be truly conscious of something, we must pay attention to it” (De Landa, 2022: 67). This is the cost of presence, the price for “the experience of being there in real time” (Fensham, 2021: 149).

Attention in the Neural Network – Facial Recognition in Legong Dance

Unlike human beings, machines do not experience constraints such as pain associated with the attentional process. However, machines do exhibit a form of attention. In Convolutional Neural Networks (CNNs), this attentional mechanism is facilitated through Optical Flow. When combined with Recurrent Neural Networks (RNNs), which mimic reverberations in brain activity through circulating loops of information, the displacement of pixels between frames becomes another form of reverberation. The prediction of pixel translation—determined by their luminosity values across spatial points—constitutes a distinct rhythm. It is through this recognition of rhythm, established by predicting sequences, that machines effectively attend to and perceive movement.13
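The idea of predicting where pixels move between frames from their luminosity values can be made concrete with a deliberately simple stand-in. The paper does not specify which optical flow algorithm was used, so the brute-force block matching below is an assumption chosen for readability: it takes a small patch in one frame and searches the next frame for the shift that changes its luminosity values the least. The frames, the patch, and the function names are all hypothetical.

```python
def patch_cost(frame_a, frame_b, top, left, size, dy, dx):
    """Sum of absolute luminosity differences between a patch and its shifted copy."""
    return sum(
        abs(frame_a[top + r][left + c] - frame_b[top + dy + r][left + dx + c])
        for r in range(size)
        for c in range(size)
    )


def estimate_displacement(frame_a, frame_b, top, left, size, search=2):
    """Return the (dy, dx) shift within +/- search pixels that best matches the patch."""
    height, width = len(frame_b), len(frame_b[0])
    best_cost, best_shift = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy <= height - size) and (0 <= left + dx <= width - size)
            if not inside:
                continue  # shifted patch would fall outside the frame
            cost = patch_cost(frame_a, frame_b, top, left, size, dy, dx)
            if best_cost is None or cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift


def frame_with_blob(top, left, height=8, width=8):
    """Dark toy frame containing one bright 2x2 blob at the given position."""
    frame = [[0] * width for _ in range(height)]
    blob = [[200, 150], [100, 250]]
    for r, row in enumerate(blob):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame


first = frame_with_blob(2, 2)
second = frame_with_blob(3, 4)  # blob moved one row down, two columns right
shift = estimate_displacement(first, second, top=2, left=2, size=2)
```

Here `shift` comes out as `(1, 2)`, the displacement built into the toy frames. Repeating the estimate from frame to frame yields the sequence of displacements that the text describes as a rhythm. Production tools typically rely on dense or sparse optical flow methods instead of this exhaustive search.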

De Landa posits that “organisms tend to perceive affordances, not properties” (De Landa, 2022: 45). Aligning the constituent properties of the dancer’s body—expressed as numerical data—with those of the machine is a necessary step for facilitating conjunction. This alignment allows the observer, the machine, to perceive the rhythm of the dancer’s body through the translation of pixels. For this rhythm to be effectively perceived, the machine must be endowed with the capacity to pay attention, detecting elements that signal meaningful differences.

Optical Flow leverages natural signs, expressed as numerical values for the machine, harnessing these affordances to facilitate the tracking of the dancer’s face across de Zoete’s shots. As De Landa articulates, “One task that requires paying attention is tracking the same object as it moves in the visual field” (De Landa, 2022: 84–85). This continuous identification is achieved through the description of the object’s properties. In both Legong rehearsals and in Optical Flow, attention is inherently intentional. The dance students, the human observer, and the machine orient themselves “toward something in their experience” (De Landa, 2022: 67). The initial pattern of activation—inscribed as an index of the iconic sign—is the affordance that enables Optical Flow to complete its task of tracking the rhythm and sequence of the object’s translation.

Facial mapping in this paper relies on the MediaPipe Face Landmarker pipeline. This process employs a series of models to predict facial landmarks.14 The first CNN model is a face detection network, which returns facial bounding box coordinates, including six approximate facial key points: left and right eyes, nose tip, mouth, left eye tragion, and right eye tragion (‘Face landmark detection guide | MediaPipe’, 2023). The second network, also a CNN, performs a detailed mapping of the face, outputting 468 three-dimensional facial landmarks.15
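For readers who want to see what calling this pipeline looks like, the following is a minimal sketch using the MediaPipe Tasks Python API on a single still image. It is not the paper's own code: the model file name `face_landmarker.task`, the choice of one face per image, and the helper `flatten_landmarks` are assumptions made for illustration, and the model bundle has to be downloaded separately. The MediaPipe import sits inside the function so that the helper can be tried without the library installed.

```python
from collections import namedtuple


def detect_face(image_path, model_path="face_landmarker.task"):
    """Run MediaPipe Face Landmarker on one image (requires `pip install mediapipe`)."""
    import mediapipe as mp
    from mediapipe.tasks import python
    from mediapipe.tasks.python import vision

    options = vision.FaceLandmarkerOptions(
        base_options=python.BaseOptions(model_asset_path=model_path),
        output_face_blendshapes=True,  # also return expression scores
        num_faces=1,
    )
    with vision.FaceLandmarker.create_from_options(options) as landmarker:
        return landmarker.detect(mp.Image.create_from_file(image_path))


def flatten_landmarks(landmarks):
    """Turn a list of landmarks (objects with x, y, z) into one flat row of numbers."""
    row = []
    for point in landmarks:
        row.extend((point.x, point.y, point.z))
    return row


# Stand-in for MediaPipe's landmark objects, used only to try the helper.
FakeLandmark = namedtuple("FakeLandmark", "x y z")
demo_row = flatten_landmarks([FakeLandmark(0.1, 0.2, 0.3), FakeLandmark(0.4, 0.5, 0.6)])
```

With a real frame, `detect_face("frame.png").face_landmarks[0]` would hold the landmarks of the first detected face, and `flatten_landmarks` would turn them into a numeric row suitable for later classification. The file name `frame.png` is a placeholder.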

The sequential nature of these models—face detection, landmark identification, and facial feature classification—involves three distinct CNNs operating in a cascading manner. The first network identifies the presence of a face in the image, while the second perceives and locates facial landmarks. The third estimates facial expressions. Despite these operations, the patterns of activation generated do not yet correspond to the unique expressions characteristic of Legong dance. The machine can track the presence and position of a face, but it has not yet developed the capacity to fully recognize and differentiate Legong facial expressions. To achieve this, an additional model must be concatenated after the initial three, trained specifically with a dataset of Legong facial expressions. This new model is essential for activating patterns that align with those observed in de Zoete’s films, thereby enhancing the machine’s ability to recognize and interpret Legong-specific gestures.

The initial challenge of recognizing Legong expressions highlights a broader limitation of machines: while machines can observe boundaries—such as distinguishing a face from a background—they need additional layers of training to derive meaning from these observations. Recognition begins as a nonconceptual process that becomes meaningful only through the activation and learning processes of the network. The CNN must be trained to differentiate iconic signs into specific semantic classes. Recognition, in this sense, happens through a comparison between iconic signs and a pre-trained dataset capable of identifying and classifying these signs.

Thus, an additional model is needed, trained with a dataset of Legong facial expressions, to activate specific facial features and expressions detected in de Zoete’s films. The degree of alignment between the trained dataset and the activation patterns derived from de Zoete’s dataset determines the probability with which the machine ‘sees, recognizes, learns, and remembers’ Legong facial gestures and movements.
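How a degree of alignment can be turned into a probability for each gesture class is sketched below. The paper does not describe the architecture of the additional model, so this example substitutes a much simpler technique: a nearest-centroid comparison whose distances are converted into probabilities with a softmax. The two features per sample, the `sharpness` parameter, and the training values are invented for illustration.

```python
import math


def centroid(rows):
    """Mean feature vector of a list of equally long rows."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def class_probabilities(sample, examples_by_class, sharpness=1.0):
    """Softmax over negative distances between a sample and each class centroid."""
    scores = {
        label: -sharpness * math.dist(sample, centroid(rows))
        for label, rows in examples_by_class.items()
    }
    peak = max(scores.values())  # subtract the maximum for numerical stability
    weights = {label: math.exp(score - peak) for label, score in scores.items()}
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


# Hypothetical two-number descriptions of labelled training frames.
training = {
    "Senyum": [[0.8, 0.2], [0.9, 0.3]],
    "Nelik": [[0.1, 0.9], [0.2, 0.8]],
}
# Hypothetical description of a frame from the archival film.
probabilities = class_probabilities([0.85, 0.25], training, sharpness=5.0)
best_label = max(probabilities, key=probabilities.get)
```

The closer a frame's features lie to the examples of a class, the larger that class's share of the total probability. In this toy case the sample coincides with the "Senyum" examples, so `best_label` is `"Senyum"`.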

Figure 4 illustrates the estimated probabilities for different gesture classes of head and facial Legong movements after the training process. Through learned distribution patterns, the machine recognizes these movements in de Zoete’s film and uses them to construct classification models specific to neck and facial gestures, effectively generating memory implants that foster the conceptualization of movement.16 This conceptualization raises broader questions about subjectivity and perception in the context of machine learning: if subjectivity involves fitting data into the preferred state of a network—patterns of thought representing beliefs or discourses—then what does it mean for a machine to conceptualize movement?

Figure 4

Top: frames 4385–4626, ARC_DEZ_FILM_90_496, DPX original scans. Shot 005A zoomed in to focus on the head and face, with colour correction applied to enhance the shot and display estimated probabilities for each gesture class. Bottom: Shot 004, screenshot displaying probability and class estimation.

De Landa’s discussion of the semantic nature of “natural” signs presents an interesting challenge to our understanding of machine perception. The convolutional process in CNNs reveals an ongoing dichotomy: the field of artificial intelligence, which seeks to emulate human cognitive processes, continues to provide valuable insights into the nature of human intelligence and consciousness. As Feynman succinctly puts it, “the field of ‘artificial intelligence’… might have a lot to say about the nature of ‘real’ intelligence, and mind” (Feynman, 2018: xiii).

The use of linguistic labels inherently introduces biases, as the meanings of words are contingent upon temporal, cultural, and contextual shifts, thereby lacking fixed definitions. Categories evolve continuously and resist static classification. This fluidity is exemplified in the variations found within Legong dance, where distinct movements and facial expressions may be identified and labelled differently. Such distinctions are often influenced by geographical variations or dance school traditions, each with its distinct Balinese dialect. As a result, scholars and practitioners of Legong might articulate different terminologies for the same movement, even when its visual and perceptual attributes remain consistent.

Conversely, pixel values are not inherently subject to such biases, as visual perception by machines largely operates in a non-conceptual manner. The strength of this mapping methodology lies in leveraging international and universal technologies to document culturally specific and tangible elements. This approach embodies an augmentation logic that is shaped by human intervention rather than an automation logic driven solely by algorithmic processes. Augmentation necessitates critical evaluation, encompassing tasks such as auditing, verification, calibration, interpretation, and directed attention. This stands in contrast to automation, which generally emphasizes automated data generation rather than the manual synthesis and interpretative rigor essential to augmentation.

In the process of cultural augmentation, a series of video excerpts was used to construct a dataset, formatted as a CSV file of Legong facial expressions and gestures, to support the training process.17 The dataset used in the final prototype encompassed four distinct facial expressions and one specific neck movement. The neck movement is Ngileg, characterized by a wiggling motion of the neck from side to side. The four facial expressions are Senyum, a closed-mouth smile without visible teeth; Nelik, an intense expression featuring wide-open eyes with a direct gaze and a closed mouth; Ngelier kanan, entailing a blink of the right eye while the left remains open, along with a slow neck movement from the right to a forward-facing position; and Ngelier kiri, which mirrors Ngelier kanan but involves the left eye blinking and a corresponding neck movement from left to straight. Two further expressions were integrated in earlier versions of the pose estimation tool but were eventually excluded from the training dataset used in the final prototype: Nyedelet kanan, involving a pronounced sideways glance of the eyes to the right, followed by a swift return of the head and eyes to a forward position; and Nyedelet kiri, similar to Nyedelet kanan but directed to the left.
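
The dataset structure described here and in note 17 can be sketched in a few lines of Python. This is a minimal illustration and not the prototype's actual code. The row counts and the figure of 468 facial landmarks are taken from note 17. The column names (`label`, `x0`, `y0`, `z0`, and so on), the flattened one-row-per-frame layout, and the helper functions are assumptions introduced for illustration, since the exact column layout of the CSV file is not specified.

```python
import csv
import io

# Row counts per class, as reported in note 17.
ROWS_PER_CLASS = {
    "Ngelier-Kanan": 836,
    "Ngelier-Kiri": 626,
    "Nelik": 902,
    "Senyum": 263,
    "Ngileg": 192,
}

NUM_LANDMARKS = 468  # facial landmarks per frame, as stated in note 17


def build_header(num_landmarks=NUM_LANDMARKS):
    """Hypothetical header: one label column, then x, y, z for each landmark."""
    header = ["label"]
    for i in range(num_landmarks):
        header.extend([f"x{i}", f"y{i}", f"z{i}"])
    return header


def flatten_row(label, landmarks):
    """Turn a label and a sequence of (x, y, z) tuples into one flat CSV row."""
    row = [label]
    for x, y, z in landmarks:
        row.extend([x, y, z])
    return row


def class_shares(rows_per_class):
    """Return each class's share of the total number of rows."""
    total = sum(rows_per_class.values())
    return {name: count / total for name, count in rows_per_class.items()}


if __name__ == "__main__":
    # Placeholder landmarks (all zeros) stand in for real detector output.
    placeholder = [(0.0, 0.0, 0.0)] * NUM_LANDMARKS

    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(build_header())
    writer.writerow(flatten_row("Senyum", placeholder))

    print("total rows:", sum(ROWS_PER_CLASS.values()))
    for name, share in class_shares(ROWS_PER_CLASS).items():
        print(f"{name}: {share:.1%}")
```

Printing the class shares makes the uneven distribution visible: Ngileg and Senyum are represented by far fewer rows than Nelik or Ngelier-Kanan, which is the kind of property that the auditing and calibration tasks associated with augmentation would need to take into account.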

Conclusion

This approach advances augmentation by developing tools for visualizing and comparing stylistic evolution across archival collections, contributing to discourses on mediation, historiography, and cultural materiality. It also establishes protocols for representing Legong within the Balinese cultural context, balancing open access with cultural sensitivity in digital heritage.

The distinction between automation and augmentation is crucial. This methodology demands a sophisticated understanding of algorithmic pixel processing to represent bodies in motion, extending beyond basic automation. Analysing film archives through movement analysis allows technologies to scrutinize motion at a microscopic level. This interplay between embodied film representation and machine interpretation represents a non-human interaction. However, human intervention remains essential in this process, ensuring that our contributions to augmentation are both meaningful and contextually informed.

Attention to algorithmic movement processing is also political, since such processing can impose control over how Balinese dancers are represented. The objective is to enhance the diversity of dance movement archives through advanced motion analysis. Contrary to the assumption that technology homogenizes, image-based mapping can enrich archives by recognizing each dancer’s unique agency.

Notes

  1. My interest in Legong and the technologies for recording movement stems from my eight years of living and working within an academic environment in Indonesia. During 2017–2018, I was part of a team tasked with establishing a Motion Capture Studio as part of an initiative to develop teaching and research facilities. This experience enabled me to explore the potential of motion capture technologies for the study of dance, including Legong. Combined with my background in photography and media, as well as a deep interest in the historical colonial archive, this foundation informed the initial questions that shaped my research.
  2. See (Davies, 2008: 206). See also (Vickers, 2009: 6).
  3. (Taylor, 2019: 40). Emphasis in the original. Translation from Spanish is mine.
  4. Walter Benjamin argues: “Even the most perfect reproduction of a work of art is lacking in one element: its presence in time and space, its unique existence at the place where it happens to be. … The presence of the original is the prerequisite to the concept of authenticity” (Benjamin, 1986: 220).
  5. Mentioned at a lecture given at the Centro de Documentación Arkheia-MUAC and the Centro de Documentación del Museo Ex Teresa Arte Actual in September 2017, Museo Universitario Arte Contemporáneo, Ciudad de Mexico. (Diana Taylor reflexiona sobre el lugar de los archivos en línea, 2018). Translation from Spanish is mine.
  6. (Taylor, 2019: 40). Translation from Spanish is mine.
  7. De Zoete was acutely aware of the act of mediation, as evidenced in her written records. She stated: “The aim of a writer about the dances of an alien civilization, which few of his readers will ever see, must be to make them, and the background against which they are performed, as living as possible. For people do not dance in a vacuum; they form part of a natural and social environment; they have traditions, often very mixed and complex, which are reflected in their dance” (De Zoete, 1984: 13).
  8. Barthes introduces the concept of blind field to oppose the individual photographic frame, still, inanimate, dead, to the projected–alive–moving image of cinema on the screen (Barthes, 1981: 57).
  9. See (Nadler, 2006: 139).
  10. See (‘I Gusti Kompyang Raka’, 2022).
  11. See (De Zoete, 1938: 242).
  12. Spies owned a medium format Rolleiflex and a 35 mm Leica Model III. See (Hitchcock, Norris, Spies, and De Zoete, 1995: 69–70).
  13. Neural networks that use the Transformer architecture operate differently. These networks, such as ChatGPT, are used primarily for natural language processing (NLP) (Vaswani et al., 2023).
  14. Face detector, BlazeFace short range model, SSD-like network. See (‘CV4ARVR 2019 Papers’, n.d.). See also (Bazarevsky et al., 2020).
  15. FaceMesh-V2 model, MobileNetV2-like network. See (Sandler, Howard, Zhu, Zhmoginov, and Chen, 2019).
  16. In relation to CNNs’ capabilities for representational recognition of objects, cognitive science researcher Justin Wood argues: “Ultimately, we anticipate that a machine with the same learning mechanisms (brain and body) and training data (environment) as newborn chicks should pass this newborn embodied Turing test, developing the same visual preferences and object recognition abilities as newborn chicks” (Wood, Pak, Lee, and Wood, 2023).
  17. The CSV file contains the following data: Ngelier-Kanan, 836 rows; Ngelier-Kiri, 626 rows; Nelik, 902 rows; Senyum, 263 rows; Ngileg, 192 rows; with coordinates x, y, and z distributed in 468 columns, mapping each of the 468 facial landmarks. The data was created using online video snippets from three different sources (Institut Seni Indonesia Denpasar, Peliatan Dance Community, and Himpunan Mahasiswa Jurusan Pendidikan Seni Pertunjukan Institut Seni Indonesia Denpasar), featuring three dancers (Bintang Laksmi, Oming Sri, and one anonymous dancer led by Prof. Kadek Diah Permanasari). See (Video Tutorial Teknik Dasar Tari Bali (part 4), 2022), (Gerakan dasar tari Bali (Bagian 1). Movements in Balinese traditional dances (Part 1)., 2020), (Tutorial Dasar Gerak Tari Bali (Putri) – Bintang Laksmi #hindaricovid19, 2020).

Competing Interests

The author has no competing interests to declare.

Author info

Rodrigo Gonzalo Encinar, born in Spain, is a researcher, visual artist, and designer. His current research focuses on analytical explorations of representation, particularly the latent possibilities of photographic and moving image archives in the digital age for the study of movement.

He holds a BA in Design from TH Georg Simon Ohm University of Applied Sciences, Nuremberg, Germany; an MFA in Photography and Related Media from the Rochester Institute of Technology, New York, USA; and a PhD in Screen Media, Film Digital Heritage, and Dance Archives from the University of Melbourne, Australia.

At the University of Melbourne, he served as a researcher at the Digital Studio within the Faculty of Arts. Over the past 15 years, he has exhibited his work internationally while building a teaching career in Indonesia and Singapore.

References

Barthes, Roland. 1981. Camera Lucida: Reflections on Photography. (Richard Howard, Trans.). Farrar, Straus and Giroux.

Bazarevsky, Valentin, Grishchenko, Ivan, Raveendran, Karthik, Zhu, Tyler, Zhang, Fan, and Grundmann, Matthias. 2020, June 17. BlazePose: On-device Real-time Body Pose tracking. arXiv. doi: http://doi.org/10.48550/arXiv.2006.10204

Benjamin, Walter. 1986. Illuminations. (Harry Zohn, Trans.). New York: Schocken Books.

CV4ARVR 2019 Papers. (n.d.). XR @ Cornell. Accessed 16th September 2023 from: https://xr.cornell.edu/workshop/2019/papers

Davies, Stephen. 2008. The origins of Balinese legong. Bijdragen Tot de Taal-, Land- En Volkenkunde, 164(2/3), 194–211.

De Landa, Manuel. 2022. Materialist phenomenology: a philosophy of perception. London: Bloomsbury Academic.

De Zoete, Beryl. 1938. Dance and drama in Bali. London: Faber and Faber Limited.

De Zoete, Beryl. 1984. The other mind: a study of dance in South India. Ann Arbor, Mich.: University Microfilms International.

De Zoete, Beryl. (n.d.). Typescript articles and reviews including ‘Plea for Dance Study as a Branch of Anthropology.’ Rutgers University Libraries. Special Collections and University Archives. Accessed from: https://archives.libraries.rutgers.edu/repositories/11/archival_objects/150999

Diana Taylor reflexiona sobre el lugar de los archivos en línea. 2018. Accessed from: https://www.youtube.com/watch?v=FOvNRJomzeE

Face landmark detection guide | MediaPipe. 2023, May 9. Google Developers. Accessed 12th May 2023 from: https://developers.google.com/mediapipe/solutions/vision/face_landmarker

Fensham, Rachel. 2013. Choreographic Archives: Towards an Ontology of Movement Images. In Gunhild Borggreen, Rune Gade, and Heike Roms (Eds.), Performing archives – archives of performance (pp. 146–162). Copenhagen: Museum Tusculanum Press.

Fensham, Rachel. 2021. Movement. London; New York: Methuen Drama.

Feynman, Richard. 2018. Feynman lectures on computation. (Anthony J. G. Hey, Ed.) (paperback print., [repr.].). Cambridge, Mass: Perseus.

Films: Beryl de Zoete Collection. (n.d.). Horniman Museum and Gardens. Accessed 12th December 2021 from: https://www.horniman.ac.uk/object/ARC/DEZ/FILM/

Gerakan dasar tari Bali (Bagian 1). Movements in Balinese traditional dances (Part 1). 2020. Accessed from: https://www.youtube.com/watch?v=ENyztjD-NCI&t=362s

Hitchcock, Michael. 1991. Dance and drama in Bali: The photographs of Beryl de Zoete and Walter Spies. Indonesia Circle. School of Oriental and African Studies. Newsletter, 20(56), 49–55. doi: http://doi.org/10.1080/03062849108729770

Hitchcock, Michael, Norris, Lucy, Spies, Walter, and De Zoete, Beryl. 1995. Bali, the imaginary museum: the photographs of Walter Spies and Beryl de Zoete. Kuala Lumpur; New York: Oxford University Press.

I Gusti Kompyang Raka. 2022, December 19. In Wikipedia bahasa Indonesia, ensiklopedia bebas. Accessed from: https://id.wikipedia.org/w/index.php?title=I_Gusti_Kompyang_Raka&oldid=22370159

Kurkdjian, Onnes. c. 1900. Bali. Rijksmuseum. Accessed 2nd October 2022 from: https://www.rijksmuseum.nl/en/collection/RP-F-2001-17-28

Lembaga Kesenian Bali SARASWATI (@lkbsaraswati) • Instagram photos and videos. (n.d.). Accessed 31st August 2023 from: https://www.instagram.com/p/Cfkugknv7Jq/

Nadler, Steven M. 2006. Spinoza’s Ethics: an introduction. New York: Cambridge University Press.

Sandler, Mark, Howard, Andrew, Zhu, Menglong, Zhmoginov, Andrey, and Chen, Liang-Chieh. 2019, March 21. MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv. doi: http://doi.org/10.48550/arXiv.1801.04381

Taylor, Diana. 2019. Archivos Digitales. In Sol Henaro, Sofía Carrillo Herrerías, Instituto Nacional de Bellas Artes (Mexico), and Universidad Nacional Autónoma de México (Eds.), Archivos fuera de lugar: desbordes discursivos, expositivos y autorales del documento (pp. 39–46). Ciudad de México: Taller de Ediciones Económicas.

Tutorial Dasar Gerak Tari Bali (Putri) – Bintang Laksmi #hindaricovid19. 2020. Accessed from: https://www.youtube.com/watch?v=OerG336Lhac

Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., … Polosukhin, Illia. 2023. Attention Is All You Need. arXiv. doi: http://doi.org/10.48550/arXiv.1706.03762

Vickers, Adrian. 2009. When did ‘legong’ start? A reply to Stephen Davies. Bijdragen Tot de Taal-, Land- En Volkenkunde, 165(1), 1–7.

Video Tutorial Teknik Dasar Tari Bali (part 4). 2022. Accessed from: https://www.youtube.com/watch?v=sVOMcZvQSZQ

Wood, Justin N., Pak, Denizhan, Lee, Donsuk, and Wood, Samantha M. W. 2023, June 8. A newborn embodied Turing test for view-invariant object recognition. arXiv. doi: http://doi.org/10.48550/arXiv.2306.05582