Casting Silence in Final Fantasy

“What was your first Final Fantasy game?” is, for some players, akin to the question “Who was your first Doctor?” for fans of the Doctor Who franchise. My choices are Final Fantasy VII and the Ninth Doctor, respectively (although technically my first Doctor could also be the Tenth, because of Space channel re-runs, but the first season I completed was the Ninth’s). Final Fantasy VII taught my family and friends a lot about storytelling in video games, environmental pollution, corrupt politics, class stratification, and many more heady subjects both societal and personal. As a coincidental aside, my family, friends, and I took turns playing my copy of FFVII during one of the longer illegal teachers’ strikes during the Mike Harris era in Toronto. The profit-driven politics of Shinra’s power company distantly echoed Harris’ platform, and I remember feeling that Toronto resembled Midgar, with its subways, skyscrapers, and the sheer size of it from my point of view as a child.


I wasn’t necessarily aware of how these themes and messages were influencing me at the time, but two decades later I can safely say that Final Fantasy VII is a large part of why I enjoy analyzing and commenting on story-driven video games. The FF you start with inevitably sets up expectations and colours your impressions of the series’ other entries. In my case it set me up to expect a multi-layered story, idiosyncratic character design, and exciting (yet sometimes surreal) FMV cutscenes. For many people, Final Fantasy VII was one of the first games that made them connect in a deeply emotional way to a medium other than film or theatre. Think of Aeris’ death, and the resulting shockwave it sent through early internet game forums and criticism.

In spite of all Final Fantasy VII had to teach me about video games as storytelling media, however, other entries in the series had perhaps even more lessons to impart, particularly regarding the parallels between the development of video games as a story-driven medium and film as a related medium. Final Fantasy VI, VIII, IX, and X are key examples of how video games can tell compelling stories without voice-acted cutscenes.


Nowadays voice acting has become something of a requirement in video games, at least at the AAA level of production. Indie or niche AAA titles will often (whether due to stylization or limited budgets) forgo voice acting or opt for minimal vocalization to evoke a specific mood; examples include Fumito Ueda’s Ico and Shadow of the Colossus, Squaresoft’s Chrono Cross, Thatgamecompany’s Journey, Tequila Works’ RiME, and Stoic Studio’s Banner Saga. But back in the 1990s and earlier, voice acting was rarer, especially in JRPGs. Sometimes there were brief instances of voice acting (as in Xenogears), but the dubbed performances were often forgettable or downright wooden. The Final Fantasy series, however, especially from the late 90s to the early ‘00s, exhibited a certain grace in its direction of silent cutscenes. By combining creative montages featuring vibrant character gestures, exaggerated facial expressions, and dialogue bubbles outside of cutscenes that contextualize situations in a similar manner to title cards, entries like Final Fantasy VII and IX showcased what one could call video games’ equivalent of cinema’s silent films. The move from these entries to Final Fantasy X, the first “talkie” of the series, then showcases the growing pains of video games as they negotiate whether it’s worthwhile to continue relying on cinematic techniques in an interactive storytelling medium.

Scholar Janet H. Murray, whom we can thank for her hard work arguing for interactive media’s storytelling potential in her book Hamlet on the Holodeck, noted that video games’ cinematic aspirations “are a sign that the medium is in an early stage of development and is still depending on formats derived from earlier technologies instead of exploiting its own expressive power.” Murray termed this early stage of development “incunabular”, referencing the early history of book design during the advent of the Gutenberg press, and the intense 50-year period of experimentation it took to establish today’s standardized printing techniques and book formatting. Film went through this incunabular stage as well, when filmmakers were still realizing all the different ways they could exploit a camera’s unique properties (movement, lenses, celluloid film, cutting and rearranging film, etc.), which resulted in what are now labeled “photoplays.” Photoplays, like the video games of the ‘90s and early ‘00s, were an additive art form, combining photography with theatre. The camera was mostly static, and there wasn’t a whole lot of movement, sound, or editing. Only with D.W. Griffith’s financial success with The Birth of a Nation in 1915 did film actively strive to invent many new techniques to advance the complexity of cinematography. Eventually, around the golden age of silent film (the 1920s), film had learned a language or grammar of unique techniques and conventions. But it took around three decades to achieve a successful filmic grammar. With only two decades or so under their belt in the 90s to early ‘00s, video games were still incunabular because they hadn’t quite established their own grammar of techniques and conventions unique to their medium. Only now, in 2018, have we begun to see more examples of games relying more heavily and consciously on the unique properties of their systems, user interfaces, and interaction to communicate narrative.


From a relatively early point in the series, however, Final Fantasy embraced the experimental process of working within the limits of its medium to give its narrative a dramatic and even epic atmosphere. When I first played Final Fantasy VI (released also as III for its initial North American localization) on an iMac emulator in high school, my sister (who often tag-teams playing through games we’re both into) and I were shocked at how evocative the opening sequence was, with Terra and her attendant Magitek Soldiers Vicks and Wedge marching to Narshe. Despite the 16-bit pixel art and the often mouthless, noseless faces of the diminutive sprites representing the characters, the slow pan of the camera capturing the journey to Narshe over snow-swept plains possessed the same sense of gravity that later FF games’ FMVs did. FFVI was a departure from the more childish or lighthearted games of its era.

FFVI was the first of the series to feature detailed sprites that (for its time) exhibited a variety of movements and facial expressions. The game also made extensive use of Mode 7 graphics, which lent an almost 3-D quality to the world map. If we compare FFVI’s graphical experimentation to the additive art of photoplays, we can find a strange sort of parallel between early filmic storytelling and early storytelling in games. Games like FFVI can be seen as additive art in their form as well: FFVI was a video game, which back then was rarely more than an arcade-like experience, plus something akin to cinematic storytelling. One could speculate that some of the most memorable scenes, such as the stages at the opera house, were not just about Squaresoft showing off what they could do with the SNES but also about showing how games could start to push the envelope with interactive storytelling.

So, when you consider that FFVIII and IX came out only a handful of years after FFVI, these entries are somewhat staggering in their progress. I leave out FFVII from this discussion not because it doesn’t show major progress in its representational storytelling, but because unlike FFVIII and IX, the graphics didn’t achieve the same kind of convincing facial expressions or body language. The character models for FFVII were rendered just awkwardly enough that, while the super-deformed in-game models were endearing and the full-body models reminiscent of Nomura’s classic concept art, the expressiveness remains minimalist with a few exceptions. Most of FFVII’s cutscenes are more focused on rendered environments as well, with only a key few featuring the stiff movements and mostly static expressions of the character models.


FFVIII was the first game in the series to feature full-body character models in-game, as a result of experimentation in other Squaresoft titles of the era like Parasite Eve and perhaps Vagrant Story. According to an F.A.Q. on the FFVIII promo site (which was apparently still active in 2015), a team of roughly 35 people worked on VIII’s FMV sequences, utilizing both motion capture technology and manual animation to achieve more realistic character movement. Consequently, VIII’s gameplay was more closely tied to its cutscenes, making the experience feel more seamless than VII’s. The same goes for IX, which also experimented with “Active Time Event” scenes that allowed players to view other characters’ actions roughly concurrent with the main party’s timeline.

If FFVI was akin to a photoplay, FFVIII and IX were then akin to the silent films of the twenties. Both VIII and IX rely strongly on facial expression, semi-realistic gestures and movements that are easy to read, and dialogue bubbles that function as the video game version of title cards. The last point is important to note, as mileage varies on which cutscenes in VIII and IX are easy to interpret without the aid of dialogue sequences and lore discovered during exploratory gameplay. Scenes like “The Landing at Dollet” or the famous “Ballroom” scene in VIII convey simple enough narratives like a war mission launch or boy-meets-girl, but is it possible to fully understand any of the “Lunar Cry” sequences without the aid of the dialogue-filled gameplay sequences that happen in between? Furthermore, in IX there are general through lines in its cutscenes, such as uncovering Princess Garnet’s true origins as a summoner’s daughter or being on the lam from Alexandria’s forces, but both read nebulously without the contextualization of gameplay sequences, including the detailed ATEs.

Silent films have a parallel during their Golden Age, especially with German Expressionist titles like Metropolis and The Cabinet of Dr. Caligari. While there are definitely more scenes in these films that are straightforward in nature, some scenes are abstract enough that without the title cards (which in and of themselves also help to provide atmosphere through their unique typography) the context suffers. The difference between silent films and FF games, however, is that by the time of the German Expressionist movement, films had already been actively seeking to improve their storytelling techniques for over two decades, while games in the FF series had only really begun to push the medium as a storytelling one in earnest for less than a decade.

Yet there are moments of brilliance in VIII’s cinematography in particular, like the opening sequence full of surreal imagery of vast idyllic flowered fields and stark deserts. The juxtaposition of the text of Squall’s promise to Rinoa that he’ll be waiting for her with the flash forwards to fraught events involving the sorceress Ultimecia/Edea that the couple will have to endure together, and with the fateful battle with Seifer that resulted in both his and Squall’s scarring, demonstrates thoughtful foreshadowing. The way the opening handles its symbolism is echoed in other FMVs as well, like the reflection of the moon in the water before Dollet, crossed by SeeD ships that are heading into conflict. As the player will come to know later in the plot, the moon is the origin of the chaotic Lunar Cry events and where the tyrant sorceress Adel is confined in a high-tech tomb. Upon a second viewing, the reflection scene comments on how the conflicts SeeD and Balamb Garden are embroiled in during the present era are the result of Adel’s past machinations for world domination. Considering that FFVIII’s story features a climax that involves a time loop, one that spurred the well-known and debated Squall is Dead theory, it’s only fitting that motifs of reflection and causation recur throughout the game.


A final parallel between silent film and pre-Y2K to early Y2K FF games is a shared anxiety around the shift to vocalized actors or characters. When FFX was in development (impressively in 1999, the same year FFVIII was shipped), executive producer Hironobu Sakaguchi had concerns about how the addition of voice acting (alongside 3-D backgrounds and real-time storytelling) would affect the development of the game. While motion capture and skeletal animation offered new heights of realism with convincing lip movements, the voice acting proved to be a challenge for localization (both technically and linguistically). If the rhythm and timing of the English voice acting were even slightly off, the game would crash. Alexander O. Smith, who was the localization specialist for FFX, likened the dubbing constraint to writing and directing in haiku. Although this led to some infamously awkward moments, like the stilted laughing scene that launched more than a thousand memes, there were also moments of successful dubbing like Yuna and Seymour’s wedding scene, where the lip-syncing was nearly seamless and added another emotional layer to the smoothly choreographed action. Of all the changes voice acting brought to the table, however, the most interesting one by far was the decrease in Squaresoft’s typical story exposition.

Kazushige Nojima, one of the chief writers for the game’s script, commented that because voice acting was such a powerhouse for conveying emotion, he could “keep the storyline simple.” The shift to fully voice-acted games meant a dramatic decrease in textual exposition, although as fellow editor Jen Stienstra has explored, subtitles were and still are important to storytelling and gameplay in terms of accessibility. Just as film all but discarded its title cards with the shift to “talkies”, RPGs no longer relied solely on wordy dialogue boxes. Voice acting became a boon for modern RPGs, especially since titles like the Xenosaga trilogy, Lost Odyssey, and BioWare’s Mass Effect and Dragon Age series boasted impressively long scripts. Not to mention games like the Metal Gear Solid series, which took cues and inspiration from the storied history of film.

Yet even though FFX marks the end of the pre-vocal RPG era, I’d like to conclude by drawing your attention to a particularly poignant sequence that I feel captures the transition from old-school Squaresoft to millennial Square Enix: Yuna’s Sending in Kilika.


Although the voice acting and lush soundtrack add a lot of palpable emotion to the scene, you can watch this sequence silent and still have a fairly clear sense of what’s transpiring. And with the subtitles on, if you watch the Sending along with the brief dialogue between Lulu and Tidus, you get the full picture (pardon the pun). FFX was developed during the year of FFVIII’s release, so despite the jump in production quality, you can see how the team combines past silent-film-like techniques of expression with the latest technical advancements in animation to push the storytelling further.

Just as the experimentation of silent film was vital for developing a grammar of filmic techniques, so was the experimentation of pre-vocal Final Fantasy vital for advancing video games’ grammar. The late nineties to early Y2K mark a significant time for video games, as they moved one leap closer to discarding their status as an incunabular medium. After voice acting was introduced came the numerous games that strove to be cinematic in a very self-aware manner, such as Uncharted, with its Indiana Jones-style action adventure; later entries in Metal Gear Solid, which contained countless intertextual references to movie and TV history; The Last of Us; Spec Ops: The Line (which extended the commentary that war films like Apocalypse Now started); and David Cage’s games, which could be seen as the saturation point of this trend towards making video games almost entirely like a movie. Graphically, the games of this era have borrowed heavily from film editing techniques, and filters like those seen in Mass Effect lend a faux film grain look to the gameplay “footage.”


In a recent GDC talk about the end of cinematic games, Matthew Weise notes that what’s needed as game design moves forward is not to reproduce the language of film wholesale but to focus on aspects of film that inspire new interfaces and mechanics in the language of games. When games are too cinematic, they become prescriptive and predictable, and players have no agency to explore (both mechanics and game worlds). While the FF titles I’ve discussed were more on the cinematic side of the spectrum, I believe their creative directors understood what it meant to be inspired by filmic techniques instead of simply copying them. Pre-vocal FF shows us how story-driven games can work within and expand the potential of a rapidly growing language of games. And I love them for it.