Because It's Not There: Ekphrasis and the Threat of Graphics in Interactive
Fiction
The genre of interactive fiction has enjoyed increasing critical attention over the past
few years, particularly since the publication of Nick Montfort's
Twisty Little Passages: An Approach to Interactive Fiction.
[1] According to Eric Eve's definition, an
interactive fiction is “a turn-based program driven by textual
input from the player, responding with output that is principally or wholly textual,
and involving a parser and a world model”
[
Eve 2007, para. 1]. In other words, IF is a program that (1) simulates a diegetic world containing
various spaces and objects (the world model), (2) presents that world to the user/player
through the medium of unillustrated or sparsely illustrated text, and (3) permits the user
to interact with its simulated world by inputting textual commands. IF, then, is
distinguished from other genres of video games by its lack of images, and from other forms
of recombinatory or procedural textuality by its inclusion of a world model.
Up to this point, IF has typically been examined from the viewpoint of its textual and
programmatic aspects. For Montfort and others, IF descends from the canonical traditions
of riddle-making and ergodic textuality and participates in the contemporary movement of
electronic literature. According to these claims, the value of IF for scholarly study lies
in what it tells us about textuality, literariness, and the transformations of both in the
digital era. The existing critical discourse presents IF as a primarily
textual,
procedural and
ludic phenomenon — as an
art form or communicative medium which is composed of verbal signifiers that are subject
to rule-based manipulations, and which has historically been used to produce
games.
[2] With rare exceptions, critics have
neglected the visuality of IF. In this paper I will explain the necessity of rectifying
this neglect, and take tentative steps toward doing so.
The distinction between the textual and the visual, or between the verbal and the visual
signifier, is impossible to define precisely, because, as W.J.T. Mitchell argues, such
distinctions are always already political: “Every theoretical
answer to the questions, What is an image? How are images different from words? seemed
inevitably to fall back into prior questions of value and interest that could only be
answered in historical terms”
[
Mitchell 1990, 3]. For the narrow purposes of the present
analysis, we might define a textual signifier as a sign whose visual appearance is not
directly linked to its signifying value. For example, the visual appearance of the letter
P can vary, to a certain predetermined extent, without altering its semantic value.
Similarly, a novel can be set in a variety of typefaces while still being understood as
the “same” text, and a computer program will carry out the same
processes regardless of the font in which it is written. A visual signifier or image, by
contrast, is one whose semantic or affective value is linked directly to its specific
visual appearance, including its material embodiment and/or its phenomenological effect on
the viewer. I will use the term
image to refer interchangeably to
“real” images and mental visualizations. While “pictorial images are inevitably conventional and contaminated by
language”
[
Mitchell 1990, 42], the image at least tries to claim that its
meaning is contingent on its physical appearance. “The
image is the sign that pretends not to be a sign, masquerading as (or for the believer,
actually achieving) natural immediacy and presence”
[
Mitchell 1990, 43]. Obviously, this word-image distinction is as
problematic and open to critique as any such distinction; see, for example, [
Drucker 2002, 154–160] for an argument that something
is
indeed lost when the materiality of a letter is altered. I claim merely that this
distinction represents a commonsensical understanding of what distinguishes words from
images. It reflects the way in which IF critics typically understand these terms when they
don't interrogate them further.
Critics have typically paid little attention to the visuality of IF — which includes both
its use of actual visible signs, and the visual images it may evoke in the player's mind.
This may seem hardly surprising since, by the second element of the above definition, IF
consists mostly or entirely of textual signifiers and makes limited use of images. In the
present paper, however, I will suggest that interacting with IF is in fact a visual
experience in crucially important ways, and that IF therefore has important things to
teach us about the fate of the visual aspects of verbal signifiers in the digital era.
Without denying that IF participates in various traditions of potential literature and
ludic textuality, as Montfort and others suggest, I here want to suggest that IF is also
an heir to equally longstanding traditions of ekphrasis and of visual prose. As such, IF
poses questions of the relation between descriptive text and readerly visualization that
go back as far as Homer's description of the shield of Achilles — though by virtue of its
ergodic nature, IF also significantly transforms those questions. By viewing IF as a
visual-textual phenomenon, we can improve our understanding of the transformation of
visual prose and readerly visuality in the digital era.
Moreover, a focus on the visuality of IF can improve our understanding of how the genre
defines itself. A recurring concern of ekphrastic poetry is the definition of the relation
of poetry to painting and, more recently, to still photography and film. As Mitchell
argues, ekphrasis is the genre in which text (in the narrow sense given above) confronts
its other: “Ekphrastic poetry is the genre in which texts encounter
their own semiotic ‘others,’ those rival, alien modes of
representation called the visual, graphic, plastic, or ‘spatial’
arts”
[
Mitchell 1994, para. 9]. This argument expands on James Heffernan's
reading of ekphrasis as paragonal — that is, as enacting a competitive struggle between
word and image. Elizabeth Bergmann Loizeaux suggests, by contrast, that ekphrasis may also
be motivated by “such modest, and profound, feelings as
companionship or friendship, the terms in which poets often describe their ekphrastic
motives”
[
Loizeaux 2008, 15]. Under either model, however, a central
drive behind ekphrasis is the desire to define poetry or “textual” art
itself by contrast to its other. By directly addressing the image, poetry makes claims for
what it can do that the image can't, and/or asks how it can do what images seem capable of
doing more effectively.
This task becomes especially pressing in the present cultural moment. Ekphrastic
literature has perhaps always been both fascinated and repelled by the apparently superior
mimetic power of images to text. As Murray Krieger argues, ekphrasis entails “the defensive concession that language, as arbitrary and with a sensuous
lack, is a disadvantaged medium in need of emulating the natural and sensible medium of
the plastic arts,” which exists in an ambivalent relation to “the prideful confidence in language as a medium privileged by its very
intelligibility”
[
Krieger 1992, 12]. However, the more images advance in both
ubiquity and mimetic power, the more unequal the terms of this relation become. Loizeaux
observes that twentieth-century poets' interest in ekphrasis arises from ambivalent
reactions to the growing cultural importance of the image:
The
widespread presence of ekphrasis in twentieth-century poetry can be understood as both
a response to and a participant in what W.J.T. Mitchell has called “the pictorial
turn” from a culture of words into a culture of images that began in the late
nineteenth century with the advent of photography and then film, and has accelerated
since the mid twentieth century with the invention of television and, now, digital
media. Excited — and haunted — by a sense of images' increasing power in western
culture, poets have taken up ekphrasis as a way of engaging and understanding their
allure and force.
[Loizeaux 2008, 3–4]
At the same time that images have attained
unprecedented cultural power, poetry has now “further lost
popular readership and its significant social role”
[
Loizeaux 2008, 6]. Explicit confrontation with the image now
becomes a way of justifying the continued appeal, if not the very existence, of poetry
itself.
Similarly, IF authors and critics feel a need to distinguish IF from graphical video
games in order to explain why IF should continue to exist today, despite its apparent
commercial and technological inferiority to graphical video games. Graphics both threaten
and fascinate IF authors in much the same way that paintings both threaten and fascinate
poets.
As an example of the study of IF from a visual perspective, in this essay I offer
readings of two recent works of IF that represent opposing conceptions of the genre's
visual aspects. My first text, Nick Montfort's
Ad Verbum
(2000), goes further than perhaps any other work of IF in stressing the genre's textual
properties at the expense of its visual properties. In calling attention to the textual
nature of the IF interface and of the player's input,
Ad
Verbum defines itself as a purely verbal artifact. My second text, Emily Short's
City of Secrets (2003), seeks instead to accentuate its own
visuality by providing evocative descriptions accompanied by abstract imagery. Yet the
mode of visuality that this game proposes is affective, evocative and phantasmal, rather
than vivid and immediately present. This game proposes that IF can be a visual experience,
but that its visuality differs in significant ways from that of the graphical video game.
Though these games approach visuality in very different ways, a central question for both
games is whether and how the visual properties of text can compete with those of more
mimetic forms of imagery. This, I would argue, is as crucial a question for interactive
fiction as it is for ekphrastic poetry, because it touches upon the larger question of
what happens to less explicitly transparent forms of visuality and textuality at a time
when transparent forms of visuality seem to have attained a position of cultural
dominance. As I will argue, IF, like ekphrastic poetry, offers visual experiences which
are indirect, phantasmal, and dependent on the player's imagination. How can such visual
experiences compete with the transparent visual experiences offered by media like computer
games and CG film? Do we still want or need such visual experiences, and if so,
why?
[3] The two games
I'll be discussing represent two possible answers to these pressing questions.
Toward a Theory of IF Visuality
For most IF critics, IF is a verbal, textual and literary medium whose closest affinities
are with the tradition of ergodic textuality that extends from the
I
Ching and the Exeter Book, through the Oulipo and Cortázar, to hypertext
fiction. On this assumption, the visual aspects of IF, if any, are usually ignored. Espen
Aarseth, for example, treats IF as “a new type of literary
artifact”
[
Aarseth 1997, 107]. His reading of the Infocom game
Deadline considers only its literary and ludological aspects.
Montfort, the leading authority on the genre, has equally little to say about its visual
qualities. According to the historical narrative he provides, the antecedents of IF are
textual genres, including riddles and Oulipian potential literature [
Montfort 2005, 37, 65]. The major exception to this neglect of IF’s
visuality is Dennis Jerz’s article “Somewhere Nearby is Colossal
Cave,” which compares the geography of Will Crowther and Don Woods’s
Colossal Cave (or
Adventure), usually
considered the first work of interactive fiction, to the geography of the real cave on
which the game was based. In a photo-essay, Jerz juxtaposes Crowther's room descriptions
with photographs of the real-world locations on which those descriptions are based.
However, Jerz's stated goal here is to “establish that Crowther's
original was not only faithful to the geography of the real Colossal Cave, but was
also a fantasy remediation of that site”
[
Jerz 2007, para. 2]. The question that interests Jerz is the
extent to which the simulated cave faithfully reflects the real one. What he leaves
unexamined is the general question of whether the exploration of such simulated spaces can
be a spatial and visual experience.
This critical neglect of the visuality of IF seems unsurprising, given that one might
have difficulty identifying any visual aspects of the genre. What could be the importance
of visuality in a medium which, by definition, includes few or no visual images and relies
primarily on text? If we distinguish visual and textual signifiers according to the
definitions given above, the signifiers that make up a work of IF seem to fall into the
latter category, as their semantic value doesn't depend on their precise visible
instantiation. Contemporary IF interpreters give the player the option of altering details
such as the font, text color and background color, without altering either the precise
text that the program generates, or the code that generates it.
According to a common-sense understanding, the IF work
is the source code,
or perhaps the string of signifiers produced in the execution of that code, but not the
material instantiation of that code. Two players who play the same version of
Ad Verbum using the two sets of interface options shown in figures
1 and 2 are playing the “same” game; the differences in font and color
are purely cosmetic. This is analogous to the commonsensical assumption that the identity
of a literary text resides in the text — the ordered array of signifiers — and that the
material instantiation of those signifiers is merely a cosmetic feature.
[4]
Yet I argue, counterintuitively, that IF may be viewed as a visual and visual-verbal
genre. In the first place, and even before we consider the visual aspects of the IF
interface itself, a central element of nearly all works of IF is the ancient rhetorical
trope of ekphrasis. In ekphrasis, an absent object is described in terms which permit the
reader or listener to visualize that object, to “see” it in the mind's eye as if it were
physically present.
From the reader's perspective, the principal textual components of IF are room
descriptions and object descriptions. The basic purpose of both these types of texts is
to enable the player to visualize the phenomena described by the text. As Eric
Eve explains, in IF,
the physical world is generally modelled as
a series of discrete locations known as rooms. The totality of rooms in a
given work of IF is often referred to as the map. Such rooms could
correspond to rooms in a building, but they need not and frequently do not[...].
Conceptually, a room is that segment of physical space that is immediately accessible
to the player character.
[Eve 2007, para. 7]
(emphasis in original)
In other words, the typical arrangement of
space in IF is that the gameworld is divided or segmented into several discrete, mutually
exclusive chunks. Such a spatial arrangement is not unique to IF. The fifth item of Mark
J.P. Wolf's taxonomy of video game spatial structures is “adjacent
spaces displayed one at a time”
[
Wolf 2002, 59].
[5] In graphical
video games dating back to the late 1970s, such as
Superman
and
Berserk, “adjacent spaces or
rooms are displayed as a series of nonoverlapping static screens which cut directly
one to the next without scrolling”
[
Wolf 2002, 59]. However, in a text adventure game, by
definition, these chunks of space cannot be represented by onscreen images.
[6] Instead, a block
of onscreen text — the “room description” — is used to make the player
aware of the relevant properties of the present room, including the exits from that room
and the objects it contains. The room description might be said to take the place of the
absent graphical image of the room, although this formulation is anachronistic insofar as
IF predates graphical adventure games. Furthermore, the image of the room is not
“absent” in the sense of having been removed or abstracted, inasmuch
as it never existed to begin with.
Consider, for example, the following room description from
Zork I:
The Great Underground Empire:
You are in the living room. There is a
doorway to the east, a wooden door with strange gothic lettering to the west, which
appears to be nailed shut, a trophy case, and a large oriental rug in the center of
the room.
Above the trophy case hangs an elvish sword of great antiquity.
A battery-powered brass lantern is on the trophy case. [Blank et al. 1980]
This text names
the room and enumerates all the visible exits from the room (the doorway and the wooden
door) and the visible objects in it (the door again, the trophy case, the rug, the sword
and the lantern). These objects are all “implemented.” That is, they
are defined in the game’s source code as objects that have certain properties, one of
which is that the avatar may be able to interact with them. The description mentions no
objects that aren’t implemented (although room descriptions often do mention such
objects), and it does not fail to mention any visible objects that are implemented.
The qualifier “visible” is necessary because there's a trap
door under the rug. This object is left unmentioned because on first entering the room,
the avatar can't see it. Finding the trap door (by moving the rug) is a puzzle. The player
may well know about the trap door before moving the rug, perhaps from having played the
game before, but such knowledge does not extend to the avatar. If the player inputs a
command referring to the trap door before moving the rug, the game responds, “You can’t see any trap door here!” In this case the player may be
able to visualize the trap door under the rug, and perhaps the avatar can even imagine
that there's a trap door there, if we imagine the avatar as being capable of having
cognitive operations that the player doesn't share. However, the avatar still can't
see the trap door in the sense that it is not physically within his or her
visual field.
[7] Thus the room
description represents what the avatar, not the player, sees when he or she looks around
the room. It is a translation of the avatar's direct visual experience into words. The
player then has the opportunity to back-translate those words by activating the faculty of
readerly visuality — by forming an imaginary visualization of the things the avatar
sees.
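The gap between what the world model contains and what the avatar can see can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the Zork source: the class names, the `concealed` flag, and the `examine` and `move_rug` helpers are all hypothetical, and only the refusal message echoes the game text quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    name: str
    concealed: bool = False  # hypothetical flag: True while the avatar cannot see it


@dataclass
class Room:
    name: str
    contents: list = field(default_factory=list)

    def visible(self):
        """Return only the objects within the avatar's visual field."""
        return [t for t in self.contents if not t.concealed]

    def describe(self):
        """Translate what the avatar sees into words."""
        names = ", ".join(t.name for t in self.visible())
        return f"You are in the {self.name}. You see: {names}."


def examine(room, noun):
    """Refuse any noun naming an object the avatar cannot currently see."""
    for thing in room.visible():
        if thing.name == noun:
            return f"You look at the {noun}."
    return f"You can't see any {noun} here!"


def move_rug(room):
    """Moving the rug brings the trap door into the avatar's view."""
    for thing in room.contents:
        if thing.name == "trap door":
            thing.concealed = False
    return "Moving the rug reveals a trap door."


living_room = Room("living room", [
    Thing("trophy case"), Thing("rug"), Thing("sword"), Thing("lantern"),
    Thing("trap door", concealed=True),
])
```

In this sketch the trap door exists in the model from the start, but both `describe` and `examine` consult only the visible subset, so the player's prior knowledge of the door has no effect until the rug is moved.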
The primacy of
seeing in IF is indicated by the ubiquitous presence of light
sources in
Adventure and games descended from it. Exploration
can't take place in the absence of light, and light source conservation and transport are
common puzzle themes. As Jeremy Douglass observes, this made sense in
Adventure
“as it is highly dangerous to wander around cave systems in the
dark”
[
Douglass 2007, 132], but the need for light sources
subsequently became divorced from its original context and evolved into a generic
convention. Games like Taro Ogawa's
Enlightenment (1998),
where the player's goal is to extinguish all the light sources in a room, or Andrew
Plotkin's
Hunter, in Darkness (1999), where exploration takes
place via senses other than sight, are deliberate reactions against this primacy of sight
[
Douglass 2007, 134]. The default assumption in IF is that the avatar
experiences the gameworld through the visual faculty, and that the text presents the
avatar's visual experience to the player.
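The convention that description depends on light can be illustrated with a short Python sketch. This is an assumption-laden toy rather than the logic of any particular game: the `Lamp` class, its battery counter, and the wording of the darkness message are all invented for illustration.

```python
class Lamp:
    """A portable light source with a finite battery (numbers are illustrative)."""

    def __init__(self, turns_left):
        self.turns_left = turns_left
        self.on = False

    def tick(self):
        """Burn one turn of battery while lit; switch off when exhausted."""
        if self.on:
            self.turns_left -= 1
            if self.turns_left <= 0:
                self.on = False


def look(description, room_is_lit, lamp):
    """Return the room text only when some light reaches the avatar's eyes."""
    if room_is_lit or lamp.on:
        return description
    # Hypothetical darkness message; real games word this differently.
    return "It is dark. You can't see a thing."
```

Because the room text is withheld whenever no light is available, conserving the lamp's battery becomes a precondition for receiving any description at all, which is why light management can serve as a puzzle.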
As translations of visual objects in the medium of language, IF room descriptions (and
object descriptions, of which room descriptions are special cases) are examples of
ekphrasis. In current critical discourse ekphrasis is most often defined as the verbal
description of a visual work of art, but Janice Hewlett Koelb argues that this meaning of
the term is a twentieth-century invention, dating back no earlier than Leo Spitzer’s 1955
essay on Keats's “Ode on a Grecian Urn”
[
Koelb 2006, 2]. Ancient rhetoricians defined ekphrasis as “[a] speech which leads one around (
periegematikos
) bringing the subject matter vividly
(
enargos
) before the
eyes”
[
Koelb 2006, 23], whatever that subject
matter might be. IF games like
Zork certainly meet this
definition. The degree of vividness (or
enargeia
) with which the subject matter is “brought before the eyes” is a factor that varies between different games, and
also between different players, since players might visualize the gameworld more
or less vividly depending on how visually inclined they happen to be. On an anecdotal
level, I tend to visualize extensively when I play IF games, but I know other IF players
who claim that they don't do so, and that they understand room descriptions in a
conceptual or propositional way. However, I suggest that IF games must supply the
potential for visualization in order to provide a meaningful play
experience.
[8] What we might call
visualizability is a basic requirement for traversing most if not all
interactive fictions, especially those that include multiple rooms or rooms with multiple
objects in physical contact with each other. In order to productively interact with the
gameworld, the player must possess at least a minimal understanding of the spatial
relationships between the objects in each room and between the rooms themselves. This
requires constructing a mental (or actual) map, which is, to a substantial degree, a
visual operation. As Eve observes, “[t]he totality of rooms in a
given work of IF is often referred to as the
map
”
[
Eve 2007, para. 7, emphasis in original],
and this is “probably because someone
designing a work of IF containing more than a handful of rooms almost certainly needs
to draw a map indicating their spatial relations before attempting to write the game,
and players often find it useful to draw schematic maps as they play”
[
Eve 2007, fn5].
When visualizability breaks down — that is, when room and object descriptions fail to
accurately represent what the avatar can see — meaningful play and the ability to traverse
the game successfully may be impeded.
[9] This may happen, for example, when an object
mentioned in a room description is not implemented. By convention, if the player tries to
interact with such an object, the game responds that the object is not important.
Sometimes, however, the game fails to acknowledge the object’s existence and instead
outputs a standard response to commands that reference nonexistent objects, such as “You can’t see any such thing” or “I don’t see
that here.” This behavior is generally considered a design flaw or even a bug, as
Eve explains: “It looks very clumsy if, having told the player that
the room is decorated with striped wallpaper, the game responds with ‘You see no such
thing’ when the player tries to examine it”
[
Eve 2007, para. 15].
[10] Such behavior creates a gap between the visual experience
of the avatar and the verbal experience of the player. Somehow, the player can read about
things the avatar can’t see, and this destroys the illusion that the room description
represents the avatar’s visual experience.
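The difference between the conventional response and the flawed one can be shown in a few lines of Python. This is only a sketch: the two tables and the `examine` function are hypothetical, and the response strings paraphrase the conventions described above rather than quoting any specific game or authoring system.

```python
# Objects the avatar can interact with (hypothetical sample data).
IMPLEMENTED = {"lantern": "A battery-powered brass lantern."}

# Objects named in the room text but given no behavior of their own.
SCENERY = {"wallpaper"}


def examine(noun):
    """Answer EXAMINE so that every object named in the room text is acknowledged."""
    if noun in IMPLEMENTED:
        return IMPLEMENTED[noun]
    if noun in SCENERY:
        # Conventional reply: the object exists but does not matter.
        return f"The {noun} is not important."
    # Reserved for nouns that name nothing in the room at all.
    return "You can't see any such thing."
```

Deleting the `SCENERY` check reproduces the flaw Eve describes: the wallpaper would fall through to the final reply, and the player would read about something the avatar supposedly cannot see.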
An opposite but perhaps more egregious breach of visualizability occurs when the text
fails to mention objects that are implemented and that the avatar should be able to see.
For example, in Dave Baggett and Carl de Marcken's 1994 game
+=3, the avatar must give three objects to a troll as a toll to cross a bridge.
The INVENTORY command reveals that the avatar is holding just one object, and the game's
single room contains no other objects that can be acquired. The solution is to take off
the avatar’s shirt, shoes, pants, socks, glasses and/or underwear, thereby supplying the
missing two items. This solution, though perfectly logical, is cruelly unfair because none
of these articles of clothing are referred to anywhere in the game.
[11] In particular, they aren't mentioned in the responses to the commands
INVENTORY and EXAMINE ME. According to conventions which were well established by 1994,
experienced players would thus conclude that the avatar was wearing nothing important,
because on looking at himself or herself, the avatar sees nothing worth mentioning. The
player would assume that the avatar is wearing clothes (otherwise the avatar's nudity
would be mentioned), but that the clothes have no relevance to gameplay. Objects left
unmentioned are assumed to be below the avatar’s perceptual threshold, and thus either
nonexistent, or irrelevant to the task of traversing the game. The underlying assumption
here is that everything the avatar sees will be translated into descriptive text. In
violating this assumption,
+=3 precludes meaningful play.
Thus, IF is an ekphrastic medium because it consists of texts which describe visual
phenomena and which prompt the reader to create imaginary visualizations of those
phenomena. However, IF differs from other ekphrastic media by virtue of being
prescriptive rather than
autotelic. The reading of a static
ekphrastic text, like Diderot's
Salons or Ruskin's
word-paintings, is a self-contained experience.
[12] These texts describe absent
visual phenomena in such a way as to permit the viewer to visualize them, but they do not
prompt the reader to take any action in response to these visualizations. The experience
of imagining what the text describes is its own reward. By contrast, when an IF player
reads a room or object description, he or she is expected to take an action in response
(i.e. to do work, hence the term
ergodic). The player is prompted to give
commands to the avatar based on the visual and other information in the description.
My argument, thus, is that ekphrasis is the characteristic mode of visual representation
in IF. During the commercial era of IF (approximately coinciding with the lifespan of
Infocom, from 1979 to 1989), ekphrasis, as a means of visual rendering, had certain
comparative advantages over graphics. Graphical video games predate
Colossal Cave by at least 15 years, but these games ran on mainframes or
dedicated arcade machines. The creation of sophisticated graphics was beyond the
technological capabilities of contemporary home computers. Displaying text was much less
labor-intensive. For example, the first commercially successful portable computer was the
Osborne 1, released in 1981. This computer had a monochrome screen which was incapable of
displaying bitmap graphics [
Wikipedia 2009]. On such a platform, a visual
depiction of a building with keys, a brass lamp, food and water on the ground would have
been out of the question. Text made it possible to “show” visual
phenomena that could not have been depicted with the graphic resources available.
Furthermore, text was far more cross-platform than graphics. Infocom games were designed
for the Z-Machine, a “software computer [which] could be implemented
on many different platforms, including almost all of the popular microcomputers in the
United States during the 1980s” including business machines as well as dedicated
gaming machines [
Montfort 2005, 126]. Since all of these computers were
capable of displaying text, all the Infocom games could be ported to any platform at once
simply by writing a new interpreter for that platform. The use of graphics, by contrast,
would have made such cross-platform availability all but impossible.
[13]
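The portability argument rests on a simple architectural idea, which the following Python sketch tries to make concrete. It is a toy under loud assumptions: the two opcodes, the `run_story` function, and the `make_platform` helper are invented here and bear no relation to the actual Z-Machine instruction set.

```python
def run_story(story, emit):
    """Execute a toy story file; `emit` is the only platform-specific part."""
    stack = []
    for op, arg in story:
        if op == "PUSH":
            stack.append(arg)
        elif op == "PRINT":
            emit(stack.pop())
        else:
            raise ValueError(f"unknown opcode: {op}")


# The same story data is shipped unchanged to every platform.
STORY = [("PUSH", "You are in the living room."), ("PRINT", None)]


def make_platform(transform):
    """Build a hypothetical text display that applies its own rendering quirk."""
    lines = []

    def emit(text):
        lines.append(transform(text))

    return lines, emit


# Two imagined machines: one with an uppercase-only display, one without.
screen_a, emit_a = make_platform(str.upper)
screen_b, emit_b = make_platform(lambda text: text)
run_story(STORY, emit_a)
run_story(STORY, emit_b)
```

Only the `emit` callback changes from one machine to the next, which is the sense in which text output kept the per-platform work small; a graphical game would have needed a far larger platform-specific layer.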
For these and other reasons, the use of ekphrasis rather than graphics made the
commercial success of IF possible. According to the standard view of the genre’s history,
however, IF's reliance on text was also the cause of its commercial decline.
[14] Over the
course of the 1980s, as the graphical capabilities of home computers advanced, the new
genre of the graphical adventure gradually rendered IF obsolete.
[15] According
to Espen Aarseth, this was a natural succession because graphics, compared to ekphrasis,
are a naturally superior mode of visuality: “Images, especially moving
images, are more powerful representations of spatial relations than texts, and therefore
this migration from text to graphics is natural and inevitable”
[
Aarseth 1997, 102].
[16] By Aarseth's logic, the purpose of a
game is to serve as a transparent window into an imagined space. According to what Bolter
and Grusin call the logic of transparency [
Bolter & Grusin 2003], the game seeks to
erase its own materiality and present the player with a vivid, sensuously present
experience of existence in another world. For this purpose to be fulfilled, the gameworld
must be presented with maximum visual richness. Clearly, games that translate the avatar's
visual experience into text do all these things less effectively than games that display
the avatar's visual experience onscreen.
The assumption here is that video game history follows a teleological progression from
lesser to greater transparency. IF becomes commercially unviable because it represents an
earlier stage in this progression. For some authors, this is only natural: the fact that
computer graphics have outstripped the capacities of IF is cause for celebration. An
example of such a view is Julian Dibbell's dismissive description of
Adventure as an inferior precursor to
Myst:
“It's hard to believe that that world once represented the
high frontier of computer gaming. Where players of latter-day quests like
Myst point-and-click their way through complex graphical
environments of an almost liquid radiance […]
Adventure
was strictly hunt and peck”
[
Dibbell 2001]. Other authors characterize the gaming industry's
ideology of transparency as unfortunate, and describe IF nostalgically as having been
sacrificed on the altar of progress. Aarseth regrets that the text adventure game, a
“young, vigorous, if somewhat bland tradition of textual
entertainment [...] was quickly overrun by the entertainment market”
[
Aarseth 1997, 128]. More recently, Andy Klien began a 2005
article on IF by writing, “Only once in my life have I seen a
wonderful medium effectively wiped out by new technology”
[qtd. in Douglass 2007, 21–22].
Yet interactive fiction still exists today, when the graphical capabilities of personal
computers are far more sophisticated than at the time of IF's commercial collapse. New IF
games are now produced by independent hobbyists and artists rather than by commercial
firms. However, for contemporary IF authors graphics represent an elephant in the room, a
topic that may not be directly discussed but that can't be ignored. Authors of IF in the
post-graphical era cannot avoid the question of why they should bother, since graphics are
now better than IF text at doing what IF text does, for which reason IF will probably
never again be a commercially viable medium. By way of answering this question, IF authors
and critics have sought to claim for IF another type of legitimacy, emphasizing its
aesthetic and scholarly appeal rather than its commercial appeal. If IF can't be a popular
and commercial medium, it can be an auteurist and artistic medium. But in order to prove
the aesthetic legitimacy of IF, it becomes necessary to show that IF is an independent
medium from the graphical video game because IF text has properties that graphics
lack.
Where contemporary IF authors and critics differ is in their conception of the precise
nature of these distinctive properties of IF. Within contemporary IF work we can
distinguish two very different approaches to defining the specificity of the genre. The
first approach is to argue that IF is a linguistic and anti-visual medium.
Ad Verbum: Interactive Fiction and Representational Friction
One way in which IF responds to the seemingly superior representational capabilities of
graphics is by ignoring ekphrasis almost entirely and foregrounding the textual and verbal
qualities of the IF interface. The paradigmatic example of this approach is Nick
Montfort's 2000 game Ad Verbum.
The player's goal in this game is to remove all the objects from a house belonging to the
Wizard of Wordplay. Nearly all of the game’s puzzles must be solved by entering commands
according to various linguistic constraints. Exploiting Bolter and Grusin's logic of
hypermediacy, this game forcibly reminds the player of its nature as a text-based computer
program, rather than a window into a simulated world. This is evident immediately in the
introductory text of the game:
With the cantankerous Wizard of
Wordplay evicted from his mansion, the worthless plot can now be redeveloped. The
city regulations declare, however, that the rip-down job can't proceed until all the
items within have been removed.
That's what the demolition contractor explains to you, anyway, as you stand eagerly
on the adventurer's day labor corner. Once he learns of your penchant for
puzzle-solving and your kleptomaniacal tendencies, he hires you for the job. You hop
into the bed of his truck, type a few Zs, and arrive at the site, eager
… [Montfort 2000]
“Z” is the standard abbreviation for the
“wait” command, so the last sentence erases the boundaries between
player and avatar, between typing commands and performing actions. Throughout the game the
player is consistently reminded that he or she is not exploring a diegetic world, but
typing commands in response to verbal descriptions. Some of
Ad
Verbum's puzzles in fact involve no interaction with objects or spaces, only
manipulation of language. For example, on the first floor of the mansion, the player
encounters a little boy, Georgie, who refuses to give up his toy dinosaur unless the
player can name more dinosaurs than Georgie can. Georgie knows an arbitrarily large number
of real dinosaur names, so the solution is to input fake dinosaur names — i.e. nonsense
words ending in “saur” or “saurus” — until Georgie
gets frustrated and gives up. Since all the player has to do to solve this puzzle is think
of nonsense words, it doesn't matter whether or how the player visualizes the space where
Georgie is located.
Other puzzles in the game do force the avatar to interact with rooms and objects, but in
order to make the avatar do so, the player has to satisfy certain linguistic constraints.
Most notably, the game contains several “constrained rooms” where the
output text consists entirely of words starting with a specific letter. For example, at
the bottom of
figure 4 we see the initial room description
of the “Wee Wardrobe.”
This same
constraint applies to the player's input. Obvious solutions like TAKE WEAPON don't work;
if the player enters a command containing a word that doesn't start with W, the parser
replies, “Wha? Wha? Withhold wrong words. Write wholesomely.”
The puzzle, therefore, is to command the avatar to take the two objects in the room and
then leave, using only words beginning with W.
[17] This constraint applies even to nondiegetic
commands like HINT, SAVE, RESTART, RESTORE and QUIT, and on first entering a constrained
room, the player must read a warning alerting him or her to this fact.
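As a rough illustration of how such a constraint operates at the level of the program, the check can be sketched in a few lines of Python. This is not Montfort's code: the function names and structure are hypothetical, and only the rejection message is taken from the game as quoted above. The sketch simply assumes that a command is split into words and that each word is tested against the room's letter.

```python
import re

# Reply quoted in the text for the W room.
REJECTION = "Wha? Wha? Withhold wrong words. Write wholesomely."

# Placeholder for whatever the ordinary parser would do next (an assumption).
PASSED_ON = "(command passed on to the ordinary parser)"


def satisfies_letter_constraint(command: str, letter: str) -> bool:
    """Return True if the command has at least one word and every word starts with `letter`."""
    words = re.findall(r"[A-Za-z]+", command)
    return bool(words) and all(w[0].lower() == letter.lower() for w in words)


def respond_in_constrained_room(command: str, letter: str = "w") -> str:
    """Reject any command that breaks the room's constraint, diegetic or not."""
    if not satisfies_letter_constraint(command, letter):
        return REJECTION
    return PASSED_ON
```

Under these assumptions, TAKE WEAPON is rejected because TAKE does not begin with W, and so is a nondiegetic command such as SAVE. Nothing in the check refers to the room's contents or appearance; it inspects only the letters the player has typed.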
The constrained rooms call attention to the fact that the world of this game is a
linguistic construct, a tissue of words and letters. Of course, this is true in a sense of
the diegetic world of any IF game: the white house in
Zork
doesn't exist independently of the language that describes it.
[18]
Ad Verbum’s innovation is to make explicit the linguistic
nature of the IF gameworld. Since the spaces of
Ad Verbum are
called into being by language, it's logical that these spaces can have linguistic
properties, like the property of only containing objects that start with W. However, by
virtue of being defined in purely verbal terms, these spaces resist translation into
images. What would a room look like if it contained only things beginning with S?
The first letter of an object’s name is not a property which can be perceived by looking
at it, especially if the object has various possible names. One can imagine a space based
on the physical form of a letter — for example, an S room where the walls, ceiling and
furniture have sinuous, snaky curves, or a V room full of sharp, severe triangles. But
there is no suggestion that the constrained rooms in
Ad
Verbum are organized according to the visual properties of their corresponding
letters. These are entirely linguistic spaces, and the language of which they are composed
is in a sense stripped of visuality. In
Ad Verbum, a letter
is defined purely in relational terms, as a member of a set with 26 members. The question
of the physical instantiation of letters is ignored.
[19]
If descriptions in IF are translations of what the avatar sees into words, the Ad Verbum avatar sees things that can't be seen — for example,
what letter an object starts with, or whether it contains the letter E. This avatar’s
visual experience is fundamentally anti-visual. So the game frustrates the player’s
ability to imaginatively reproduce the avatar’s visual experience. If the things the
avatar “sees” are unseeable, the player can't imagine what it's like to
see those things. This forcibly reminds the player that IF is at bottom a linguistic and
programmatic rather than a spatial experience.
Montfort thereby demonstrates that the world represented in an IF game is dissimilar to
the material, namely language, that represents that world. This is what James Heffernan, a
scholar of ekphrastic poetry, describes as the trope of representational friction, in
which the ekphrastic poem calls attention to the artificiality of the artwork it describes
[
Heffernan 2004, 4, 18–19, 37]. For example, Homer's description of
the shield of Achilles includes the statement that “the earth
darkened behind [the ploughmen] and looked like earth that has been ploughed / though
it was gold”
[
Heffernan 2004, 19]. At the same time that Homer celebrates
the amazing power of art to reproduce reality, he reminds the reader that the work of art
is ontologically dissimilar to the reality it reproduces. Homer celebrates “the wonder [...] of graphic verisimilitude” specifically by telling
the reader “that what appears on the shield is not the ploughed
earth itself, but gold that has been somehow made dark enough to resemble it”
[
Heffernan 2004, 19]. Because the shield is made of gold, not
dirt, it can represent dirt only via artifice and convention. By analogy, because poetry
is made of language and not images, it can represent images only through a similar
artifice. Representational friction, thus, is a trope that foregrounds the dissimilarity
between the descriptive poem and what it describes. It reminds the reader that the poem is
a poem, not a painting or sculpture: that the reader is not beholding a physically present
picture, but imagining a picture based on his or her interpretation of graphic signifiers.
Representational friction reminds the reader of the nature of the activity he or she
performs in reading a poem. It defines the specificity of poetry as distinct from painting
and sculpture.
But of course IF players perform an activity that readers of poetry typically don't. In
IF, the player does more than interpret signifiers; he or she also enters commands in
response to those signifiers. These commands produce changes, often of a permanent nature,
in the diegetic gameworld, and thereby determine what signifiers will be given for
interpretation next. Montfort also reveals the verbal nature of the process of entering
commands. The standard conceit is that when the player types a command, this is equivalent
to, and can be visualized as, the avatar performing that action. When I type “take lantern” and press the enter key, I may imagine that my avatar
reaches out his or her hand and takes the lantern. Of course, what actually happens is
that the game program interprets the words “take lantern” as an
action, then checks whether the action can succeed in the present condition of
gameplay. If it can succeed, the lantern is moved from its current position and added to
the player's inventory [
Nelson 2001, 87]. But when Montfort places
constraints on the player's ability to enter commands, he reminds the player that commands
don't actually involve interaction with objects in or attributes of a diegetic world; all
they involve is the generation of signifiers. One puzzle requires the avatar to acquire
four books using commands that follow the linguistic constraints used in the text of the
books. For example, the “dust casing” does not accept commands that
include the letter E, and the “abecedarian book” only accepts commands
in which the first word starts with A and the second word starts with B. If the player
tries to take these books using inappropriate commands, “a mysterious
force holds the book to the … shelves.” Possible solutions include ACQUIRE BOOK
and LIFT CASING.
[20]
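The sequence just described (interpret the words as an action, check whether the action can succeed, then move the object into the inventory) can be sketched in Python, along with the two book constraints. This is a hypothetical illustration, not the game's actual source: the `World` class, the set of verbs treated as synonyms for taking, and the response strings are all assumptions made for the sketch, and the refusal message is a placeholder rather than the game's wording.

```python
from dataclasses import dataclass, field
from typing import Callable

# Verbs this sketch treats as synonyms for taking (an assumption).
TAKE_VERBS = {"take", "get", "acquire", "lift"}

TAKEN = "Taken."
REFUSED = "(a mysterious force refuses)"  # placeholder, not the game's text
NOT_UNDERSTOOD = "(not understood)"


def no_letter_e(command: str) -> bool:
    """Constraint described for the dust casing: the command may not contain E."""
    return "e" not in command.lower()


def abecedarian(command: str) -> bool:
    """Constraint described for the abecedarian book: first word A..., second word B..."""
    words = command.lower().split()
    return len(words) >= 2 and words[0].startswith("a") and words[1].startswith("b")


def unconstrained(command: str) -> bool:
    """An ordinary object accepts any wording."""
    return True


@dataclass
class World:
    # Maps each object's noun to the linguistic constraint guarding it.
    room: dict = field(default_factory=dict)
    inventory: list = field(default_factory=list)


def take(world: World, command: str) -> str:
    """Parse a two-word command, check it, and move the object on success."""
    words = command.lower().split()
    if len(words) != 2 or words[0] not in TAKE_VERBS or words[1] not in world.room:
        return NOT_UNDERSTOOD
    noun = words[1]
    constraint: Callable[[str], bool] = world.room[noun]
    if not constraint(command):
        return REFUSED  # the check looks at the wording, not at the object
    del world.room[noun]
    world.inventory.append(noun)
    return TAKEN
```

With a world containing a lantern, the casing and the book, TAKE LANTERN succeeds, TAKE CASING is refused because TAKE contains an E, and LIFT CASING and ACQUIRE BOOK succeed. The four verbs lead to the same change of state; what differs is only whether the typed string passes the test.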
In the context of obtaining a book, the words TAKE, GET, ACQUIRE, and LIFT all describe
the same action. When I pick up a book, I can use any of these verbs interchangeably to
describe what I'm doing. But in Ad Verbum, the “mysterious force” that governs the books will accept only some of
these actions and not others. The force allows the avatar to rip the casing or uproot the
copybook but not take or get them, merely because the former two actions satisfy the
constraint and the latter two don't, even though the four actions are not semantically
distinguishable and can all be visualized in the same way. Here Montfort is deliberately
subjecting the player to the notorious “guess the verb” situation,
where the player knows what he or she wants the avatar to do, but has difficulty finding
the specific verb that tells the avatar to do it. When this phenomenon occurs in games,
players typically see it as a design flaw, because it violates the logic of transparency. In
real life, if one knows what one wants to do and if one is physically capable of doing it,
one can simply do it. In a graphical video game, the player can just press the button that
makes the avatar take the desired action. So why should it be any different in an IF game?
Though this is a rhetorical question, Montfort answers it by arguing that an IF game does
not follow the procedures of real life, nor those of a graphical video game. An IF game is
neither the real world nor a transparent representation thereof, but rather a computer
program in which both the input and the output consist entirely of text.
In Ad Verbum, representational friction and guess-the-verb
puzzles ultimately serve to define the specificity of IF as opposed to graphical video
games. Since IF is clearly incapable of competing with graphical video games in terms of
commercial appeal, Montfort seeks to claim for IF another type of legitimacy in terms of
aesthetic or academic appeal. Montfort does this by stressing that the visual and spatial
aspects of IF are metaphorical, not literal, because IF is a fundamentally linguistic
medium. IF is an independent and aesthetically legitimate medium because of, not despite,
its lack of graphics. Contemporary IF is not an atavistic throwback to the era before the
graphical video game, but an artistic medium in its own right. By situating IF as a
textual medium, Montfort is also able to connect it to earlier, more canonical forms of
ludic textuality. Thus, Ad Verbum contains explicit
references to famous constrained texts like Walter Abish's Alphabetical Africa and Georges Perec's La
Disparition. In Twisty Little Passages, Montfort
continues this project by arguing that IF has important similarities to the literary genre
of the riddle.
Montfort doesn't refute the allegation that computer graphics are more effective in some
ways than words at representing the contents of fictional spaces. He tacitly accepts this
critique and suggests that the true strength of IF lies elsewhere, in its ability to
manipulate the material of language, an ability that graphical video games lack. If the
graphical video game is a visual medium, then IF is a textual medium. Visual effects are
the proper province of graphical games, while textual effects are specific to IF.
A similar strategy is at work in many other more recent games that exploit the textual
properties of the IF browser, although I don't know of any other game that does this to
the same extent as
Ad Verbum. For example, Jeremy Freese's
Violet, the winner of the 2008 Interactive Fiction
Competition, features a parser which is personified as the avatar's eponymous girlfriend.
This effect is possible in IF because the parser is simultaneously the voice of a narrator
and the means by which the diegetic world is presented to the player. The parser not only
narrates the events of the gameworld, but actually produces that world for the player. In
graphical video games, these two functions are separated. If
Violet were a graphical game, Violet would be no more than what André
Gaudreault calls a delegated narrator (see [
Gaudreault & Barnard 2009, 135–146]).
It would be difficult to create the illusion that Violet was actually creating the
gameworld by speaking about it.
Moreover, if IF is an independent artistic medium in its own right, rather than an
atavistic precursor of graphical video games, then it becomes reasonable to use IF for
purposes other than gaming. This is the idea behind the genre of puzzleless IF, which uses
IF scripting languages but often abandons the elements of spatial exploration and
puzzle-solving. The classic example of puzzleless IF is Adam Cadre's Photopia (1998), and the genre also includes sophisticated chatbots like Emily
Short's Galatea (2000).
But Montfort's strategy of stressing the linguistic and anti-visual properties of IF is
only one way of arguing for the aesthetic legitimacy of the genre. Another approach is to
argue that IF is in fact a visual genre, but that it possesses a type of visuality which
is in some degree unavailable to graphical games. By coincidence, one of the key advocates
of this approach to IF is the aforementioned Emily Short.
[21] Eve probably chose to use wallpaper as an example because of Andrew
Plotkin’s game
Delightful Wallpaper, in which the avatar is
an incorporeal ghost, and is thus unable to interact with the titular wallpaper or with
any other object. Nonetheless, Plotkin includes many implemented objects in the game and
goes to the trouble of including descriptions for all these objects. According to one
reviewer, it was precisely these descriptions that made Plotkin's game more than a mere
puzzlefest [
Bond 2006].
Affective Ekphrasis in City of Secrets
City of Secrets (2003) is a game about spaces. For most of
this game the avatar's goal is simply to explore the setting of the game, known only as
the City, in order to find a mysterious woman named Evaine. The game's puzzles are mostly
about overcoming barriers to further exploration, and the primary reward the player gets
for solving these puzzles is the ability to explore previously unseen spaces. The City
itself is inherently worth exploring because it's a tourist destination, a place of great
historical and cultural importance. Short's innovation in City of
Secrets is to encourage the player to see this space rather than simply read
about it. Short's descriptive language is precise and detailed, but also deliberately
limited in what it reveals. By deliberately restricting the visual
information she provides, Short encourages the player to supply this information by
exercising the faculty of readerly visuality.
The descriptions reproduced in
figure 5 accomplish the
primary practical tasks of an IF room description: they enumerate the exits from each room
and the implemented objects in them, thereby making this part of the game's geography
visualizable. However, the descriptions are in no way ultraprecise; they provide
insufficient information to permit the player to visualize exactly what these spaces look
like. Short neglects to describe the architectural style of the buildings or to specify
the number of buildings or the things depicted in the statues. This omission of detail is
a deliberate choice on Short's part, since she has also written descriptions which are
obsessively detailed. Her 2000 game
Metamorphoses contains a
number of murals which can be both examined and looked at through a magnifying glass,
revealing additional details which can themselves be examined. Short comments, “In writing
Metamorphoses I did think of
what I was doing as specifically ekphrasis, and that’s one reason there are so many
layers of detail within the scenery, especially the murals: I was trying to capture a
little of the sense, found in Ovid and Catullus, that worked pictorial objects have
astounding levels of detail”
[
Short 2009].
What happens instead in
City of Secrets is that the omission
of details from the text creates gaps in the player's visualization of the scene, gaps
which the player then has the opportunity to fill. As Wolfgang Iser has argued, filling in
gaps in a text is one of the major cognitive operations performed by readers. Iser
characterizes this process as a propositional or linguistic one, but Peter Schwenger, a
theorist of readerly visuality, suggests that readers perform this process with images as
well as words. Schwenger notes that Iser “speaks of syntheses
below the level of consciousness, which he calls ‘passive syntheses’. Of such
syntheses the basic element is the image”
[
Schwenger 1999, 57]. Another way to theorize this process is
through Scott McCloud's concept of closure, the process whereby the reader of a comic
creates mental images that fill the gaps (or gutters) between the comic's panels [
McCloud 1993, 66–68]. If the concept of closure was designed to account
for texts that consist of sequences of images, then it applies to the IF text insofar as
IF, as encountered by the player, involves precisely such a sequence.
[22] As explained above, in playing IF the player is presented with a series of visual
experiences translated into verbal terms. Closure is what sutures the gaps in this
sequence of disparate images.
Schwenger and Iser's visual “filling in,” which operates when
we read a verbal narrative, is closely analogous to McCloud's “closure,” which operates when we read a narrative composed of images. Both these
modes of reading involve a synesthetic interplay between the viewer's imagination and the
signifiers of the text, whether these signifiers are defined as visual or verbal in
nature. Indeed, the similarity of “closure” to “filling in” suggests that these two modes of reading are less
distinct than they may appear — that the decision of whether to define a narrative as
visual or verbal is to some extent an arbitrary decision, one which is influenced by
cultural politics as well as by the phenomenology of the reading experience. Even if we
choose to define IF as a genre that employs purely verbal means, the experience of playing
IF may not be all that different from the experience of playing a game that employs
(ostensibly) visual means.
Playing IF, then, could be as much a visual experience as playing a graphical video game.
However, that doesn't rule out the possibility that these two experiences could be visual
in different ways: the visuality of IF might differ from the model of visuality associated
with graphical video games. As early as 1983, Infocom took precisely this position,
arguing in an advertisement that their games “unleash[ed] the world's
most powerful graphics technology,” i.e. the human brain: “We
draw our graphics from the limitless imagery of your imagination — a technology so
powerful, it makes any picture that's ever come out of a screen look like graffiti by
comparison.” This argument, however, still adheres to the logic of transparency:
it holds that imagined visuality is more transparent than graphical visuality and
therefore better.
A more nuanced way to distinguish between readerly and graphical visuality might be to
emphasize the personal, subjective or affective aspects of the former. For Schwenger,
reading is necessarily accompanied by a continuous passive process of image generation,
but the reader's preexisting visual inclinations and his or her mental repertory of visual
images affect the way in which he or she concretizes the text's descriptions:
[L]iterature consists of a steady stream of erased
imperatives, according to Elaine Scarry, imperatives that are often
instructions to produce mental pictures. Yet no matter how detailed or precise those
instructions may be, they are never comprehensive enough to override the individual’s
memory bank of images and associations. These play upon the author’s dictated
pictures, an obbligato of the unconscious, of memory and desire. [Schwenger 1999, 4]
Even if Short's room descriptions were more
detailed than they are, they would be unable to supersede the reader's preexisting mental
pictures of analogous rooms; for example, however Short described the Sun Court temple, I
would inevitably imagine it as looking like the U.S. Capitol. (By contrast, when I visit a
similar location in a graphical video game — say, the Bevelle Temple in
Final Fantasy X — I see
only what the game designers
want me to see, and I see the same temple as every other player. The way I understand this
visual image is specific to me, but the way I visualize it is not.) What Short does do,
however, is to condition how the player sees whatever it is that he or she sees, to
suggest the affective resonances of the mental pictures that the player may form. Short's
descriptions say less about what precisely the avatar sees than about
how the avatar is affected by what is seen, as Short notes: “With
City of Secrets, though, it’s true that I was trying to
do something a little bit different [as compared to
Metamorphoses]: to
hint at the protagonist’s perceptual filters by describing styles and trends rather
than straightforward physical detail”
[
Short 2009].
For example, the description of the mosaic in the Sun Court reads, “The mosaic is an elegant job and executed in rich materials, but the design has a
facile modern quality that does not entirely appeal to you.” The temple is
described as “[b]uilt in an old style, but unworn, unchipped,
unpolluted.” Combined with the profusion of illusionistic artwork in this area of
the City, especially the façade-painting, these descriptions suggest that the Sun Court is
an insincere place. It is recognizably less ancient than it appears to be. This suggests
that the City's government, of which this space is the public architectural symbol, is
trying to pass itself off as something it's not. Inasmuch as it is conditioned by such
hints as these, the player's visualization of this space becomes affectively charged. As a
counterpoint to this, here is Short's description of a nightclub called Scheherazade:
Despite the light that leaks in through the windows, the place seems
to be trying for a dark and anonymous ambiance, with high-backed booths and wood
paneling, a ceiling painted black, and hanging swatches of brocaded purple velvet. The
decorations are mostly allusions to the City's distant shady past as an outpost of
thieves and smugglers on the Vuine.
Most of these details, again, are not
relevant to completing the game, but they assist the player in creatively visualizing the
place. The few details that Short does provide — the black ceiling, high-backed booths,
and purple velvet — hint at what gives this place a “dark and
anonymous ambiance,” but the player is invited to fill in the remaining details
in his or her own way. The decorations, involving thieves and smugglers, suggest why the
place is “trying for” such an ambiance: it is a place of
darkness, of secrecy and anonymity, a hideout for outlaws or at least for people who have
something to conceal. But at least this is a place that doesn't seek to present itself as
something it's not.
What all these descriptions do is to condition how the player visualizes the room. They
add an affective dimension to the mental picture of the room that the player involuntarily
creates for himself or herself in response to the textual representation of the room.
This effect is further complicated by Short's limited use of graphics.
City of Secrets includes a frame containing images, located to the
left of the main gameplay window. However, these images are more suggestive or symbolic
than mimetic. They suggest the dominant mood or tonality of the scene the player is
witnessing, rather than showing anything in that scene. Accordingly, Jeremy Douglass calls
the images in this game “ambient illustrations”
[
Douglass 2007, 45]. In
figure 5,
for example, we see a stylized representation of the sun against a field of orange fading
into white. This image doesn't depict anything in the Sun Court, except perhaps the sun
symbol on the pavement, but it suggests the offputting, blinding sunniness of the
scene.
[23] What we see
here is a complex, synaesthetic interplay between the images described
in the
text and the images that the text
is. The actual images help to shape the
player's mental images, at the same time that the latter inflect the player's
interpretation of the former.
This is a text that attends to the way in which text is inescapably a visual phenomenon.
In this context it's worth noting that although City of
Secrets allows the player to change the font, text color and other such options,
the title screen and the left-hand window include text which is not affected by such
changes.
Without speculating on the metaphorical associations of this font, I merely note that it
was chosen deliberately. The player enters this game through the threshold of an image
which is primarily composed of textual signifiers, yet contrary to my commonsensical
definition of text, the precise visual instantiation of these signifiers is
clearly important.
For a certain subset of the game's audience,
City of Secrets
was an even more material and visual experience than it is today. On releasing the game,
Short offered players the opportunity to purchase a special edition of the game that came
with a boxed set of “feelies.” The term
feelies refers to
“[m]ultimedia epitexts such as journals, maps, and artifacts,
bundled to illustrate the IF work. Popularized by Infocom”
[
Douglass 2007, 392]. Commercial IF games were physical
artifacts — floppy discs packaged in boxes and sold in brick-and-mortar stores — and the
inclusion of feelies further intensified the physicality of those objects. (Feelies served
the additional practical function of copy protection; games like
Sorcerer and
Leather Goddesses of Phobos were
unsolvable without information which was printed on the feelies, and which, in a pre-World
Wide Web era, would have been otherwise unavailable.) This physical side of the IF
experience was lost when IF moved to a digital model of distribution. Seeing this as an
unfortunate development, Short helped to create a website,
feelies.org, that produced and distributed feelies for
contemporary works of IF:
feelies.org started with a conversation that I had with some of my friends in
the IF community, about how the one aspect of commercial IF we really missed (as
players) was the feelies. Some modern IF comes with “virtual
feelies” — PDF files or fake Websites or whatever that are distributed in
a Zip file with the game — and I like those, but we were also missing the tangible
physical objects.
[Loguidice 2004, n.p.]
The
City of
Secrets feelies included such items as a “[t]ourist guide to
the City, including map, digitally-offset printed by Imagers.com in full color on glossy
paper” and a “[q]uantity of dried liontail in a labeled
plastic bag, contained in velvet and/or satin gift bag from boutique magic shop.”
For players who did not purchase the paper feelies, Short also created a website
for the Southern Light Rail company (this website is now defunct, but has been archived by
the Internet Archive's Wayback Machine). This website prominently features the same font used in the
game's title screen.
The fact that Short paid so much attention to the physical and material aspects of City of Secrets indicates that for her, the visual instantiation
of an IF game is not an irrelevant cosmetic detail. It directly influences the player's
experience of the game, an experience which is visual in multiple senses. The visuality of
City of Secrets results from a collaboration between the
preexisting visual memory of the player and the visual details, verbal and graphical,
supplied by the author, as focalized through the “perceptual filters”
of the avatar — who, unlike the avatars in Zork and Ad Verbum, is a well-defined character with a particular
personality and history. The visual experience of this game depends on a complex and
shifting interplay between the player's visual memory, the details the author provides via
the protagonist-avatar, and the imagetextual aspects of the game itself.
Now I suggest that such a visual experience has little to do with transparent immediacy.
A transparent visual representation, by definition, is minimally mediated; it presents the
visualized scene without distorting filters, so that it looks the way it would if it were
present before the viewer. The goal of Short's language in this game is not to create such
visual representations. In a text-based game, the only way to create such visual
transparency would be to provide a large amount of precise descriptive detail, so as to
permit the reader to imagine exactly what every aspect of the scene looks like. However,
Short argues in her blog post “The Prose Medium and IF” that such “detail for detail's
sake” is unnecessary and potentially harmful in IF, where ekphrasis is
prescriptive, not autotelic.
[24] The purpose of details in IF prose is to give the player
the information he or she needs to complete the game. Players are expected not just to
process the details but to use them as a guide for how to interact effectively with the
game's operations and its diegetic world. Providing excessive detail would be distracting
and tiresome. Short explains, however, that detail can do something else:
Some of the most effective writers of mood create their effect not with a
large number of common details (the flowers are red, the door is yellow, etc) but with
a small number of very particular ones; and I think that that is especially true in
IF. Words in interactive fiction individually carry more weight than they carry in
static prose, if only because of the amount of attention we demand the player give to
each one. […] I think I would find [P.D. James's descriptions] to be overkill in an IF
game. They’d need to be shortened and focused, because each sentence would do the work
of three or four sentences in the static prose version. In this respect IF is closer
to poetry than to conventional prose: it is worth taking more time to select fewer
words, because each one will be inspected through a jeweler’s loupe.
[Short 2008, para. 19, 20]
Short suggests here that the purpose
of details in IF is not to create a vivid, immediate and sensuously present mental picture
of a scene, but to suggest the mood associated with that scene. It does this by providing
sparse but carefully selected details, which serve the player as building blocks around
which a more complex and personal vision of the scene can be created.
[25] When Short mentions that Scheherazade has high-backed booths, a
dark ceiling, and decorations that show thieves and smugglers, she does more than simply
inform us that these things are present; she also hints at the affective resonance of this
place. She doesn't tell us what precisely this place looks like, but she provides us with
affective lenses that we can apply to our own visualization of the place. Short's goal in
this game is not to match the transparency of graphical video games, but to activate a
mode of visuality which is affectively rather than sensuously vivid. Ekphrasis has been
used for this purpose since ancient times: Quintilian wrote, for example, that lawyers
should use ekphrasis only where “motivated […] by the speaker’s
emotional engagement with and amplification of his client’s plight”
qtd. in [Koelb 2006, 29]. For ancient
rhetoricians, ekphrasis was not a transparent means of visual representation but a tool
for augmenting the emotional resonance of the described scene.
City of Secrets suggests that this effect becomes, if anything, more potent when the
described scene is an interactive one.
In Ad Verbum, the player needs to directly engage with the
verbal properties of IF in order to finish the game. City of Secrets
doesn't similarly require the player to visualize in order to complete
the game (except at the minimal level described above with reference to
Zork), but this is because City of Secrets
is a deliberately simple game. As Short writes in the game's ABOUT text,
“This game is meant to be playable even by someone who has never
encountered interactive fiction before, and be a gentle introduction to the genre. It is
not terribly difficult, nor is it possible to die until the very end.” However,
her other works do often require the player to visualize and to do so in an affective and
critical way. In Savoir Faire (2002), a deliberately
challenging adventure game, the player has the magical power to create
“links” between two similar objects, whereby one object takes on the
properties of the other or is affected by events that occur to the other. In order to use
this power effectively, the player has to observe visual (and other) similarities between
the two objects, and this may require a minute inspection of both. For
example, the first puzzle in the game is to open a locked pair of doors.
[26] The description of the doors reads, “A pair of white-painted doors that
lead into the upstairs corridor of the house. Each door panel is decorated with the
family crest, picked out in ostentatious gold, as though to warn servants not to wander
that direction uninvited.” In a nearby room the player finds a teapot, whose
object description reads, “In order to make the linkages possible,
however, it has been painted a glossy white, and the crest of the family executed on one
side in intricate detail.” The solution to the puzzle is to link the doors to the
teapot, then open the lid of the teapot, causing the doors to open. This works because the
teapot and the doors are both white, openable, and decorated with the same crest. To
notice these similarities, the player has to read the descriptions of both objects
“through a jeweler's loupe.” In doing so, the player may visualize
the two objects, but even without such visualization, the player's activity of closely
reading the descriptions is equivalent to the avatar's activity of closely examining the
objects. Solving this puzzle requires engaging in a mental operation in which
reading and looking are inextricably linked. Yet this
reading/looking process is not exclusively goal-directed. At the same time that the object
descriptions provide the player with the information necessary to solve the puzzle, they
also help the player to imagine both the visual appearance and the affective resonances of
the objects referenced. As this example suggests, affective ekphrasis can be a technique
of both puzzle-solving and worldbuilding; using Douglass's distinction, it contributes to
both the “gamelike” and the “narrative” qualities of IF.
City of Secrets combines the emotional vividness of visual
prose with the ability to interact with the visualized world through an avatar, a
combination which is perhaps unique to IF. Instead of trying to match the transparent
visuality of the graphical video game, it provides an IF-specific experience of affective
textual visuality. This is a second possible way in which IF can define itself as an
artistically viable medium and not an inferior precursor to the graphical video game.
Ad Verbum and City of Secrets
adopt two opposing strategies for demonstrating the continuing value of IF in a
post-graphical age. Ad Verbum suggests that IF needn't try to
compete with the visuality of graphical games because IF's strengths lie in its nonvisual
aspects. City of Secrets, by contrast, demonstrates that IF
can be visual in a way which may be inaccessible to graphical games. What both games
implicitly argue is that even if IF games can't (or shouldn't) compete with the visual
transparency of graphical video games, the creation of IF games can still be a viable
artistic pursuit. The coming of graphics doesn't kill IF, but it does force IF to
adapt.
To summarize, I have argued that IF is an ekphrastic medium insofar as it provides the
player with a textual translation of the avatar's direct visual experience. Unlike
traditional ekphrastic poetry and prose, however, IF is prescriptively ekphrastic in that
it asks the player to perform concrete actions in response to its textual pictures. In the
post-graphical age, prescriptive ekphrasis becomes a threatened mode of visual
representation because computer graphics seem to have a superior ability to model the
diegetic world of the game. In order to justify the continued production of IF,
contemporary IF authors have adopted at least two strategies for responding to this
threat. The point of both approaches is to argue that IF offers players experiences that
graphical video games cannot match — an argument which ekphrastic poetry often implicitly
makes with respect to painting. Where the two approaches differ is in how they
characterize these experiences which are unique to IF. One strategy, as demonstrated in
Ad Verbum, is to abandon prescriptive ekphrasis and
concentrate on the purely textual experiences that IF can offer. The other strategy, which
we find in City of Secrets, is to employ an affective rather
than a mimetic mode of ekphrasis, thereby creating emotional effects that would be
difficult to replicate with graphics.
Even the first strategy, however, is still predicated on the visual properties of the IF
genre. Despite claiming to present a world composed purely of linguistic signifiers,
Ad Verbum still structures those signifiers according to a
world model composed of rooms and objects, and such a world model, as I've argued, must be
visualizable in order to be navigable. In City of Secrets,
visualization of the world model becomes the primary appeal of the game. To differing
extents, both texts ultimately offer the player the opportunity to collaborate with the
author in imagining a world. As the product of the player's affective visualization, this
world is, at least ostensibly, more intimate and personal than the vivid, transparent
worlds of commercial video games can possibly be. If authors like Montfort and Short still
write IF, and if players like me still play it, then this testifies to the existence of a
desire for spatial and visual experiences which are more imaginary or affective than
transparent. Regardless of the vivid immediacy of the spaces that graphical video games
allow us to inhabit, we still want to inhabit spaces which, to quote the inscription on
the living room door in Zork, are intentionally left
blank.