Imagining the Continuously Present Past:
Visualizing William Faulkner’s Narratives and Digital Yoknapatawpha
Johannes Burgers
Ashoka University
johannes.burgers@ashoka.edu.in
Johannes Burgers is an assistant professor of English and Digital
Humanities at Ashoka University, New Delhi. He is also an associate
director for the
Digital Yoknapatawpha (DY)
project (http://faulkner.iath.virginia.edu/) – a collaborative, online
resource for exploring William Faulkner’s Yoknapatawpha fictions through
deep maps, audio recordings, historical
photographs, archival materials, and other visualizations. The portal
was created by an international team of Faulkner scholars and educators
with the support of technologists at the University of Virginia’s
Institute for Advanced Technology in the Humanities. He provides
critical commentaries for the site, and also designs new types of
visualizations. This digital work has served as a wellspring for more
traditional print scholarship in venues like Cultural Analytics, Mississippi
Quarterly, and a forthcoming piece in The Norton Critical
Edition of Absalom, Absalom!. More broadly,
his digital scholarship is focused on rendering conceptual maps of
fuzzy humanities data using GIS.
Github repo: https://github.com/joostburgers/
Alliance of Digital Humanities Organizations; Association for Computers and the Humanities; 000548; 015; 2; 15 June 2021; article
Figures 5 and 6 were incorrect; author sent in correct images after publication.
Hosted out of the University of Virginia and funded by the National Endowment for
the Humanities,
Digital Yoknapatawpha is an
international and collaborative project composed of William Faulkner scholars
and technologists. Its goal is to create a comprehensive database of all the
locations, characters, and events in Faulkner’s Yoknapatawpha fictions with the
aim of visualizing the data through a series of deep
atlases and other displays. This paper traces the development
cycle of a supplementary narrative structure analysis dashboard that allows
users to explore the chronology, narrative status, and date range of all of the
texts set in his mythic county. In doing so, it bridges some of the significant
gaps between narratological theory and computational methods, opens up a
conversation about representing narrative data, and suggests some possible
avenues for research with the dashboard.
This article discusses the Digital Yoknapatawpha project held at the University
of Virginia.
Introduction
A story, simply put, is a sequence of events. This definition has been in place
at least since Aristotle’s
Poetics. What are
much less simple to define are the constituent terms: sequence and events.
Accordingly, schools of narrative theory have conceptualized events and their
connection to sequentiality from radically divergent vantage points, ranging
from the varied shapes of the sequence and their value at a semantic
level to events’ salience to the overall plot, their relationship
with narrative point of view, and the interpretation of events by the reader,
just to name a few. Of these, the classic work is Gérard Genette’s
Discours du récit, which has been
productively mined by literary scholars and computer scientists for his
insights into narrative. For an overview of a semantic analysis see: . Blair Labatt uses event salience for a
narratological analysis specifically based on Faulkner’s work. Joseph Reed’s work is an early attempt at using
narrative theory to analyze Faulkner. It is somewhat idiosyncratic in its
outlook, though. James Phelan and Peter J.
Rabinowitz’s critical introduction to narrative theory provides an excellent
overview of rhetorical, feminist, mind-oriented, and antimimetic approaches
with regard to time, plot, and progression.
Other definitions include that of Leitch, who sees plot as a function of a
teleological principle. Indeed, as more
recent work on narrative shows, what narratives are and how they can be
decomposed is anything but a settled matter.
Despite these definitional challenges, researchers have been productively using
narratological concepts to analyze texts since the early days of computing. In
his exhaustive and insightful overview of the state of the field,
Computational Modelling of Narrative, Inderjeet Mani
covers the various approaches taken by computer scientists to decompose and
generate narratives, while introducing an innovative mark-up language of his
own: NarrativeML. With applications in language
domains far outside literary studies and poised to disrupt everything from
economics to medicine to forensics, work within the field of computational
narratology has continued at breakneck speed. One recent work even claims that
computational narrative generation has arrived at the moment of post-narratology.
Despite this substantial scholarly footprint and exciting advances, “[t]he flow of influences has historically been from
narratology to computation,” as Mani already lamented in 2013. He lists narratologists’ concern with discourse over
fabula, and the relatively rudimentary quality of narrative generation, as
causes of this one-way traffic. In addition to these, two large deterrents to
the more widespread adoption of methods in computational narratology are the
inter-related issues of scale and scalability. The first denotes the semantic
level at which events are parsed. To be insightful for narratology, texts need
to be broken down into meaningful narrative units that are more capacious than a
predicate and more precise than a summary. Relatedly, the process needs to be
scalable to a wide range of texts to facilitate insightful comparison. Up
until now, the challenging trade-off has been that solutions at a scale
appropriate for narratological analysis are not scalable because they require
laborious human intervention, and unsupervised solutions are scalable but not at
the right scale for narratological analysis. This is perhaps why projects that
try to innovate narratology through computational methods are often mired in the
proof-of-concept phase. Disappointingly, this has meant that the potential of
all this groundbreaking work remains untapped by the larger community of
literary scholars.
One single-author study that has managed to scale by dint of years of laborious
coding, consistent revision, and substantial institutional support is the
Digital Yoknapatawpha project (hereafter DY).
Hosted out of the University of Virginia, DY was created through
the hard work of over thirty Faulkner scholars putting in thousands of hours to
code all of the locations, characters, and events in his Yoknapatawpha fictions
into a relational database. Nearly a decade in the making, the site enables
students, teachers, and scholars to explore fourteen novels and fifty-six short
stories through a series of deep atlases based on maps
Faulkner drew in 1936 and 1945 (http://faulkner.iath.virginia.edu/). The main interface is
supplemented by a wealth of materials including: manuscripts, archival audio,
historical photographs, textual commentaries, and other data visualizations.
Though the project will continue to grow, DY is now robust enough
to use as a scholarly tool, and several members of the team have already
leveraged it to highlight new aspects of Faulkner’s writing.
The large tranche of highly-curated narrative data made available through
DY offers an opportunity for computational narratologists to
generate hitherto unimagined visualizations of narrative. This paper has far
more modest ambitions: it contends that the most productive and
intuitive approach to visualizing the shape of narrative is one that displays
chronological order versus story order. This approach was actually devised by
the Russian Formalists at the beginning of the twentieth century, and has been
returned to time and again by digital humanists. By drawing on visualizations
created for Digital Yoknapatawpha (http://faulkner.iath.virginia.edu/narrativeanalysis.html) and other
adjacent digital work, I demonstrate the comparative power of such charts and
the scalability of the method across different types of fictional texts. As
such, the goal of this paper is to open up a larger conversation between digital
humanists and narrative theorists, and consider the best practices for
translating narratological concepts into encoded narrative data that can be
productively visualized. A principal part of this conversation is the necessity
of establishing a meaningful and shared visual language that clearly represents
fundamental narrative concepts to a broader audience. Much of this language has
already been in place for some time, but it has been scattered across different
knowledge domains.
Bridging Humanities and Computational Narratology
Historically, there have been two important observations about narrative. The
first, by Aristotle, is that a narrative has a beginning, middle, and an end,
and that these are usually connected through causality. The second, by the
Russian Formalists, is the separation of the fabula (story
material) from the syuzhet (the way the author shapes the
material), an insight mirrored concurrently by E.M. Forster, who more strictly
defines the separation as one between story and plot. As self-evident as these two distinctions are,
operationalizing them into a hermeneutic for textual analysis is fraught with
problems. As Gerald Prince aptly points out, “Narrative
sequences…are semantic and not semiotic in nature.” That is to say, narrative sequences are not easily
broken down into constituent parts based on linguistic and logical properties.
The scope and length of individual narrative events are influenced by a whole
host of factors, including rhetorical devices, extra-textual and inter-textual
connections, figures, tropes, irony, narrative frequency, narrative speed,
narrative authority, and narrative reliability, to name a few. In this, it is hard to ignore the lessons of
deconstruction and post-structuralist narratology. Any attempt to grab a single narrative thread tugs at the entire warp and
woof of a text’s intertextuality. Delimiting a text into discrete units always
imposes an artificial structure from without. This is to say nothing of the
problems that arise when decomposing narrative if the experience of the reader
is considered. After all, from a functionalist approach, narrative needs a
reader to (re)-constitute it, and, therefore, narrative sequence is not a type
of deep structure, but rather something that exists
in discourse. John Pier observes that for functionalists, sequence is assimilated into the broader question of intersequentiality and
the dynamic relations occurring between the telling/reading and the
told. Which is to say, the sequencing of events in a text
does not exist external to a reader. Thus, while it might be self-evident that a
narrative sequence is a series of events, what constitutes those events and how
they are constituted is remarkably difficult to pin down.
Translating narratological insights into computational methods poses a
substantially different set of challenges. A major hurdle is computationally
reproducing the cognitive faculties that allow human readers to understand texts
with remarkable sophistication and accuracy. To decompose a narrative at any
level of competency, a computer has to perform a series of inter-related and
complex tasks including: natural language processing, spatial organization [Tenen 2018], narrative parsing,
and understanding character entities, to name but a few. Moreover, much of the work that
has been done on the computational understanding of narrative falls outside the
ambit of literary studies, and has focused largely on corpora within specific
and, often, bounded knowledge domains, including, among others, economics, law enforcement, medicine,
education,
legal studies [Mahfouz et al. 2018], and reconstructing news narratives. This research is valuable for the insights it
offers, but is not directly applicable to literary studies, because so much of
literary production traverses multiple knowledge domains and deliberately
subverts anticipated text structures. More simply, two fictional texts are
likely to differ more in language, style, and format than two
medical reports, legal briefs, or even news reports. This dissimilarity makes
it challenging for an unsupervised computational approach to establish and
detect patterns that can be iterated across a large corpus with a high degree of
consistency.
Still, there have been significant attempts to generate a functional model for
doing narrative analyses using digital methods. For example,
PlotVis was a tool developed by a team at the University of
British Columbia. It visualized narratives encoded in Extensible Mark-up
Language (XML) and could be “customized by the teachers and students in order to
accommodate various interpretations of a single piece of fiction.” Operational from 2013 to 2016, heureCLÉA
used human annotation and machine learning techniques to produce a corpus of 21
annotated short stories; it was supplemented by the textual annotation and
analysis tool CATMA. Using the data from heureCLÉA, another team designed
Narrelations, an application that visualizes multiple levels of
narrative. Meanwhile, Mark Finlayson created the
ProppLearner corpus by annotating fifteen folk tales. Once
annotated, the training data was used to analyze the morphology of different
folktales using Propp’s method. Yet, even here a significant amount of human
intervention was required to make sure that events were parsed properly. Indeed, as exciting as the results of the study
were, the data collection process was necessarily labor-intensive and costly.
Finlayson’s more recent related work is
extremely promising in its potential to parse sub-events from narratives.
Other recent investigations of narrative have relied more on natural language
processing or some form of machine learning to aid with the parsing of text.
Among these is the aptly named Syuzhet package, developed
by Matthew Jockers and available on CRAN (Comprehensive R Archive Network). The
package allows users to draw on four different sentiment dictionaries to score
the overall sentiment of a text or to chart how positive
and negative sentiments are activated across the text. The advantage of this approach is that text
analysis can be automated. While Jockers’s work has sparked a lot of interest,
it has also received some criticism. One issue is
that his use of the word plot refers not to the
chronological order of events, but rather to how different sentiment structures
are articulated across a text. An alternative approach taken by Koichi Takeuchi
uses a predicate-argument structure thesaurus to determine narrative states,
actions, and change-in-states. This allows for multi-dimensional relations between predicates with their arguments
containing relations between the change-of-state and its goal. Though Takeuchi’s study is geared towards
automatic narrative generation, it can also be used to decompose a narrative.
Being able to detect a change-in-state in a predicate is useful for
understanding the syntactic building blocks of a narrative, but is too granular
for addressing narrative changes in a text for narratological analysis. Another
highly promising system under development is Yarn, which uses
Hierarchical Task Network (HTN) planning to generate visualizations of possible
storylines. Still, here too the plot composition is
presently too crude to be able to meaningfully differentiate between narratives
for the purposes of narratology.
Finally, there has been a spate of projects that use film as a basis of narrative
analysis. As Eric Hoyt et al. explain, “[o]f all narrative
forms, the motion picture screenplay may be the most perfectly pre-disposed
for computational analysis.” This is because the text is already semi-structured
with characters identified by dialogue and narrative units separated into
scenes. Since the process of narrative analysis of scripts lends itself to
automation, a number of analogous projects have emerged independent of one
another that generate a visualization of narrative progression over time. These
include work done by Sharma and Rajamanickam,
ScripThreads, and Story Curves. The underlying assumption with all of these projects
is that the script scene sequence is consonant with the plot event sequence.
From a strict narratological perspective this need not always be the case.
Events can also be conveyed to the audience through diegetic and extradiegetic
elements, as, for example, when a character reveals his or her backstory through
dialogue with another character, or when there is a voice-over that relates past
or future events. Presumably, these events occur at a different point in the
chronology than the scene in which they are related. Accounting for this discrepancy in any
automated parsing method adds a substantial level of complication. This by
no means invalidates such projects; it merely underscores that the question of
the appropriate scale of a narrative event is still unsettled.
In sum, the twinned foundational challenges facing a more widespread adoption of
computational narrative analysis are those of scale and scalability. The
projects that rely on manual encoding are generally fine-grained enough to
provide consistent narratological insight, but the labor involved does not scale
well. Meanwhile, automated methods tend to be too coarse-grained to allow for
meaningful comparison between texts, or, conversely, they provide analysis at
the level of the predicate, which is not functional for narrative analysis. As
Prince points out, narratologists agree that narrative
sequences represent linked series of situations and events and further
agree that these sequences can be expanded or summarized, that they can be
combined with other sequences in specifiable ways such as conjunction,
embedding, or alternation, and that they can be extracted from larger
sequences. This type of narrative modularity can only be
achieved if event units contain more than predicate-level information, and
account for a unified ontology of space, time, and character within an event,
while at the same time being more specific than a summary of the text. While
there is no precise determination as to what this scale might be, the method
developed for DY provides a flexible and reproducible
framework.
Punctuating the Long Sentence: Event-Driven Narrative Encoding in
Faulkner
Faulkner Studies has always had a special relationship with narratology. Because
Faulkner’s texts are so narratively intricate, one consistent topic of
exploration has been dis-entangling his plot lines and understanding his use of
time, so much so that it is somewhat of a cottage industry. With the advent of digital technologies this
exploration has continued unabated. John Padgett’s
William
Faulkner on the Web still provides a rich resource for Faulkner
novices and experts alike [Padgett]. The Sound and the
Fury: a Hypertext Edition by Stoicheff, Muri, Deshaye, et al. was a
highly innovative project that tackled the problem of visualizing one of
Faulkner’s most challenging narratives as early as 2003. Before DY, the most sophisticated visualization of a Faulkner
narrative was actually the 2003 Adobe Flash-based chronology of Absalom, Absalom! created by current DY director,
Stephen Railton. Unfortunately, neither
digital text currently functions. Due to copyright issues the full hypertext
created by Stoicheff, Muri, Deshaye, et al. is no longer available, though
it is still a good resource for digital renderings of the text. In fact, a
graph created by Kathleen Murphy of narrative time in the Benjy section of
The Sound and the Fury shares many similarities
with the chronology graphs this paper presents.
Likewise, the chronology of Absalom, Absalom!
created by Railton and Rourke no longer functions because it was coded in
Flash. Currently, attempts are being made to revive the chronology in a new
format.
In its sheer openness and scale, DY supersedes these early
Faulknerian projects, while simultaneously being heavily indebted to them. The
data currently available represents nearly five thousand character records, over
two thousand locations, and more than eight thousand events, each with their own
individual attributes. Aggregated, these data tables represent around a
quarter-million data points across several dozen different data fields. As
of this writing the exact numbers are 2,152 locations, 4,988 characters, and
8,435 events, though these are always subject to change. This data
drives the main interface, but can also be used to create alternative
visualizations. For example, Raphael Alvarado designed a platform to generate
force-directed graphs that show the co-occurrence of characters and locations.
Included in this data set are also a number of variables that allow for the study
of Faulkner’s narrative: chronology, narrative status, and event dates. In order
to arrive at these more abstracted data points, all of Faulkner’s Yoknapatawpha
fictions had to be entered into a relational database containing the entities:
Text, Locations, Characters, and Events. A Text contains the individual
Locations and Characters for that
text, and Events are the combination of a character or
characters at a location for a unified action. The database also contains a
location key that keeps track of all the locations being used across the
corpus, but this does not have any bearing on the research here. An
overview is provided in the entity relationship diagram below (see Figure 1). While entering all the locations and characters has been by no means
uncontentious, encoding events has presented some of the most difficult
theoretical and practical challenges. Since a database can only store discrete
information, delimiting events, which tend to be non-discrete, necessarily
requires interpretation because the boundaries that separate beginnings and
endings are unclear and subjective. In an ideal
scenario, events are entered at the same level of granularity with the same
consistency across the corpus. Impinging on this ideal are practical
considerations. Digital projects that require manual encoding quickly run into
limitations like data collection scope, the labor available, and the possibility
of introducing human error. The slightest change in the definition of event
boundaries can multiply project completion time. An example is
instructive here. Faulkner’s Red Leaves is a short
story about the ritualistic burial of Issetibbeha, which requires a
hunt for his body-servant, who is to be buried next to
him. In the following passage, the African American servant is returning to the
burial site and along the way he comes across one of the Chickasaw:
[A] (1) In the middle of the afternoon he came face to
face with an Indian. (2) They were both on a footlog across a slough —
the Negro gaunt, lean, hard, tireless and desperate; the Indian thick,
soft-looking, the apparent embodiment of the ultimate and the supreme
reluctance and inertia. (3) The Indian made no move, no sound; he stood
on the log and (4) watched the Negro plunge into the slough and swim
ashore and crash away into the undergrowth. [B] (5) Just before
sunset he lay behind a down log.
There are many ways to divide up the events in this text. The
first is to say that the entire paragraph and the following sentence constitute
an event because it starts in the middle of the afternoon and goes till sunset.
This is a period of several hours, and should therefore be considered one
extended event: the servant running away. A second option, [AB], uses the
paragraph division to split the passage in two and considers [A], the meeting
with the Indian, and [B], lying behind a log, as two
different moments. The last option would be to divide every individual action
and description into an event (1,2,3,4,5). This approach appears ideal from a
data perspective, because it gives the greatest amount of detail. The drawback
is that entering five different records is much more laborious than the first
approach. Since each event record requires the entry of thirteen variables, the
difference is between entering thirteen or sixty-five data points. Scaled to the
level of the text, a story that might take thirty hours to encode suddenly takes
a hundred and fifty hours. As each story in the DY database is also
peer-reviewed and subsequently curated for additional data entry, changing the
scale also increases the labor required downstream in the production cycle.
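The arithmetic in this example can be restated as a short sketch. The thirteen variables per event and the thirty-hour baseline come from the passage above; the function names are hypothetical, and the assumption that encoding time grows in direct proportion to the number of event records is a simplification made only for illustration.

```python
# Illustrative only: function names are hypothetical, and linear scaling of
# labor with the number of event records is an assumption.
VARIABLES_PER_EVENT = 13  # figure given in the text


def data_points(event_count: int) -> int:
    """Number of values an encoder must enter for a given number of events."""
    return event_count * VARIABLES_PER_EVENT


def scaled_hours(baseline_hours: float, baseline_events: int, new_events: int) -> float:
    """Estimated encoding time if labor grows in proportion to event count."""
    return baseline_hours * new_events / baseline_events


# The passage read as one extended event versus five separate events.
coarse = data_points(1)
fine = data_points(5)
hours = scaled_hours(30, baseline_events=1, new_events=5)

print(coarse, fine, hours)  # 13 65 150.0
```

The same multiplier would apply again at each downstream stage, such as peer review and curation, which is why the choice of scale matters well beyond initial data entry.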
For DY, the scale of an event is defined as follows: “1 setting, 1
unbroken length of time, 1 main focus and 1 narrative style is 1 continuous
Event.” This definition has remained remarkably durable
throughout the production process. The narrative is encapsulated in units that
are intelligible as self-sufficient ontologies. An event can therefore be
de-contextualized from the larger narrative, and still make sense as an action
performed by a character or characters at a particular place. Needless to say,
applying this definition consistently throughout the encoding process does not
happen without debate. Usually, the bone of contention is whether an event
remains one continuous action when a key character enters or exits a location,
and how to break up events where characters are travelling for an indeterminate
amount of time and space. Notably, DY’s definition of an event uses
page number as a proxy for discourse time (the amount of time it takes to read
the passage) and does not measure story time (the duration of the event in the
text). An event may span several pages and describe
an action that happens in an instant, or, conversely, may only be one sentence
and take years. Thus, it is possible to show the order of events, but not the
narrative pacing that dictates that order.
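A minimal sketch may help make this distinction concrete. It assumes a drastically reduced event record with only four fields; the class, the field names, and the toy events are all hypothetical and do not reproduce the DY schema or any Faulkner text. Page number stands in for position in the telling, and an editor-assigned rank stands in for position in the chronology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event record (not the DY schema)."""
    label: str        # short description of the action
    setting: str      # one setting per event
    first_page: int   # page number, a proxy for position in the telling
    chron_rank: int   # editor-assigned place in the chronology (1 = earliest)


def order_pairs(events):
    """Pair each event's story order (by page) with its chronological rank."""
    by_page = sorted(events, key=lambda e: e.first_page)
    return [(story_rank, e.chron_rank) for story_rank, e in enumerate(by_page, start=1)]


def told_in_order(events):
    """True when the telling never departs from the chronology."""
    return all(story == chron for story, chron in order_pairs(events))


toy_events = [
    Event("opening scene", "town square", first_page=3, chron_rank=3),
    Event("first flashback", "farm", first_page=5, chron_rank=1),
    Event("second flashback", "farm", first_page=7, chron_rank=2),
    Event("aftermath", "town square", first_page=9, chron_rank=4),
]

pairs = order_pairs(toy_events)
print(pairs)                      # [(1, 3), (2, 1), (3, 2), (4, 4)]
print(told_in_order(toy_events))  # False
```

Each pair could serve as a point in a chart of story order against chronological order. Nothing in the record says how long an event lasts, so pacing remains invisible in such a chart.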
Along with delimiting events, another challenge is ordering them. For the
purposes of the main visualization, the events need to be placed in
chronological order. There are few literary texts that adhere to a strict
chronological order, and Faulkner’s are certainly not among these. This makes
for great reading, but challenging encoding. Fortunately, while many of
Faulkner’s events are narrated out of order, they do follow an underlying total
order.
This is not always the case, though. There are occasionally
orphan events that cannot be unambiguously slotted into
a chronology. For example, in their meticulous chronological study of
The Sound and the Fury, George Stewart and Joseph
Backus document three, relatively minor, orphan events. In their opinion, the matter is too minute to warrant further discussion. Though this may certainly be true of their
research, it presents a significant database entry problem for DY.
These events must be entered somewhere so they can be sequenced in the
animation, and whatever position they are given changes the total order of the
chronology. This approach precludes the possibility of simultaneity,
something that other storyline visualizations are able to represent.
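The effect of an orphan event on a total order can be illustrated with a small sketch. The event names and the before/after constraints below are invented for the purpose; the brute-force enumeration is only an illustration of the problem, not a description of how the DY chronologies were produced.

```python
from itertools import permutations


def consistent_orders(events, before):
    """Yield every total order that respects each (earlier, later) pair."""
    for candidate in permutations(events):
        position = {name: i for i, name in enumerate(candidate)}
        if all(position[a] < position[b] for a, b in before):
            yield candidate


# Hypothetical events: A, B, and C are firmly ordered by the text,
# while the orphan event carries no ordering constraints at all.
events = ["A", "B", "C", "orphan"]
before = [("A", "B"), ("B", "C")]

orders = list(consistent_orders(events, before))
for order in orders:
    print(order)

print(len(orders))  # 4
```

All four orders are equally defensible on the stated constraints, yet a database that stores a single sequence position per event has to commit to exactly one of them.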
On occasion, computation was used to provide a rough sort of events, but, by and
large, the final chronologies for each text were ultimately the result of
textual scholarship. It should be noted that sorting events through computation
is possible, however, as Burg et al. have shown in their work using constraint
logic programming to infer the most probable order of events for
A Rose for Emily. Their findings are intriguing as a proof of
concept for constraint logic programming, though they readily admit that
perhaps to understand Emily, we must give up our
orderly sorting of experience. That is to say, that the
fuzzy dating of the text may not be a puzzle to be
solved but a confusion to be experienced. After all, it seems unlikely that
Faulkner hid within his story an obscure dating and chronological system
that he wanted to be discovered through an advanced programming language
seventy years later. Unfortunately, computational sorting cannot
account for any authorial irregularities in a plot. As Faulkner likely did not
anticipate that his fiction would be manually encoded by a group of scholars for
the purposes of visualization, he did not iron out plot timeline
inconsistencies. Perhaps the most famous example of this is his work on The Mansion. For the final drafts of the novel,
Faulkner was extremely reluctant to correct timeline discrepancies with his
previous novels and adjust misalignments with historical events. Any system that relies on causality and strict date
ordering would run into trouble sorting this text.
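Why strict ordering breaks down on an inconsistent timeline can be sketched with a topological sort. This is a deliberately simpler stand-in for the constraint logic programming that Burg et al. describe, and the events and constraints are abstract placeholders rather than actual discrepancies from The Mansion.

```python
from graphlib import CycleError, TopologicalSorter


def strict_order(must_follow):
    """Return one total order, or None if the constraints contradict each other.

    `must_follow` maps each event to the set of events that precede it.
    """
    try:
        return list(TopologicalSorter(must_follow).static_order())
    except CycleError:
        return None


# Hypothetical constraints: A precedes B, and B precedes C.
consistent = {"B": {"A"}, "C": {"B"}}

# The same constraints plus a contradictory one: C precedes A.
contradictory = {"B": {"A"}, "C": {"B"}, "A": {"C"}}

print(strict_order(consistent))     # ['A', 'B', 'C']
print(strict_order(contradictory))  # None
```

A sorter of this kind can only report that no order exists. Deciding which of the conflicting statements to privilege remains a matter of textual scholarship.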
Stepping back, it is clear that the
DY data is the
result of interpretation, collaboration, and compromise. Nevertheless, part of
the success of the encoding process has been the intuitive ease of data entry
achieved by framing the data within fundamental narratological concepts.
Annotators with various levels of technical expertise could be onboarded and
taught to encode by working on a short story that was then checked for
consistency. On average, the event coding for each text went through seven passes: the
initial pass of hand-coding the events in a paper copy of the text,
transcription to a spreadsheet, sorting the events by chronology in a
spreadsheet, entry into the database, peer review by a fellow database editor,
another review by the director or one of the associate directors, and a
final review for inconsistencies before being brought online. This very
intensive vetting process reduces the possibility of human error and
inconsistency, but, of course, cannot guarantee its elimination. Importantly,
one of the valuable lessons about the encoding process is that the data should
only go live once all errors have been removed. Once an erroneous entry makes
its way into the database it is very hard to discover and revise. Furthermore,
as all the encoders were Faulkner scholars, there was a continuous temptation by
editors to introduce ever more narrative features to capture the richness of
Faulkner’s writing. Needless to say, this would have meant continuously recoding
eight thousand different entries representing six thousand pages of narrative
text, significantly adding to the prospective workload. Such efforts were thus
held in abeyance until the completion of the initial encoding of all the texts.
Currently, the DY team is adding a further level of nuance
by labelling each event with keywords. This process has greatly
profited from the fact that the event structure is already in place.
Taking the narrative data and turning it into visualizations presents its own
suite of challenges. There are no established conventions for representing
digitally encoded narratives. Without standard practices, it is hard to
interpret and compare different narrative data visualizations. In this regard,
Faulkner’s texts are particularly tricky to visualize because he experimented
with narrative structures throughout his career. Scholars have described his
complex narrative techniques as everything from the frozen
moment to enclosure of past, present, and future. Reflecting on his writing, Faulkner once famously
conceived of the past and present as one long sentence: “There is no such thing really as was because the past
is. It is a part of every man, every woman, and every moment. All of his
and her ancestry, background, is all a part of himself and herself at
any moment. And so a man, a character in a story at any moment of action
is not just himself as he is then, he is all that made him, and the long
sentence is an attempt to get his past and possibly his future into the
instant in which he does something.” Faulkner’s theory of the long sentence and the continuously present past
represents a fundamental complication for visualization. A faithful rendering of
his vision would have to project the narrative past and the narrative present on
top of one another, which would, at best, lead to an amorphous blob. On the
other hand, showing events as a discreet sequence does not do justice to the way
Faulkner weaves the past through the present in his work.
A related challenge is that Faulkner’s idea of the past being eternally present
led him to experiment with time throughout his career; arguably each of the
novels is a reworking of the same theory in a different form. Ideally, a
visualization that captures Faulkner’s use of time has to be consistent across
all his works to provide a basis for comparison, but also flexible enough to
capture the idiosyncrasies of each text. For instance,
The Sound and the Fury and Absalom,
Absalom! are both about the past, but “[o]ne is
a drama about knowing events, the other a drama of events defused and
disconnected.” These differences in the way time is constructed
are compounded when expressed across the entire corpus. Any visualization of his
narrative, including the ones provided by the narrative structure analysis
dashboard, can only provide a provisional and limited insight into the
texts.
One final hurdle to visualizing narrative in Faulkner is that the unique
narrative problems his texts present may not necessarily be useful for creating
a shared visual language with other, non-Faulkner texts. This is an obstacle
that narrative theory is particularly adept at surmounting. After all, there are
certain generalizable textual aspects based on narrative theory that can
elucidate cross-author comparison.
Matchless Times: Visualizing Faulkner’s Plots
At first blush, it would appear that there is no accepted standard for
visualizing narrative data. This is only because the common means of doing this
have been scattered across different knowledge domains. In fact, a number of
different projects have proposed analogous models, even if they were not
necessarily aware of one another. The most consistently used model relies on
graphing the chronological order in relation to story time. This technique was
already anticipated by Vladimir Propp, who hints at versions of the model in
the appendix of
Morphology of the Folktale (1928).
Earlier work by Faulkner scholars has also
adopted the same technique of contrasting plot with story. The storyline models created by Randall Munroe on
xkcd.com have inspired much productive subsequent research, and are similar to Propp’s initial insights,
even if he is not mentioned. Thus, when Kim et al. claim that their project
Story Curve is “the first scientific investigation and
systematic exploration of this visualization technique,” the claim is
perhaps somewhat overstated. The deep parallels between
current work and that of Propp nearly a century earlier should not be seen as
evidence of stagnation within the field, but rather as a testament to the
intuitive power of this type of visualization design.
Respecting this observation, I used the plotly.js graphing library and vanilla JS
to design a data dashboard that would allow users ranging from first-time
readers of Faulkner to seasoned scholars to compare his texts with one another
on the basis of plot shape. As
DY caters to the
broadest possible audience, users have different levels of technical
proficiency, and the usability of the charts needed to be immediate and
intuitive. The design goal was therefore to create something that required very
little input from the user, and generated insights that would be familiar and
meaningful to literary scholars. The resulting interface allows users to compare
up to four charts, and toggle between chronological order and date range. The
charts also have some functionality that is native to plotly.js that enables a
more scoped view of the data, and the ability to download the chart. Along with
the main narrative structure chart, there are also charts showing the percentage
of different forms of narration; the ratio of flashbacks, flashforwards, and
linear event sequences; and a frequency diagram of events across the dates of
the story. The various tools all work together to allow users to compare the
narrative structure of fourteen novels and fifty-six short stories.
Understanding Chronology through Plot Shape
The step charts below represent the progression of the story relative to the
order in which it is told. They are meant to showcase the plot structure of
a text in a single, easy-to-compare view. Each line segment represents one
event. Events that comprised less than a page come across as a dot, while
events that amounted to multiple pages are line segments, though for novels
most line segments appear as dots due to scaling. The length of events only
captures their discourse time, and does not correspond to their significance
or textual duration. These segments are plotted by page number on the x-axis
and chronological order on the y-axis. (The reason for choosing page
number, which is dependent on edition, over word count, which is
independent of layout and more precise, was a practical one: only in
some cases was a clean digital version available. In the future,
this data will have to be linked to word count as well as
page number.) When events happen earlier, they are lower down.
Conversely, events that happen later in the chronology are higher up on the
chart. If the story and the plot coincide, and events are told in the order
that they happen, the slope is forty-five degrees. This is rarely the case
for Faulkner, or any author. Instead, there are usually troughs or peaks in
the line. These are indications of analepsis (flashback) or prolepsis
(flashforward), respectively. (Nonlinear sequencing is not always a sign
of prolepsis or analepsis; it can also be a sign of event parallelism. Since
the events have to be entered in rank order, events that happen at the
same time are forced into a linear sequence.) Borrowing from an
example by David Herman, if story and plot coincide, the order of events
plotted out as a line segment is ABC (see Figure 2). If Faulkner tells the
story in a different order and uses a flashback, the order is BAC. Visually,
line segment A will come sequentially after B on the x-axis, but appear
lower down on the y-axis (see Figure 3). In that same vein, if there were a
flashforward the story would be told ACB and segment C would precede B on
the x-axis, but be higher up on the y-axis (see Figure 4).
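The ABC, BAC, and ACB cases can be checked with a short sketch. It is only an illustration and not the dashboard's code: the function names are hypothetical, and the rule it applies (an event whose rank is lower or higher than both of its neighbours in the telling) is a simplifying assumption that ignores the first and last events and, as noted above, cannot distinguish true flashbacks from parallel events forced into rank order.

```python
# Story (chronological) rank of each event in the ABC example.
STORY_RANK = {"A": 1, "B": 2, "C": 3}

def ranks_in_telling_order(told_order):
    """Map the order of telling (x-axis) to chronological ranks (y-axis)."""
    return [STORY_RANK[label] for label in told_order]

def troughs_and_peaks(ranks):
    """Find interior events that sit below (trough) or above (peak) both neighbours."""
    found = []
    for i in range(1, len(ranks) - 1):
        before, here, after = ranks[i - 1], ranks[i], ranks[i + 1]
        if here < before and here < after:
            found.append((i, "trough: possible analepsis"))
        elif here > before and here > after:
            found.append((i, "peak: possible prolepsis"))
    return found

for told in ("ABC", "BAC", "ACB"):
    print(told, troughs_and_peaks(ranks_in_telling_order(told)))
```

In this toy version the linear telling yields no troughs or peaks, the BAC telling flags segment A as a trough, and the ACB telling flags segment C as a peak, which mirrors Figures 2 through 4.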
Along with chronological data, DY also identifies the narrative
status of an event. Narrative status answers the question: “Who/what is responsible as the source of this
Event?” This concept does not map neatly onto any
language available in narrative theory, and using any adjacent definition
likely leads to more confusion than clarity. Be that as it may, each event
is identified as having one of five narrative statuses: 1)
Narrated by a first or third person narrator; 2)
Told by a character in the story; 3)
Remembered through a character’s consciousness; 4)
Hypothesized when something might have happened; 5)
Narrated+consciousness when Faulkner combines
narration with stream of consciousness. The last case was created
specifically for
Light in August, though
examples appear sporadically throughout other texts. Using these categories,
the chronological data can be subsetted according to narrative status.
Encoding narrative status for a text is not a necessary component for
creating storyline charts, and the applicability of the narrative categories
beyond Faulkner remains to be seen. It would be challenging, for example, to
encode James Joyce’s Ulysses purely with these
types of narrative status classifiers. Nonetheless, the resulting
visualization shows events as both a change in location or character, and as
a change in narrative status (Figure 5).
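Subsetting by narrative status can be sketched as a simple grouping operation. The snippet below is an illustration under stated assumptions rather than a description of the DY database: the event dictionaries, the `status` key, and the choice to compute percentages by event count (rather than, say, by pages) are all hypothetical.

```python
from collections import Counter, defaultdict

# The five narrative statuses named in the text.
STATUSES = (
    "Narrated",
    "Told",
    "Remembered",
    "Hypothesized",
    "Narrated+consciousness",
)

def subset_by_status(events):
    """Group events by narrative status so each group can be drawn as its own trace."""
    groups = defaultdict(list)
    for event in events:
        groups[event["status"]].append(event)
    return dict(groups)

def status_shares(events):
    """Percentage of events per status (assumption: counted per event, not per page)."""
    counts = Counter(event["status"] for event in events)
    total = sum(counts.values())
    return {s: 100 * counts[s] / total for s in STATUSES if counts[s]}

# Hypothetical sample events; ranks and pages are placeholders.
sample = [
    {"rank": 2, "page": 1, "status": "Remembered"},
    {"rank": 1, "page": 2, "status": "Remembered"},
    {"rank": 3, "page": 3, "status": "Narrated"},
    {"rank": 4, "page": 4, "status": "Told"},
]
groups = subset_by_status(sample)
shares = status_shares(sample)
print(shares)
```

Keeping each status in its own group is what would allow a legend entry to be toggled independently of the others.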
The above visualization compresses a lot of complex information. On the
y-axis is the rank order of chronological events, and on the x-axis are the
page and event number. The plotly.js library allows users several ways to
drill down and manipulate the data. The legend is interactive and users can
hide and show the various traces. Users can also hover over points to reveal
specific information like rank, page and event number accompanied by the
first 6-8 words of the event.
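As a rough illustration of how such a hover label might be assembled, the sketch below truncates an event description to its first few words. The function name, the label layout, and the eight-word cutoff are assumptions made for the example; the dashboard's actual label format may differ.

```python
def hover_label(rank, page, event_number, description, max_words=8):
    """Build a one-line label from an event's rank, page, number, and opening words."""
    words = description.split()
    snippet = " ".join(words[:max_words])
    if len(words) > max_words:
        snippet += "..."  # signal that the description was cut short
    return f"Rank: {rank} | Page: {page} | Event: {event_number} | {snippet}"

# Placeholder description, not a DY event entry.
label = hover_label(3, 12, 40, "one two three four five six seven eight nine ten")
print(label)
```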
In this particular view of
The Sound and the
Fury, several salient features of the text are visible. First,
Benjy’s chapter is the first in the novel, but is actually meant to be third
sequentially. This is clear visually because the first series of yellow dots
up until page 74 can be slotted into the space on the y-axis between 264 and
265. This is not particularly revelatory since the chapter titles are dated,
but it does confirm that the sequence is in the right order. Second, many of
the details in the narrative present are actually quite regular in their
sequence; it is the past that appears confused and is frequently nonlinear.
Upon closer inspection, even here there are patterns in the sequence. In
terms of narrative status, it is interesting to note just how much of
The Sound and the Fury is told through memory.
The first two chapters are recounted to a large extent through the
memories of Benjy and Quentin. Meanwhile, in the last two chapters the
event sequence is far more linear, and the narrative status is more
consistently narrated than in the previous two sections. The novel appears to
move from chaos and disorder to order and stability, and it is perhaps all
the more poetic that the novel ends with “each in its
ordered place.”
Interpreting Temporal Positioning through Date Range
Along with chronology and narrative status, the date range during which
events occur is another insightful way to visualize these stories. What is
particularly revealing about the date information is how Faulkner structured
the past in his narratives. At times, this is very precisely defined, as
with the date titles of the chapters in
The Sound and
the Fury. In other instances, time is very vaguely indicated.
The DY team enters an exact date whenever possible, but resorts
to a date range when there is insufficient information for a specific date.
Sometimes the range is a week, a month, or even a couple of years. In order
to identify and delimit date ranges as much as possible, textual references
to real historical events, such as the Louisiana Purchase or the Battle of
First Manassas, are used as anchors to which other relative dates are
tethered. Even this is not always possible, since relative time indications
like “earlier” and “later” have no
specific value. In such cases, the date range is marked as
indeterminate.
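The three cases described here (exact date, bounded range, indeterminate) can be modelled with a small data structure. This is a minimal sketch and not the DY schema: the class name, field names, and the use of `None` for an unknown bound are assumptions. The only historical value used is the date of the Battle of First Manassas, 21 July 1861; the year-long range is a made-up example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class DateRange:
    earliest: Optional[date]  # None when no lower bound can be established
    latest: Optional[date]    # None when no upper bound can be established

    @property
    def kind(self):
        """Classify the range as 'exact', 'range', or 'indeterminate'."""
        if self.earliest is None or self.latest is None:
            return "indeterminate"
        if self.earliest == self.latest:
            return "exact"
        return "range"

    @property
    def span_days(self):
        """Width of the range in days, or None if it is indeterminate."""
        if self.kind == "indeterminate":
            return None
        return (self.latest - self.earliest).days

# A historical anchor: the Battle of First Manassas.
anchor = DateRange(date(1861, 7, 21), date(1861, 7, 21))
# Hypothetical event known only to fall somewhere in 1861.
vague = DateRange(date(1861, 1, 1), date(1861, 12, 31))
# Hypothetical event described only as happening "earlier".
unknown = DateRange(None, None)

for r in (anchor, vague, unknown):
    print(r.kind, r.span_days)
```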
The aforementioned challenges with establishing the dates for events
notwithstanding, it is still possible to visualize them in a productive
manner. The graph below shows the latest possible date for each event
indicated by a red line and the earliest possible date with a blue line (see
Figure 6). The area in between represents the range of dates possible. As
opposed to the chronological graphs, these have a continuous line because it
provides more visual clarity than a segmented line, even if it might give
the false impression that there is a smooth transition between past and
present.
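One way to prepare data for such a chart is to build two series, one for each bound, ordered by page. The sketch below returns plain dictionaries shaped like plotly scatter traces, so that it runs without plotly installed. The attribute names (`mode`, `line`, `fill` set to `tonexty`) follow plotly's scatter trace options as I understand them, while the function name, the input keys, and the sample dates are hypothetical; it should not be read as the dashboard's own implementation.

```python
def date_range_traces(events):
    """Build two line traces: earliest (blue) and latest (red) possible dates by page."""
    # Events without both bounds (indeterminate ranges) are skipped in this sketch.
    dated = [e for e in events if e.get("earliest") and e.get("latest")]
    dated.sort(key=lambda e: e["page"])
    pages = [e["page"] for e in dated]
    earliest = {
        "type": "scatter", "mode": "lines", "name": "Earliest possible date",
        "x": pages, "y": [e["earliest"] for e in dated],
        "line": {"color": "blue"},
    }
    latest = {
        "type": "scatter", "mode": "lines", "name": "Latest possible date",
        "x": pages, "y": [e["latest"] for e in dated],
        "line": {"color": "red"},
        "fill": "tonexty",  # shade the band down to the previous (earliest) trace
    }
    return [earliest, latest]

# Hypothetical sample events; the dates are placeholders, not DY data.
sample = [
    {"page": 9, "earliest": "1909-09-01", "latest": "1909-09-30"},
    {"page": 1, "earliest": "1833-01-01", "latest": "1833-12-31"},
    {"page": 5, "earliest": None, "latest": None},
]
traces = date_range_traces(sample)
print(traces[0]["x"], traces[0]["y"], traces[1]["y"])
```

Listing the earliest trace first matters in this arrangement, because the shaded band is drawn from the second trace down to the one before it.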
To a certain extent, the date range graphs parallel the chronology graphs,
because both order the events relative to page number. The key distinction
is that the date range charts demonstrate the difference in time between
events. The chronological graphs are insightful with regard to the
sequencing of time; the date range graphs give a better sense of how
Faulkner is using historical time.
In the chart of
Absalom, Absalom! this use of
the past is particularly powerful. Even though the narrative present is
1909-1910, many of the events take place at an earlier date. Looking across
the graph, it is possible to see that the past dominates the beginning of
the novel, but over the course of the text the past and present grow
together, until, finally, they almost intertwine at the end. The chart
demonstrates just how closely interknit past and present are in Faulkner’s
imagining.
Conclusion
While it is tempting to make inferences about Faulkner’s work using the
narrative structure analysis dashboard, those conclusions have been deliberately
forestalled here. Instead, the goal of DY is to provide a platform
for other scholars to use the data for their own work. It makes little sense to
create a dynamic user-driven visualization if it is only meant to lead to
predetermined answers. That being said, there are a number of fruitful areas of
investigation that the dashboard makes possible. The first of these is to see if
there is a pattern in the way Faulkner structures the plots of his texts. One of
Faulkner’s central concerns is how the past inhabits the present, but it is
unclear if he does so in continuously new ways or if there is a signature
Faulknerian style with which he works and reworks this
material. A related issue is the change or consistency in his use of chronology,
narrative status, and date ranges throughout his career. In Faulkner studies,
there is a generally accepted arc of Faulkner’s development that divides his
literary output into several distinct periods. It would be interesting to see if
this structured division of his writing is visible in the visualizations or if
these tell a different story.
Beyond Faulkner, the visual language used in this paper and the accompanying
dashboard are prompts for a larger discussion about representing and
interpreting digitally encoded narratives. The framework for narrative analysis
provided by DY is precise enough to highlight meaningful
differences, while being broad enough not to be limited to Faulkner. The visual
language used for the dashboard is similar to that of projects within disparate
knowledge domains, which suggests that it is broadly intelligible. A broader adoption of
plot/story charts would usher in the ability to start comparing different
authors on reasonably equal narratological footing. One obvious point of
contrast for Faulkner is a contemporaneous author like Ernest Hemingway, many of
whose short stories take place almost entirely in the narrative present. Yet,
the trauma of the past is always lurking just under the surface about to
explode. Similarly, it would be interesting to compare Faulkner’s writing to
that of realist authors like Kate Chopin, who more strictly adheres to the
conventions of chronological unity.
Whatever path researchers might strike out on, narrative theory still provides an
indispensable map for conceptualizing and encoding narrative data. No doubt,
this encoding is an artificial construct that does not reveal any type of
fundamental truth about a text, as early structuralist narrative theorists might
have imagined. Nevertheless, these rasters for interpretation slice up texts in
unexpected ways, providing new insights yet to be imagined, much less
visualized.
Aldawsari, M. & Finlayson, M. Detecting Subevents Using Discourse and Narrative Features. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, July 2019 Florence, Italy. Association for Computational Linguistics, 4780-90.
Alvarado, R. Characters and Locations in Force Directed Graph. Available: http://faulkner.iath.virginia.edu/characters-force.html?text_na=FD.
Baber, C., Andrews, D., Duffy, T. & Mcmaster, R. Sensemaking as Narrative: Visualization for Collaboration. The 3rd International UKVAC Workshop on Visual Analytics, 2011 London.
Baroni, R. L. The Many Ways of Dealing with Sequence in Contemporary Narratology. In: Baroni, R. L. & Revaz, F. O. (eds.) Narrative Sequence in Contemporary Narratology. The Ohio State University Press, Columbus (2016).
Baroni, R. L. & Revaz, F. O. Narrative Sequence in Contemporary Narratology. The Ohio State University Press, Columbus (2016).
Barros, C., Vicente, M. & Lloret, E. Tackling the Challenge of Computational Identification of Characters in Fictional Narratives. 2019 IEEE International Conference on Cognitive Computing (ICCC), July 2019. 122-29.
Bartalesi, V., Meghini, C. & Metilli, D. Steps Towards a Formal Ontology of Narratives Based on Narratology. OpenAccess Series in Informatics, 2016. 4.1-4.10.
Bögel, T., Strötgen, J. & Gertz, M. Computational Narratology: Extracting Tense Clusters from Narrative Texts. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 14), May 2014 Reykjavik, Iceland. European Language Resources Association (ELRA), 950-55.
Brown, M., Dobson, T., Grue, D. & Ruecker, S. Challenging New Views on Familiar Plotlines: A Discussion of the Use of XML in the Development of a Scholarly Tool for Literary Pedagogy. Literary & Linguistic Computing, 28 (2013): 199-208.
Burg, J., Boyle, A. & Lang, S.-D. Using Constraint Logic Programming to Analyze the Chronology in A Rose for Emily. Computers and the Humanities, 34 (2000): 377-92.
Burgers, J. Familial Places in Jim Crow Spaces: Kinship, Demography, and the Color Line in William Faulkner’s Yoknapatawpha County. Journal of Cultural Analytics, 1 (2020).
Chatman, S. B. Story and Discourse: Narrative Structure in Fiction and Film. Cornell University Press, Ithaca, NY (1978).
Delmonte, R. & Marchesini, G. A Semantically-Based Computational Approach to Narrative Structure. IWCS 2017 — 12th International Conference on Computational Semantics — Short papers, 2017.
Faulkner, W. Faulkner in the University: Class Conferences at the University of Virginia, 1957-1958. Vintage, New York (1965).
Faulkner, W. The Sound and the Fury. Vintage International, New York (1990).
Faulkner, W. Red Leaves. Collected Stories. Vintage International, New York (1995).
Finlayson, M. A. ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory. Digital Scholarship in the Humanities, 32 (2017): 284-300.
Fludernik, M. Histories of Narrative Theory (II): From Structuralism to the Present. In: Phelan, J. & Rabinowitz, P. J. (eds.) A Companion to Narrative Theory. Blackwell Publishing, Malden, MA (2005).
Fócil-Arias, C., Sidorov, G., Gelbukh, A., Arce, F., Pinto, Singh, Villavicencio, Mayr, S. & Stamatatos. Extracting Medical Events from Clinical Records Using Conditional Random Fields and Parameter Tuning for Hidden Markov Models. Journal of Intelligent & Fuzzy Systems, 34 (2018): 2935-47.
Forster, E. M. Aspects of the Novel. Harcourt, New York (1927).
Frawley, W. Linguistic Semantics. Lawrence Erlbaum Associates, Hillsdale, NJ (1992).
Genette, G. Discours du récit. Essai de méthode. Figures III. Seuil, Paris (1972).
Going, W. T. Chronology in Teaching A Rose for Emily. Exercise Exchange, 5 (1958): 8-11.
Gutiérrez, G., Canul-Reich, J., Zezzatti, A. O., Margain, L. & Ponce, J. Mining: Students Comments about Teacher Performance Assessment using Machine Learning Algorithms. International Journal of Combinatorial Optimization Problems & Informatics, 9 (2018): 26-40.
Hansen, P. K., Pier, J., Roussin, P. & Schmid, W. Emerging Vectors of Narratology. De Gruyter, Boston (2017).
Harris, P. A. Fractal Faulkner: Scaling Time in Go Down, Moses. Poetics Today, 14 (1993): 625-51.
Herman, D. Story Logic: Problems and Possibilities of Narrative. University of Nebraska Press, Lincoln, NE (2002).
Herman, D. Narrative Theory: Core Concepts and Critical Debates. Ohio State University Press, Columbus (2012).
Hoyt, E., Ponto, K. & Roy, C. Visualizing and Analyzing the Hollywood Screenplay with ScripThreads. DHQ: Digital Humanities Quarterly, 8 (2014).
Inge, T. M. (ed.). A Rose for Emily. Merrill, Columbus, OH (1970).
Jockers, M. Introduction to the Syuzhet Package. Available: https://cran.r-project.org/web/packages/syuzhet/vignettes/syuzhet-vignette.html.
Kim, N. W., Bach, B., Im, H., Schriber, S., Gross, M. & Pfister, H. Visualizing Nonlinear Narratives with Story Curves. IEEE Transactions on Visualization and Computer Graphics, 24 (2018): 595-604.
Labatt, B. Faulkner the Storyteller. University of Alabama Press, Tuscaloosa, AL (2005).
Lakoff, G. & Narayanan, S. Toward a Computational Model of Narrative. AAAI Fall Symposium - Technical Report, (2010).
Leitch, T. M. What Stories Are: Narrative Theory and Interpretation. Pennsylvania State University Press, University Park, PA (1986).
Leonid, B. Towards a Computational Measure of Plot Tellability. AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment; Thirteenth Artificial Intelligence and Interactive Digital Entertainment Conference, (2017).
Liu, S., Wu, Y., Wei, E., Liu, M. & Liu, Y. StoryFlow: Tracking the Evolution of Stories. IEEE Transactions on Visualization and Computer Graphics, 19 (2013): 2436-45.
Mahfouz, T., Kandil, A. & Davlyatov, S. Identification of Latent Legal Knowledge in Differing Site Condition (DSC) Litigations. Automation in Construction, 94 (2018): 104-11.
Mani, I. Computational Modeling of Narrative. Morgan & Claypool Publishers, (2013).
Meister, J. C. & Geertz, M. heureCLÉA: Collaborative Literature Exploration & Annotation. Available: http://heureclea.de/.
Munroe, R. 2009. Movie Narrative Charts. xkcd [Online]. Available from: https://xkcd.com/657/.
Murphy, K. 2003. Sequential Display of Narrative Time in Benjy's Section. In: Stoicheff, M., Deshaye, Et Al. (ed.). The Sound and the Fury: a Hypertext Edition: U of Saskatchewan.
Muzny, G., Algee-Hewitt, M. & Jurafsky, D. Dialogism in the Novel: A Computational Model of the Dialogic Nature of Narration and Quotations. Digital Scholarship in the Humanities, 32 (2017): ii31-ii52.
Nebeker, H. E. Chronology Revised. Studies in Short Fiction, 8 (1971): 471.
Nijila, M. & Kala, M. T. Extraction of Relationship Between Characters in Narrative Summaries. 2018 International Conference on Emerging Trends and Innovations In Engineering and Technological Research (ICETIETR), 11-13 July 2018. 1-5.
Ogata, T. Toward a Post-Narratology or the Narratology of Narrative Generation. In: Akimoto, T. (ed.) Post-Narratology through Computational and Cognitive Approaches. Information Science Reference, Hershey, PA (2019).
Padgett, J. William Faulkner on the Web. Available: http://cypress.mcsr.olemiss.edu/~egjbp/faulkner/faulkner.html.
Padia, K., Bandara, K. H. & Healey, C. G. A System for Generating Storyline Visualizations Using Hierarchical task network planning. Computers & Graphics, 78 (2019): 64-75.
Perry, M. Literary Dynamics: How the Order of a Text Creates its Meanings [with an Analysis of Faulkner's A Rose for Emily]. Poetics Today, 1 (1979): 35-64; 311-61.
Phelan, J. Somebody Telling Somebody Else: A Rhetorical Poetics of Narrative. Ohio State University Press, Columbus (2017).
Pier, J. The Configuration of Narrative Sequences. In: Baroni, R. L. & Revaz, F. O. (eds.) Narrative Sequence in Contemporary Narratology. The Ohio State University Press, Columbus, OH (2016).
Prince, G. On Narrative Sequence, Classical and Postclassical. In: Baroni, R. L. & Revaz, F. O. (eds.) Narrative Sequence in Contemporary Narratology. The Ohio State University Press, Columbus, OH (2016).
Propp, V. Morphology of the Folktale. University of Texas Press, Austin (1979).
Railton, S. Instructions. Digital Yoknapatawpha. Available: http://faulkner.drupal.shanti.virginia.edu/content/instructions.
Railton, S. Manuscripts &c: The Mansion. University of Virginia. Available: http://faulkner.drupal.shanti.virginia.edu/node/18732?canvas.
Railton, S. & Rourk, W. Absalom, Absalom! Chronology. University of Virginia. Available: http://twain.lib.virginia.edu/absalom/index2.html.
Railton, S., Towner, T. M., Burgers, J., Corrigan, J., Joiner, J. J., Hagood, T., Carothers, J. B. & Cornell, E. Roundtable: Digital Yoknapatawpha. Mississippi Quarterly, 68 (2015): 456-85.
Reed, J. W. J. Faulkner's Narrative. Yale University Press, New Haven (1973).
Reed, R. The Role of Chronology in Faulkner's Yoknapatawpha Fiction. The Southern Literary Journal, 7 (1974): 24-48.
Richardson, B. Time, Plot, Progression. In: Herman, D. (ed.) Narrative Theory: Core Concepts and Critical Debates. Ohio State University Press, Columbus, OH (2012).
Richardson, B. A Poetics of Plot for the Twenty-First Century: Theorizing Unruly Narratives. Ohio State University Press, Columbus, OH (2019).
Rio-Jelliffe, R. Obscurity's Myriad Components: The Theory and Practice of William Faulkner. Bucknell University Press, Lewisburg (2001).
Robbins, B. Reading for Data: Temporal Speed Shifts in Faulkner's Death Drag and the Process of Textual Digitization. Studies in American Culture, 39 (2016): 7-20.
Schwab, M. A Watch for Emily. Studies in Short Fiction, 28 (1991): 215.
Schwan, H., Jacke, J., Kleymann, R., Stange, J.-E. & Dörk, M. Narrelations — Visualizing Narrative Levels and their Correlations with Temporal Phenomena. Digital Humanities Quarterly, 13 (2019).
Seonwoo, Y., Oh, A. & Park, S. Hierarchical Dirichlet Gaussian Marked Hawkes Process for Narrative Reconstruction in Continuous Time Domain. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018. 3316-25.
Sharma, R. & Rajamanickam, V. Using Interactive Data Visualization to Explore Non-Linear Movie Narratives. Parsons Journal for Information Mapping, (2013).
Shklovskiĭ, V. Theory of Prose. Dalkey Archive Press, Elmwood Park, IL (1990).
Skei, H. H. Reading Faulkner's Best Short Stories. University of South Carolina, Columbia, SC (1999).
Stewart, G. R. & Backus, J. M. Each in its Ordered Place: Structure and Narrative in "Benjy's Section" of The Sound and the Fury. American Literature, 29 (1958): 440-56.
Stoicheff, P., Muri, A., Deshaye, J., Truchan-Tataryn, Bath, J., Mitchell, D. & Murphy, K. The Sound and the Fury: a Hypertext Edition. Available: http://drc.usask.ca/projects/faulkner/.
Swafford, A. 2015. Problems with the Syuzhet Package. Anglophile in Academia: Annie Swafford's Blog [Online]. Available from: https://annieswafford.wordpress.com/2015/03/02/syuzhet/.
Takeuchi, K. Thesaurus with Predicate-Argument Structure to Provide Base Framework to Determine States, Actions, and Change-of-States. In: Ogata, T. & Akimoto, T. (eds.) Computational and Cognitive Approaches to Narratology. Information Science Reference, Hershey, PA (2016).
Tanahashi, Y. & Ma, K.-L. Design Considerations for Optimizing Storyline Visualizations. IEEE Transactions on Visualization and Computer Graphics, 18 (2012): 2679-88.
Tenen, D. Y. Toward a Computational Archaeology of Fictional Space. New Literary History, 49 (2018): 119-47.
Volpe, E. L. A Reader's Guide to William Faulkner: The Novels. Syracuse University Press, Syracuse, NY (2003).
Wallace, B. Multiple Narrative Disentanglement: Unraveling Infinite Jest. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2012 Montréal, Canada. Association for Computational Linguistics, 1-10.
Wilson Jr, G. R. The Chronology of Faulkner’s A Rose for Emily Again. Notes on Mississippi Writers, 5 (1972): 44.
Xie, H., Lu, X., Chen, X., Tong, J. & Tang, Z. ES-ESens: Detection of Event Sentences Based on Evaluation of the Explicitness and Significance of Information. 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), March 2019. 32-35.