Abstract
The case of the Orlando Project offers a useful interrogation of concepts like
completion and finality, as they emerge in the arena of electronic publication. The
idea of “doneness” circulates discursively within a complex and evolving
scholarly ecology where new modes of digital publication are changing our conceptions
of textuality, at the same time that models of publication, funding, and archiving
are rapidly changing. Within this ecology, it is instrumental and indeed valuable to
consider particular tasks and stages done, even as the capacities of digital media
push against a sense of finality. However, careful interrogation of aims and ends is
required to think through the relation of a digital project to completion, whether
modular, provisional, or of the project as a whole.
When can a digital scholarly project be considered finally “done”? Perhaps never.
Something done is past, irrevocable, requiring nothing more and indeed immune from
further action. The case of the Orlando Project, a large-scale and longstanding digital
humanities undertaking, reveals an arbitrariness, even a fictiveness or
contradictoriness, to the notion of completion of the project as a whole or even of its
major online product. Digital humanities projects are considerably more prone than
traditional humanities undertakings to riding off into the sunset until the next
installment rather than being laid to rest. “Doneness” circulates discursively
within a complex and evolving scholarly ecology where new modes of digital publication
are changing our conceptions of textuality, at the same time that models of publication,
funding, and archiving are rapidly changing. Within this ecology, it is instrumental and
indeed valuable (and, as Matt Kirschenbaum suggests here, highly satisfying) to
consider particular tasks and stages done, even as the capacities of digital media push
against a sense of finality. However, careful interrogation of aims and ends is required
to think through the relation of a digital project to completion, whether modular,
provisional, or of the project as a whole.
Projection and Experimentality
In the digital humanities we often organize undertakings in terms of “projects”,
research endeavours that are probably, ideally, a collaborative enterprise “carefully planned to achieve a particular aim”
(Oxford English Dictionary 2007, “project”). The emphasis is on the future, on the projected outcome and potential of the
undertaking: projects, as the cognate verb “to project” suggests, are
future-oriented. Some — an example would be the nora Project [nora] — last about as
long as the money from a particular grant, but others — the Perseus Project and the
Women Writers Project are examples — continue over many years and multiple grant
funding cycles. A successful project is thus not necessarily geared to realizing a
“particular aim”. Perseus as “an evolving digital library” (http://www.perseus.tufts.edu/)
situates its work in the vast scale of biological time; the WWP’s aims are
equally open-ended. These trajectories of digital humanities undertakings don’t pin
themselves to a specific end, but carry “A planned or proposed undertaking; a
scheme, a proposal; a purpose, an objective” (Oxford English Dictionary 2007, “project”)
into the foreseeable future, with gusto. Such an orientation is actually at
odds with the definition of a project in relation to particular aims. The success of
these projects is not pegged on completion, but measured in other ways.
So there's clearly a lot that scholars involved with such projects want to do without
being done, particularly insofar as being undone is compatible with disseminating
materials to others and engaging in scholarly dialogue about them. At root is not
only, as the introduction here suggests, a culture of perpetual prototypes that
mitigates desire for closure, or funding structures that poorly support the
“finishing” process for non-commercialized projects. It is also the very
multi-faceted nature of much digital humanities research, which so often straddles
the divide between content development and technological experimentation. This
interplay between traditional humanities content and innovative methodologies means
there is always more to be done.
The Orlando Project, with its aim of “producing the first full scholarly
history of women's writing in the British Isles” (Brown et al. 2006, home page),
is a long-term digital humanities project that is both done and yet not done.
Unlike many electronic projects, the project held off making its major resource
available until it was in quite a polished and complete state.
Orlando: Women Writers in the British Isles from the Beginnings to the
Present was published online by Cambridge University Press in June 2006.
It is not a collection of primary texts, but a massive born-digital resource in
literary history amounting now to almost 7.7 million words in the form of 1,206
detailed and often quite lengthy entries on writers’ lives and writing careers, more
than 13,000 independent chronology items, and 22,000 bibliographical records. Yet the
project is far from done: its content and technical work continue. This paper
explores the tension between projection and completion over this project’s history to
date as a means of considering that tension in relation to digital humanities
research generally.
The project’s cofounders (Susan Brown, Patricia Clements, and Isobel Grundy) were new
to digital humanities research, so our notions of scholarly process and completion
related to conventional print publications. As Claire Warwick has noted, the idea of
what is “complete” or “publication-ready” in academic culture has emerged
from a complex set of human factors relating to such matters as the attribution of
credit by institutions and funding structures, as well as the conception of what is
required intellectually for a product to be done [Warwick 2004, 368]. Such factors
undoubtedly entered into how the Orlando Project was conceived. In
our original funding application, we projected a single moment of completion at which
the planned electronic history would be ready alongside several related print volumes
of scholarship. We were fortunate to receive a Major Collaborative Research
Initiative (MCRI) grant from the Social Sciences and Humanities Research Council of
Canada (SSHRC), and embarked on the project as planned. And then things changed.
Working at the interface between humanities research questions and evolving digital
methods means that projections about the trajectories of digital humanities work are
less likely to be accurate than those of traditional scholarship. This is not to say
that any research project may not run into snags or unforeseen delays - instances,
particularly in the history of earlier scholarship, include Samuel Johnson's having
to restart his dictionary mid-stream because he realized that working with small
slips of paper would be better than the old technology of full sheets - but these are
less often related to the methodology per se of the scholarly undertaking. In the
case of Orlando, the ambition and experimentality of what we had undertaken on the
technical side had a radical impact on the progress of the literary work with which
it was interdependent, both because key researcher time was absorbed in
developing and successively refining the custom tagset that was a key
component of our methodological experimentation, and because we had to build in-house
production and delivery systems from scratch in ways that we had not anticipated. The
risk of these sorts of impacts is endemic to methodologically experimental research
of any kind, and particularly relevant to digital humanities work. Such impacts don’t
mean that the project is not pursuing its aims effectively, but they can have a major
impact on anticipated timelines and perceptions of productivity, especially if the
project has been articulated in relation to a particular aim or deliverable.
Modularity and Incrementalism
Digital humanists therefore need to plan and sequence their deliverables with care,
which are important not only because our work must take objective form to be shared
with our colleagues, but also because those are the ways in which we are accountable
to the funding bodies that make our often costly work possible. Given the
risk-oriented nature of experimental research, it is strategic to promise outcomes
that are both multiple and modular. The Orlando Project struggled for funding in
later stages as a result, we believe, of a project design that focused on a single,
end-loaded monumental deliverable.
The big “ta-da!” moment of publication is a very common strategy, one followed
by the Blake Archive in 1996 with its release of The Book of
Thel, copy F, and then again in 1997 with further fanfare when it released
the first SGML version of that text with additional functionality for users, and in
2008 by The Nineteenth-Century Serials Edition, which launched all six serials with a
splash. The “ta-da” provides both that crucial sense of satisfaction and
progress for the participants and a landmark achievement that constitutes important
evidence of completion of at least a phase of the project for funding agencies. It
has drawbacks, however. Focus on an end deliverable can obscure interim
accomplishments. The Orlando Project's research plan was designed to proceed through
a number of stages. Indeed, the milestones and the mid-term review required by the
MCRI Program are examples of the kind of official part-done marker which, although it
may not arise organically from the research needs or achievements and is imposed from
the outside by the bureaucratic rules of another entity, can nevertheless be used by
researchers as a spur to setting and meeting meaningful interim goals. Many of those
markers were, however, internal to the project, which meant that they didn't provide
the same kind of objective sense of progress that comes with public release.
Externally, and for our funding agency, what registered was what we had not done,
rather than what we had accomplished. An immense online “product” such as that
Orlando promised from the outset does well then to be balanced by some objective,
interim goals. In addition, the launch that suits a book does run somewhat counter to
the ongoing life of many digital projects if it leaves people thinking that the
project itself is finished.
Other projects have made their way into the world rather differently, in ways
suggestive of ongoing curation. The Poetess Archive, for instance, began as a series
of rather modest web pages that have grown over time in both scope and
sophistication, moving from HTML into XML with a sophisticated search interface.
Editorial ventures perhaps lend themselves particularly well to an incremental
approach. The Brown University Women Writers Project (
http://www.wwp.brown.edu), for instance,
first transcribed and encoded texts, made printouts available and partnered with a
publishing house, and released Renaissance Women Online. By the time Women Writers
Online was made available by subscription, it was clear that, although it was a major
event, it was part of a continuing project. Software projects typically put
out numbered releases, which provide the triumphant moment of celebration while
suggesting that more is yet to come. Designing digital projects to incorporate
similar incrementalism, by way of staged releases that mark phases of accomplishment
or a number of discrete and in some way publishable deliverables, seems a
particularly useful way to structure them.
So it seems crucial to design digital humanities projects with a number of discrete
and in some way publishable deliverables. Ideally, these should be modular, that is,
functionally independent of one another. This means each has the potential for
separate funding, can proceed on its own, and can provide a satisfying moment of
completion. Modularity, however, suits some kinds of projects better than others.
Both content and software systems often rely on the interrelation of various parts
that can make it a challenge for one part to develop independently of the others. And
even where a high degree of modularity is possible, modules usually need to be
integrated at some point, so careful coordination to ensure eventual compatibility is
still necessary, as well as an eventual convergence of module completion.
Various factors can work against modular publication. Orlando’s content structure was modular in form, composed as it was of
author entries and chronological materials. Each of these theoretically could have
been published as soon as they were “done”. Yet doneness there was relative:
there occurred a regular effect whereby the production of a new entry spurred
significant improvement in several supposedly complete ones. An iterative process
developed, not unlike the successive stages required in traditional humanities
research, where the gradual accretion of knowledge slowly modifies the researcher’s
view and understanding of material. There was a strong sense both that the content
work had to progress to a certain point of intellectual maturity, and that there were
intellectual demands for a certain degree of coverage. We wouldn’t be “done”,
for instance, without having completed the materials on Virginia Woolf or George
Eliot. Because much feminist work has resisted the establishment of a small canon of
female writers at the expense of others, such major writers needed to be situated in
relation to less prominent contemporaries, and because we rejected a separatist
understanding of literary history, we needed to include some male and international
writers. Despite the apparent modularity of our content, we held off publishing until
we had 1,149 entries completed. Thus, where scholarly content is concerned, a certain
critical mass may be held necessary to establish scholarly confidence in the quality
of a resource. Whether that threshold constitutes a single digital object, such as
an edited text, or thousands of objects will vary. But it can work against a modular
approach. Further revision is of course possible: the Orlando entries on Eliot and Woolf continue to be extended or revised at
almost every update. In this sense, the digital done with its easy accommodation of
incrementation is infinitely preferable to the printed done. But any project wishing
to publish in stages will have to decide its initial content threshold according to
the particular research goals of the project, criteria in the field for scholarly
reliability, and user expectations.
Technical considerations constitute a further challenge to modular publishing, since
a prototype is one thing and a debugged, multiple-browser-supporting, polished
publication vehicle is another. We know that users are very easily put off by
frustration in the use of new resources or tools, so publishing components that are
unstable or poorly integrated may have a seriously negative impact. In the case of
Orlando, our customized tagset required us to build a
fairly complex XML delivery system, a task we had not anticipated in the mid-90s when
TEI-SGML was emerging as a standard and XML was just over the horizon. Only a quite
finished interface, we felt, stood a chance of convincing our core users from the
technologically-resistant field of literary studies of the strengths of the markup
into which the project had invested so much intellectual labour.
Orlando offers users a range of affordances beyond that of looking up
specific writers’ entries, as the menu bar on the home page as it was at initial
release (see Figure 1) makes clear. These extend to
searching in quite precise ways on the more than 2 million semantic tags embedded in
its literary-historical prose. To make the system’s unique strengths apparent, we
again needed a critical mass of materials to populate search results and showcase
innovative features — such as the links screens that provide semantically-categorized
access to mentions of writers across the textbase (see Figure 2).
Orlando shifted to a more staged publication model by uncoupling the electronic from
the print publication, so that the former stands alone initially. Yet the textbase
was published relatively complete. Thus, while structuring projects modularly is
highly desirable for a range of reasons, truly modular publication may present
challenges with respect to audiences from beyond the digital humanities community.
Research domain, project conceptualization, and publication options are all crucial
determinants of how “done” will be defined for a particular project. Project
members need to arrive at a shared understanding of what constitutes an acceptable
degree of intellectual maturity, critical mass of content, and technological finish
at initial publication. This is particularly important since projects often seem to
be judged by both funders and traditional humanities users according to their state
at first release, as if they were a book. Once a first set of material is released,
staged publication — such as the addition of new components, functionalities, or
alternative interfaces — and incrementation — such as additions to or enhancement of
existing content — become easier. However, in project planning, it seems
strategically important for researchers to stress to funders the value of interim
publications and subprojects, and generally not to allow a major deliverable to
swallow up the identity of a project as a whole, so that the perception that the
former is “done” does not carry with it a sense that the latter is also
finished. Release or version numbers, or other ways of flagging the open-endedness of
an electronic publication may be helpful in this regard.
Digital Textuality and Publication
Digital projects, if they aim to move beyond prototypes and court a mainstream
humanities user community, need to recognize at the planning and budgeting stage the
very high overhead involved in the development of delivery systems robust and usable
enough to be considered in some sense finished.[1] We need to think through with
our funding agencies not only how to
sustain digital publications over the long haul, but also how to help projects with
hugely valuable content leap that imposing hurdle from prototype to polished
publication. At the same time, digitally published may not mean “done” in
several respects.
Published is traditionally done, as David Sewell argues in his essay in this cluster.
But published electronic projects don’t get put on a shelf in a library. Being
unconstrained by print materiality reinforces the arbitrariness of deciding that
something is done in the sense of “complete”, which is defined in the Oxford English Dictionary as “Having all its parts or members; comprising the full number or
amount; embracing all the requisite items, details, topics, etc.; entire,
full.” Published may mean (provisionally) done without meaning complete,
and there is of course a long tradition of encyclopedic print publications issuing a
series of updates or supplements. Digital publication allows us to define done in
terms of the kind of intrinsic completeness suggested by the OED rather than because we’ve reached an arbitrary limit (a deadline, a
word length) related to print processes. In this sense, Orlando, though published, remains incomplete. Though all the items we
considered requisite for initial publication are there, we remain aware of those
figures, topics, approaches, and perspectives that demand inclusion in a “full”
history of women’s writing in the British Isles. Our contract with our publisher
recognises the provisionality of our completion by stipulating for updates, as well
as in the plan for the volumes of discursive history. We’ve increased and enhanced
both content and functionality semi-annually since publication.
The “done” founded on digital publication is fragile in another sense because of
the rapidity of technological change. The stability of book technology means that a
book can be done and put to rest by both authors and publishers: even if it goes out
of print, so long as copies endure in libraries they can continue to be used in
perpetuity. But digital publications require more active support. Even if no
technological enhancements are desired, for an electronic text to remain usable, it
has to be stored somewhere in a form that is accessible to evolving technologies.
This means it requires more active curation: even a quite straightforward web
publication becomes unusable if it can’t keep pace with browser releases. A new
version of a project produced to migrate with current standards and practices is
different from a second print edition in a number of respects. While both respond to
a perception of continued demand for the product, the electronic migration is
required to keep the resource accessible at all, and it does not supplement the first
edition, which in the case of print will persist, but materially speaking supersedes
or replaces it. This means that updates to electronic publications, while having a
decided formal edge over errata slips or supplemental volumes, bear the additional
burden of keeping the text in circulation. Being done with a digital publication may
mean that the work disappears entirely from use.
The potential evanescence of a project’s digital output creates pressures on the
scholar, team, or publisher to keep it available. The academic community is still
groping to discover how best to sustain digital publications over the long term. In
the meantime, to meet even modest needs for technical migration and to keep content
current, projects must continue to find funding, which can be challenging if a
project is perceived as done as a result of publication. The Orlando Project, as part
of its strategy of sustainability, licensed the textbase to the University of
Alberta, and the University in turn sub-licensed it to Cambridge University Press.
This arrangement created a revenue stream to help support the project’s preparations
for publication and its updates and ongoing activities. It also sustains Orlando’s
identity at its home institution and gives a broader constituency than the team
members an interest in the project’s success. This is important because, although
like other ongoing projects Orlando has been able to obtain research funding for new
initiatives, maintenance funding is a major challenge.
Part of the problem is in how we conceive of digital publications. Many ongoing
digital publications should be understood by analogy with journals, for which
“done” can be applied to particular issues but not to the relevant research
area. Continuing work despite previous publication is then part of the mandate,
rather than the extraordinary burden it would seem in comparison with a book. The
analogy applies only in part, because of course the entire text of a digital
publication is fluid and subject to ongoing revision as that of a print journal is
not. But it helps conceive more appropriately what “done” might mean for a lot
of digital projects, with their capacity to increment and to migrate both
technologically and with their field, just as does the analogy of the library for the
Perseus Project.
Indeed, from a theoretical perspective, an electronic publication will arguably never
be “done” precisely because of the nature of electronic textuality. Print texts
are susceptible (as indeed were manuscripts and printed texts) to all sorts of
repurposing, from reissue through quotation and anthologizing, to reprinting or
incorporating in works of graphic art. In a digital environment, this aspect of
textuality is greatly intensified by the ease with which one can “sample” texts,
and the ability to separate content from presentation in digital formats means that
entire works can be readily reformed or deformed. To take a familiar digital activity
as an example, textual editing in an electronic environment must be reconceived as
involving several different modes of editing. A TEI-conformant XML edition can form
the basis of other quite divergent editions, such as an intentionalist rather than a
genetic or “fluid-text” edition such as the Rotunda Press edition of Melville’s
Typee [Bryant 2006]. Scholars will increasingly be able to build on existing
electronic texts, restructuring or adding to them, or recombining them with new
content to produce new texts. In a radical extension of earlier forms of textuality,
the possibility that an electronic text will continue to morph, be reproduced, and
live on in ways quite unforeseen by its producers makes “done” to an extent
always provisional.
Archives
The fact that electronic texts are not static leads to the thorny issue of archiving
them, surely a marker of some kind of doneness. For although the “digital
archive” is used loosely to refer to the total volume of material available in
digital form, attempting to do for digital culture what government, university,
museums, and other organized archives have done for print culture — preserve records
of the past so as to allow others to access it in the future, including selection,
arrangement, conservation, cataloguing — is a major challenge. Even the term
“archive” may suggest misleading parallels between older archival practices
and what is possible or appropriate for digital materials. People deposit books,
personal papers, or theses in an archive, where they remain, unchanged, unless a
medium like acid paper demands conservation, for future generations to consult. That
may be possible for some resources such as collections of static web pages, as
recorded by the Internet Archive, but for dynamic digital resources such as the
Orlando Project, archiving even a substantial set of web pages would be only the tip
of the iceberg. The Orlando Project has committed to archiving with the University of
Alberta Library, which currently entails a fairly well-defined set of practices
designed to ensure long-term survival of the data. However, current practices are
unlikely to be able to document either the dynamic text or the research process,
which was an experiment in large-scale humanities research and computing. We can
archive our internal materials, such as meeting minutes, policy documents, and so on.
We have an archive of all past versions of the documents that make up the textbase,
and of past versions of the delivery system (code and content), so that what it
looked like in the past is recoverable. But particular versions will only be
recoverable as we have machines that run the browsers and the coding behind them. We
need as a community to grapple further with the question of how to archive dynamic
resources.
Funding bodies such as SSHRC have policies requiring the public archiving of data,
even though many researchers are unaware of this requirement and despite the fact
that the country lacks the standards and indeed the facilities to permanently archive
digital material [MacDonald 2007]. The notice one frequently encounters
when accessing materials through the Library and Archives Canada Electronic Collection is
a sobering reminder of what may be lost: “You are viewing a document archived by Library and Archives Canada. Please note,
information may be out of date and some functionality lost” [Disclaimer].
We anticipate out-of-date information in an archive, of
course, but if considerable functionality is lost, a digital artifact can hardly be
said to have been archived successfully. And this site is devoted to archiving just
“monographs and periodicals” rather than more complex
artifacts. Archiving an experimental digital project must include the daunting task
of somehow preserving not just text but code and functionality, either by maintaining
systems on which they can run or migrating them to newer ones. If not, the project
will be not done but done for.
Evading the Archive
But archiving alone would represent a form of doneness that many digital projects
hardly seek. We want Orlando to be up and running, to be
alive and evolving, being updated and used far into the future. Such longevity in
more than an archived state has major implications in terms of resources. Lack of
people, time, or funding has consigned more than one project involuntarily to
becoming a static tribute to its former activity. The reasons for this include people
moving on, intellectually or institutionally, without taking their projects along
with them, or people using electronic media to disseminate without particularly
desiring to exploit their potential for continual updating, but even where the will
to continue persists, inadequate funding mechanisms for sustainability may make it
impossible. This is a shame, since, as we have argued here, in the case of
Orlando and many other digital publications not only does
there remain scope to enrich the contents, but the first iteration often
merely begins to tap the potential of the project’s data architecture and its
possibilities for interface development. Yet once a project publishes a first major release of
materials, the assumption that the research is finished makes attracting funding more
difficult. While experiments with various models of maintenance and sustainability
proceed with both subscription-based and open-access digital publications, it is
clear that a fundamental shift is needed in the understanding of the value of project
sustainability and ongoing development, along with a concomitant shift of funding
models [Unsworth et al. 2006, 28, 32].
Nor should sustainability be narrowly conceived. Informal user feedback on Orlando suggests that, at least with respect to encyclopedic
resources, users now firmly expect that scholarly digital publications will be kept
up-to-the-minute and respond to user suggestions. Most digital humanities projects
presumably would wish to benefit from this respect in which they remain
“undone”: if we are to evolve useful tools and resources we need carefully to
assess how people use them and experiment with ways of making them better. Such
inquiry is integral to the Orlando Project’s continuing research, since two of its
major aims were to establish the viability of extensive, domain-specific semantic
markup to enable new kinds of scholarship, and to help shift scholarly users of
electronic materials towards more complex engagement with electronic resources. The
project also aimed to leverage the markup in ways that we did not have the resources
to implement: whether we can ever have done with, that is to have realised, those
ambitions will depend on future developments.
The Orlando Project directs its research towards two practically inexhaustible
fields: women’s literary history and the capacity of computing — specifically of
extensive XML markup — to serve the needs of this area of humanities inquiry.
“Done” becomes, over the course of such an ongoing and complex digital
project, a strategic, continually negotiated marker valuable in a range of ways for
defining a specific stage of a process which is not unlike that of Lady Mary Wortley
Montagu’s solo periodical “To be continued as long as the Author
thinks fit, and the Public likes it” [Montagu 1993, 105]. Although
Orlando diverged radically from the
sense of authorship invoked here, Montagu conjures succinctly a dynamic relationship
between continued production and reception that is as true to the era of digital
production as it was to print culture in the eighteenth century. “Done” for
Orlando the textbase is only newly open, that is
beginning, for our users. For this project, so closely focused on a major
deliverable, the post-publication phase has simply intensified the importance of the
enquiry that binds our two fields of research: that of the relations between
Orlando and its users.
Although they are by no means all unique to digital publication, the factors outlined
here, ranging from project conception and design through modes of textuality and
publication to complications in sustainability and archiving, work collectively to
complicate what “done” means in the context of digital research. They come of
participating in a rapidly transforming context for research and publication in the
humanities. Many of these threads are tied together by a common concern that has not
been present for major projects that issue in print publication: the question of how
to make the results of the research continuingly available to others after the point
of initial publication. Whatever “done” means for a particular project, those
involved face the challenge of ensuring that it does not paradoxically mean a swift
end to scholarly circulation and contribution. While a comparison to the loss of the
library at Alexandria in the pre-print era might be a tad hyperbolic, it is sobering
to contemplate the waste of knowledge and intellectual effort that would result from
the failure of the academic community to resolve the thorny problem of how to sustain
access, over the long term, to the results of the first generation of experimental
endeavours in the digital humanities if we can’t figure out what is to be done.