Christine L. Borgman is Professor and Presidential Chair in Information
Studies at UCLA. She is the author of more than 180 publications in the fields of
information studies, computer science, and communication. Both of her sole-authored
monographs,
The digital humanities are at a critical moment in the transition from a specialty area to a full-fledged community with a common set of methods, sources of evidence, and infrastructure — all of which are necessary for achieving academic recognition. As budgets are slashed and marginal programs are eliminated in the current economic crisis, only the most articulate and productive will survive. Digital collections are proliferating, but most remain difficult to use, and digital scholarship remains a backwater in most humanities departments with respect to hiring, promotion, and teaching practices. Only the scholars themselves are in a position to move the field forward. Experiences of the sciences in their initiatives for cyberinfrastructure and eScience offer valuable lessons. Information- and data-intensive, distributed, collaborative, and multi-disciplinary research is now the norm in the sciences, while remaining experimental in the humanities. Discussed here are six factors for comparison, selected for their implications for the future of digital scholarship in the humanities: publication practices, data, research methods, collaboration, incentives, and learning. Drawing upon lessons gleaned from these comparisons, humanities scholars are called to action
with five questions to address as a community: What are data? What are the infrastructure requirements? Where are the social studies of digital humanities? What is the humanities laboratory of the 21st century? What is the value proposition for digital humanities in an era of declining budgets?
A critical moment for the digital humanities
This is a pivotal moment for the digital humanities. The community has laid a foundation of research methods, theory, practice, and scholarly conferences and journals. Can we seize this moment to make digital scholarship a leading force in humanities research? Or will the community fall behind, not-quite-there, among the many victims of the massive restructuring of higher education in the current economic crisis? Much is at stake in the community’s ability to argue for the value of digital humanities scholarship and to assemble the necessary resources for the field to move from emergent
to established.
The sciences, arts, and humanities have converged and diverged in various ways over the
centuries. In the area of digital scholarship, many interests are in common across the
disciplines. It is the pace of adoption that is divergent. The sciences, and to a lesser
extent the social sciences, have been successful in developing the technical, social, and
political infrastructure for digital scholarship under the rubrics of cyberinfrastructure and eScience.
While leaving definitions of “the humanities” to the reader, two complementary
definitions of “digital humanities” provide a useful scope statement. Frischer’s
definition is “the application of information technology as an
aid to fulfill the humanities’ basic tasks of preserving, reconstructing, transmitting,
and interpreting the human record.” A second definition holds that “digital
humanities is not a unified field but an array of convergent practices that explore a
universe in which print is no longer the exclusive or the normative medium in which
knowledge is produced and/or disseminated.”
Interest in the digital humanities has grown steadily for several decades. The Digital
Humanities Conferences have occurred annually since 1989, sponsored by the Alliance of
Digital Humanities Organizations. Constituent organizations of the Alliance have held
conferences since 1973.
Despite many investments and years of development, basic infrastructure for the digital humanities is still lacking. Those who wish to gather and analyze digital data for humanities problems often find the overhead daunting, as exemplified by this emailed complaint from a history student in my scholarly communication course, who is pursuing a doctoral dissertation about the German enlightenment:
“I’m finding that something as simple as constructing my maps of
related concepts are not easily applied to primary sources in digital libraries.”
(emphasis added; quoted with permission)
“Digital libraries,” the term used by my student, usually
implies the existence of tools, services, and a library imprimatur of cataloging and
curation. Her complaint is more about digital collections, which often lack basic
capabilities for retrieval or analysis. This distinction is particularly relevant to the
digital humanities. Content in digital collections may be relatively “raw.”
Whose problem is it to improve the situation — that is, to design, develop, and deploy the
scholarly infrastructure for digital humanities? As my UCLA colleague, Johanna Drucker,
put it so well, “Leaving it to ‘them’ is unfair, wrongheaded,
and irresponsible. Them is us.” She sees the field at a “critical juncture,”
and is concerned that her fellow scholars are deferring responsibility for action to librarians, computer scientists, technology developers, publishers, and others.
The operant terms in “digital humanities scholarship”
are the latter two. Scholarly
methods are as deeply seated in the humanities as they are in the sciences.
This article, based on a keynote presentation to the most recent Digital Humanities
Conference, reviews and reflects upon the differences between the approaches of the
sciences and the humanities to digital scholarship.
The technical and policy infrastructure for scholarship is being built rapidly,
particularly for the sciences.
The humanities and the sciences each encompass broad swaths of scholarship, with much
internal diversity. These two communities have significant commonalities, while differing
in important ways. Identified here are six factors for comparison, selected for their
implications for the future of digital scholarship in the humanities: publication
practices, data, research methods, collaboration, incentives, and learning. The first five
of these are drawn from longer analyses published elsewhere.
Scholarly journal publication is shifting rapidly toward electronic formats, especially
in the sciences. Some journals are dropping print publication altogether; others are
declaring the online version (usually released several weeks to several months prior to
the printed edition) to be the edition of record. Under pressure from authors, the
majority of scholarly journals now appear to allow online posting of some form of
pre-print or post-print.
For physics and related areas of computer science and mathematics, arXiv is the locus
of scholarly communication. Monthly deposits of new papers now number more than 5,000;
the site, which contains over 500,000 papers, typically receives 50,000 visits per hour.
In the humanities, neither journal nor book publishing has moved rapidly toward online publication, despite pioneering efforts such as the 1990 launch of the
zoomable images, video, GIS map integration, Adobe Flash VR, 3-D models, and online reference linking— while continuing to publish its static print version.
The reasons for the slow adoption of digital publishing in the humanities are many,
from not trusting online dissemination to a general reluctance to experiment with new
technologies, even those well proven: “professionally indisposed to
change,” as Ken Hamma puts it. [...] “a suite of publishing services robust and flexible enough to
support the complexities of content, format, and dissemination that increasingly
define scholarly communications.”
The “love affair with print” puts not only “traditional”
humanities scholarship at risk but also that
of digital humanities. The distinction between print and digital publication is as much
about epistemology as genre. Digital publishing is not simply repackaging a book or
article as a computer file, although even a searchable pdf has advantages over paper. By
incorporating dynamic multi-media or hypermedia, digital publishing offers different
ways of expressing ideas and of presenting evidence for those ideas.
Digital publishing differs from print publishing in several ways. One is the shorter time from submission to publication. While speed of publication is a much greater concern in the sciences than in the humanities, much of that time delay involves the physical production of the journal or book. Reviewing time varies little between print and digital formats. The humanities could benefit from faster turnaround, reaching audiences much sooner.
A second advantage of digital publishing — even more critical — is the larger audience
for online publications. Anyone with an online connection and a subscription (in the
case of fee-paid content), anywhere in the world, can read digital publications. Only
those with access to a physical copy can read print-only publications. The number of
titles and the number of copies of scholarly books and journals published in print form
are decreasing rapidly, thus limiting both publishing outlets and readership. Maureen
Whalen’s concern for art history, with its continuing reliance on print publishing, is
that “the voices of authority ... will be talking amongst
themselves.”
Two other consequences of the inexorable shift toward digital publication should be of
concern to the humanities. One is that print material, including older material,
becomes “widowed” as students and scholars alike search only online. The widowing
problem was recognized early in the days of online catalogs, and was a major impetus for
research libraries to digitize their entire back catalogs rather than only records of
new material.
The other consequence is that easier access to online material frequently increases its
rate of citation. Articles published in open access journals, open repositories, or
dual-published by providing preprints or postprints online, tend to have a citation
advantage over articles published only in closed-access journals, whether print or
online. The degree of advantage varies by field and by a number of other factors.
While the details of these studies are much contested between authors, editors,
librarians, and publishers, the simple tautology that easier discovery is associated
with higher citation is difficult to dispute. As do authors in other fields, scholars in
the humanities desire recognition in the form of citations to their work. Universities
consider citation metrics in hiring and promotion decisions, despite known problems in
their use for evaluating scholarly productivity.
In sum, the sciences have benefited from online publication in ways that the humanities have not (yet). Digital publication is faster, reaches a wider audience, and tends to increase the citation rate over print-only publication. As the proportion of print-only publication continues to decrease, those for whom it is their only venue risk reaching an ever smaller and more closed community with their scholarship. Curation of digital objects is a concern in all fields, and is a topic that has the attention of management in libraries and archives. Nonetheless, digital publication has become the norm, and those who cling to print publication as the only acceptable format for promotion and tenure may be left out of the academic mainstream.
Central to the notion of cyberinfrastructure and eScience is that data
have become
essential scholarly objects to be captured, mined, used, and reused. This trend has
been under way in science for many years, to varying degrees by field. As the technical and communications infrastructure became sufficiently robust to support large-scale data analysis and exchange, data became more valuable commodities. The availability of large volumes of data has enabled scientists to ask new questions, in new ways. Environmental scientists can conduct longitudinal analyses and make comparisons between locales using datasets compiled from multiple sources. Similarly, genome data offer analytical power at much finer granularity, and at larger scales.
While “data” is a less familiar term in the humanities, the availability of large
text, image, audio, and multi-media corpora has a similar result, enabling scholars in
multiple fields to interrogate sources in new ways through data mining
or cultural analytics.
Data mining is “the process of
identifying patterns in large sets of data . . . to uncover previously unknown, useful
knowledge.” Cultural analytics draws on “visual analytics,” “business analytics,”
and “web analytics,” and includes the
use of computer-based techniques for quantitative analysis and
interactive visualization
to identify patterns in large cultural data sets.
The increasing value of data raises the question, “what are data?” Definitions
associated with archival information systems offer a useful starting point: “A reinterpretable representation of information in a formalized manner
suitable for communication, interpretation, or processing. Examples of data include a
sequence of bits, a table of numbers, the characters on a page, the recording of
sounds made by a person speaking, or a moon rock specimen.”
Another way to think about data is by origin. In the context of cyberinfrastructure,
the four categories of data identified in an influential U.S. policy report
The need to address categories and levels of data is a pragmatic concern for managing
information. Yet data are often in the eye of the beholder. In Buckland’s terms, data
are “alleged evidence.”
Whether any given set of observations or records can be considered data depends on
context, even in the sciences. In our research on science and technology researchers in
the environmental sciences, we found differing views of data on concepts as basic as
temperature. Some of the computer science and engineering researchers interviewed said, roughly,
“‘The temperature is 98’ is low-value compared to, ‘the temperature of the surface,
measured by the infrared thermopile, model number XYZ, is 98.’
That means it is
measuring a proxy for a temperature, rather than being in contact with a probe, and
it is measuring from a distance. The accuracy is plus or minus .05 of a degree. I
[also] want to know that it was taken outside versus inside a controlled
environment, how long it had been in place, and the last time it was calibrated,
which might tell me whether it has drifted…”
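The researchers' point, that a bare number is worth far less than the same number with its context recorded, can be sketched as a small data structure. The Python below is an illustration only, not a format used in the study: the `Reading` class and all of its field names are hypothetical, and the deployment duration and calibration date are invented sample values.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    """One measurement plus the context the interviewees ask for.

    All field names are hypothetical, chosen to mirror the quotation.
    """
    value: float
    target: Optional[str] = None           # what was measured, e.g. "surface"
    instrument: Optional[str] = None       # sensor type and model
    in_contact: Optional[bool] = None      # False means a proxy measured at a distance
    accuracy: Optional[float] = None       # plus-or-minus bound, in degrees
    setting: Optional[str] = None          # outdoors vs. controlled environment
    days_in_place: Optional[int] = None    # how long the sensor had been deployed
    last_calibrated: Optional[str] = None  # ISO date of the last calibration

    def missing_context(self) -> List[str]:
        """Return the names of context fields that were never recorded."""
        return [f.name for f in fields(self)
                if f.name != "value" and getattr(self, f.name) is None]


# "The temperature is 98": a value with no documentation at all.
bare = Reading(value=98.0)

# The same value with the context described in the quotation.
documented = Reading(
    value=98.0,
    target="surface",
    instrument="infrared thermopile, model number XYZ",
    in_contact=False,
    accuracy=0.05,
    setting="outdoors",
    days_in_place=30,              # invented for illustration
    last_calibrated="2009-01-15",  # invented for illustration
)

print(bare.missing_context())
print(documented.missing_context())
```

Both readings carry the same number; only the second records enough about how it was produced for someone outside the original team to judge whether it can be reused.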
Studies of scientific practice, such as our work in embedded sensor networks, are
providing insights for the design of cyberinfrastructure and eScience. The social
studies of science and technology is a large and burgeoning field, with multiple
journals and book series, and a scholarly society established more than 40 years ago.
In the humanities, “one person’s data is
another’s theory”
(personal communication, June 22, 2009).
The sciences and humanities differ greatly in their sources of data and the degree of
control they have over those data.
Scientists, generally speaking, use data that were created by and for scientific purposes. They usually generate their own data, as in field observations or laboratory studies, or may acquire data from collaborators or other scientists. They may also acquire data from repositories in their field or from government sites, such as records of rainfall or river flow. Scientific documentation such as laboratory and field notebooks is sometimes considered to be data and sometimes metadata.
The social sciences occupy the middle position between the sciences and humanities on a
continuum of data sources and control. Those at the scientific end of the scale gather
their own observations, whether opinion polls, surveys, interviews, or field studies;
build models of human behavior; and conduct experiments in the laboratory or field.
Other social scientists rely on records collected by others, such as economic indicators
or demographic data from the census. Government and corporate records are often of
interest, as are the mass media. A number of important data repositories exist,
especially for large social surveys.
The humanities and arts are the least likely of the disciplines to generate their own data in the forms of observations, models, or experiments. Humanities scholars rely most heavily on records, whether newspapers, photographs, letters, diaries, books, articles; records of birth, death, marriage; records found in churches, courts, schools, and colleges; or maps. Any record of human experience can be a data source to a humanities scholar. Many of those sources are public while others are private. Cultural records may be found in libraries, archives, museums, or government agencies, under a complex mix of access rules. Some records are embargoed for a century or more. Some may be viewable only on site, whether in print or digital form. Data sources for humanities scholarship are growing in number and in variety, especially as more records are digitized and made available to the public.
Lynch’s dichotomy of raw material vs. interpretation has a number of implications for the digital humanities. Two are of concern here. One is that raw materials are more likely to be curated for the long term than are scholars’ interpretations of those materials. It is the nature of the humanities that sources are reinterpreted continually; what is new is the necessity of making explicit decisions about what survives for migration to new systems and formats. Second is the implication for control of intellectual property. Generally speaking, humanities scholars have far less control over the intellectual property rights of their sources — these raw materials — than do scientists, whose data usually are original observations or specimens. Typically, scholars can read, view, and cite cultural records, but often need explicit permission to reproduce them — and frequently need to pay a fee, especially in the case of images, to include them in reports of their research.
Intellectual property constraints on publishing of digital humanities scholarship
differ considerably from those that usually apply in other disciplines. Rights to reproduce
material remain closely tied to a print model, specified by number of copies printed and
by temporal rules on sale that are irrelevant to online publication. Even cultural
institutions as sophisticated as the Getty Trust encounter structural barriers to online
publication of humanities scholarship.
In sum, “what are data?” is an important question for the humanities. The answer
will determine what data are produced, how they are captured, and how they are curated
for reuse. Data sharing in the humanities is a complex set of issues (not that they
are simple in the sciences) that must be addressed. The humanities community needs a
critical mass of digital resources and needs common tools, services, and repositories if
it is to move beyond boutique projects.
Questions of “what are data?” are inextricable from the choice of research method.
Many of the sciences, especially those “big science” areas that require large-scale
instrumentation and produce vast volumes of data, are in transition to a data-driven
paradigm.
An important case example of the changing role of data in science is the Sloan Digital
Sky Survey, begun in 1992 by Jim Gray, Alex
Szalay, and others.
The Sloan Digital Sky Survey is significant for its openness, research productivity,
and community engagement, and because it instantiates the “value chain”
of scholarship.
Humanities scholars are more likely to find their data sources in the library — their traditional laboratory — than in the skies. While the library continues to be more central to scholarship in the humanities than it is to other fields, the characteristics of that relationship are changing. The use of physical space and of library staff has changed radically in the last two decades, largely in response to flat or declining university library budgets. Campus libraries have been consolidated in efforts to minimize the number of public service points to be staffed. Books, journals, and other physical materials have been moved to remote facilities, paged from the stacks upon request. Professional librarians, while a smaller proportion of library staffs, are turning their attention away from collection building — given the budget crises — and toward making the best use of the materials they have. The sciences are placing less demand on the physical library, allowing university libraries to reconfigure their spaces to benefit faculty and students in the humanities. Prime floor space previously devoted to card catalogs, journals, and book stacks is now available for groups to work together with physical and digital resources. More librarians have backgrounds in the humanities than in the sciences, and many are eager to partner with humanities scholars in building better tools and services for discovering, interpreting, and using scholarly content.
At most universities today, humanities scholars and students are the primary constituency for physical books, journals, and records. This community also makes the finest distinctions among editions, printings, and other variants — distinctions that are sometimes overlooked in the transition from print to digital form. For general reading, any edition may suffice, and some degradation in image quality may be an acceptable tradeoff for access to large corpora of books and journals. Scholars are much more dependent on metadata to identify and compare variants, and may require physical copies to examine characteristics of printing and paper, annotations, and other details.
Differences in the methods of using print and digital objects are being thrown into
sharp relief by mass digitization projects, most recently by the intense public debate
over Google’s book-scanning project. Concerns include not only the quality of scanning
and of metadata, but the possibility that libraries will discard physical copies of
books for which scans are available.
Digital humanities projects have yet to achieve the scale of data, audience, or
participation of the Sloan Digital Sky Survey. However, several long-lived digital
humanities projects have made important contributions to research methods and data
quality. Perseus is usually considered the first digital library in the humanities,
with planning begun in 1985 and services available by 1987. [...] fly-throughs,
audio typical of the time period (including spoken Latin), and
gladiator fights in the amphitheater using the latest computer graphics technology.
Perseus, Rome Reborn, and newer projects such as HyperCities integrate map layers from
Google Earth and other sources, which broadens their scope, audience, and
interoperability with other components of the scholarly information infrastructure.
In sum, choices of data sources, research methods, and research problems are
inextricably linked. Research methods in the sciences and in the humanities are becoming
more data-driven. The key to better
data — that is, data suitable for curation, reuse, and sharing — is capturing data as cleanly as possible and as early as possible in its life cycle. Agreements about data sources, structures, and formats will further the development of information infrastructure for digital humanities scholarship.
The size of collaborations is increasing in all fields, as measured by the number of
co-authors on papers, and at the fastest rate in the sciences.
As noted above, the new forms of scholarship characterized by eResearch are information- and data-intensive, distributed, collaborative, and multi-disciplinary. Collaborations, when effective, produce new knowledge that is greater than the sum of what the participating individuals could accomplish alone. In fields where collaboration is the norm, graduate students learn teamwork, whether in the laboratory, the field, or in group work on data collection and analysis. Science dissertations frequently are carved out of larger group projects, with the student identifying a research problem worthy of sustained investigation. Funding agencies in the sciences consider dissertations to be important products of awards to faculty investigators. Dissertations and theses are listed explicitly in National Science Foundation annual reports, for example.
While the digital humanities are increasingly collaborative, elsewhere in the humanities the image of the “lone scholar”
spending months or years alone in dusty archives, followed years later by the completion of a dissertation or monograph, still obtains. Students often are discouraged from conducting dissertation research under a faculty grant. Instead, they are expected to spend yet more time identifying funding for solo research. When one is groomed to work alone and does so for the years required to complete the doctorate, collaborative practices do not come easily.
Friedlander argues that for digital humanities to thrive, one component must be a set
of organizational topics and questions that do not bind research into legacy categories
and do invite interesting collaborations that will allow for creative
cross-fertilization of ideas and techniques and then spur new questions to be pursued by
colleagues and students.
An indicator of collaboration in the digital humanities community is the shift over the
last two decades from a focus on the audience — those who might learn or appreciate the
cultural content presented — to a focus on participation, in which scholars, students,
and the public can contribute content or conduct their own investigations.
Scholarly collaboration is much studied but little understood. Among the predictors of
success are the ability to achieve a common vocabulary and shared knowledge.
In sum, the digital humanities community could benefit from more collaborative partnerships within the field and between the humanities and disciplines such as computer science. Collaboration requires investment in listening skills, always being alert to nuanced differences in assumptions, theories, definitions, and methods. Lessons and skills learned from these partnerships can enhance the scholarship of all participants. Common technology platforms also are important to achieve interoperability and sustainability, and can be leveraged as investments across projects.
Constructing a critical mass of data sources for scholarship in any field presumes that people will share the products of their research. Because data and collaboration are so central to the methods of digital scholarship, data sharing is an important indicator of success for eResearch, although practices are somewhat different in the sciences and in the humanities.
The public nature of scholarship has deep roots. Notions of “open science”
date
back at least to Francis Bacon, with scientific findings being accepted only after peer
review. Scholars’ incentives to share their results include recognition and acceptance
of their work, which in turn drives hiring and promotion. In the sciences, authors may
be required to release data as a condition of publishing the papers on which they are
based. Funding agencies also are becoming more assertive about the release of data that
result from grants. However, publishing data is a far less mature practice than is
publishing books and articles. Releasing a major dataset rarely brings as much
recognition as releasing a major paper or book, but that balance is shifting, at least
in the sciences.
Scholars compete as well as collaborate, and thus have reasons not to share their data.
The first disincentive is the most universal across disciplines. The sciences and medicine are under the greatest pressure to release their data. In these disciplines the reward structure is adapting, and repositories and data structures exist. While humanities scholars are under less pressure to release their data and sources, they are contributing models, modules, and tools to participatory projects and shared collections.
Data documentation is an issue in all fields, but as the volume of data increases, consistent documentation becomes progressively more necessary. Once data are captured cleanly, sharing them later becomes less of a problem. Humanities scholars are acutely aware of the importance of metadata and finding aids in discovering sources. Metadata are equally important for data curation. Scholars understand the roles that documentation must play, while librarians and archivists have the expertise in documentation standards, practices, and technologies. Data documentation is thus an obvious area of partnership for humanities scholars and information professionals, together addressing the requirements for sustainability of research products.
The third disincentive — competitive advantage — is often addressed in the sciences
through embargoes, whereby the investigators have a set period of time (from a few
months to a few years, depending on the field) after the end of a grant before being
required to share their data. Embargoes serve two complementary purposes: they protect
the scholars’ control over data, and they ensure that others will have access to the
data within a reasonable time period. In the humanities, scholars are similarly
concerned about controlling access to the sources of their data, whether the Dead Sea
Scrolls or a set of manuscripts in a university archive, until they have published their
research. As data sources such as manuscripts and out-of-print books are digitized and
made publicly available, individual scholars will be less able to hoard their sources.
This effect of digitization on humanities scholarship has been little explored, but
could be profound. Open access to sources promotes participation and collaboration,
while the privacy rules of libraries and archives ensure that the identity of
individuals using specific sources is not revealed. Libraries and archives endeavor to
maintain privacy in the use of digital as well as print sources. However, when digital
content is controlled by commercial entities, protecting the privacy of users is a
greater concern.
Intellectual property, the fourth disincentive to share data and sources, is the most
intractable. The need to establish data sharing agreements in collaborative projects
arose early in eScience initiatives and is far from resolved.
In sum, the digital humanities encounter most of the same incentives and disincentives for sharing data and sources faced by the sciences and by other disciplines. The details play out somewhat differently, of course. The need to build critical masses of cultural sources and interoperable technology platforms affirms the need to broker agreements about data. If the infrastructure for the digital humanities errs toward openness, as is the norm in much of the sciences, the field will advance more quickly.
The last comparison between the sciences and humanities, but by no means the least, is
the role of information technology in learning “from K to grey.”
Several of the recommendations for advancing the state of cyberlearning have analogies
for advancing the state of digital humanities. One is the need to build a vibrant field
by promoting cross-disciplinary communities, publishing best practices, and recruiting
diverse talents. The Cyberlearning Task Force made a careful distinction between
cyberlearning as learning
Another analogous recommendation from the cyberlearning report is the need to instill a
“platform perspective.”
As noted earlier, the uptake of digital learning
modules has been limited by reliance on unique tools, proprietary software, and general
lack of interoperability. Unless products are easily adapted to new uses, others have
little incentive to invest in them. Both cyberinfrastructure and cyberlearning
initiatives are constructing common technical platforms that will improve the
sustainability and reuse of tools, services, and content. Some of these technical
platforms can be leveraged for digital humanities scholarship. Where capabilities are
lacking, the community can work in concert to construct them. Common platforms and
standards are among the goals of the Mellon-funded Bamboo project, for example (Project Bamboo 2009).
The Cyberlearning Task Force also recommended initiatives to enable students to use
data. By embedding data skills early in the science curriculum (in the primary grades
where feasible), students can learn to “think like scientists” early on. Hands-on science
approaches endeavor to engage students in “real” science, making it more interesting and
exciting than purely textbook approaches.
Lastly, the Task Force made a strong recommendation to the NSF to promote open
educational resources. Educational content resulting from cyberlearning grants should be
made available online with permission for unrestricted use and recombination. New
proposals for research and development in cyberlearning should include plans to make
their materials available and sustainable. These recommendations are relevant to all
disciplines. Open educational resources are growing rapidly in variety and number.
Openness matters for the digital humanities for reasons of interoperability, discovery,
usability, and reusability. Open resources — that is, those that can be used under
license or are in the public domain — are more malleable for research and for learning.
They can be mixed up and mashed up, and others can add value to them. Resources that
are available via open repositories also are more readily discovered than those posted
on local websites.
In sum, cyberlearning is important for the digital humanities for a number of reasons. One is the need to learn how to use and how to evaluate digital cultural materials early; graduate school is rather late. Second is the need to build common technology platforms for digital humanities scholarship, which will advance the field by leveraging efforts and resources and by increasing interoperability. Third is the value of open access to resources, which then become more malleable for research and for learning. Last is the need to build a strong community of digital humanities scholars, one that represents a much larger portion of the humanities than is the case today.
My student’s complaint,
A number of developments in cyberinfrastructure, eScience, and eResearch offer guidance to the digital humanities community in the quest to become a more established field with a broader base of infrastructure. One is in the area of publication practices. The humanities lag in digital publication of journals and books. Digital publishing, while far from a panacea, offers a number of advantages in the speed, scope, and format of communication. Scholarly print publishing is on the decline, and those who publish only in print form risk being isolated, talking only to each other. More digital-only venues are needed, where dynamic and visual work can be published in its vernacular form.
Another area is the dissemination and use of data. The humanities community should continue to clarify their choices of data and data sources, for these will drive what content is produced, captured, managed, and available for reuse. Questions of data are closely related to research methods, which also are evolving. Data-driven research methods are most valuable when they enable scholars to ask new questions in new ways.
Collaboration is essential in digital humanities projects. Few individuals have the range of expertise required to execute these projects alone. Humanists should continue to seek out complementary partners and encourage people to listen and learn from each other. Working together is also more likely to lead to common platforms and other means of reducing the overhead of technical projects.
In both the sciences and the humanities, incentives to share one’s writing are more obvious than are incentives to share one’s data and sources. In the sciences, data release is being encouraged (or required) by journals and funding agencies, and data-driven research methods can draw upon large corpora that grow as new observations are contributed. In the humanities, data release is less of an issue, but the availability of common technical platforms, tools, and services will promote the sharing of data and sources. The disincentives to share are complex in both the sciences and the humanities, but are being addressed. As the sciences learn how to share data and to share credit for their findings, the humanities can build upon their best practices. Intellectual property constraints remain a major stumbling block, and the considerations vary between the sciences and the humanities.
Opportunities for using cyberinfrastructure for learning exist in all disciplines. Distributed access to scholarly content, common technical platforms, and open resources will advance the humanities as well as the sciences.
In the process of developing the keynote presentation for the 2009 Digital Humanities Conference and in writing this paper, I consulted many individuals in the digital humanities community for their thoughts on the issues facing the field. From these discussions and my analyses above, five pressing problems emerged.
What constitute data in the humanities? What are data sources? How are they made, shared, valued, used, and reused? Answering these questions will enable the digital humanities community to be more articulate about its scope and its goals, and better positioned to identify its requirements for infrastructure.
The sciences have struggled with the question of infrastructure requirements for a decade or two already. They have
convened workshops and study panels, and launched funding initiatives addressed
specifically at defining, designing, and deploying the necessary infrastructure for
eScience. The humanities have tackled this question on a much smaller scale, leaving
them in the position of building upon the infrastructure constructed by and for other
disciplines. As Johanna Drucker put it so well, “them is us.”
Why is no one following digital humanities scholars around to understand their practices, in the way that scientists have been studied for the last several decades? This body of research has informed the design of scholarly infrastructure for the sciences, and is a central component of cyberinfrastructure and eScience initiatives. Given how rapidly scholarship in the humanities is evolving, it is fertile ground for behavioral research. The humanities community should invite more social scientists as research partners and should make themselves available as objects of study. In doing so, the community can learn more about itself and apply the lessons to the design of tools, services, policies, and infrastructure.
The shape of the humanities laboratory of the 21st century is a question of great concern to research libraries as well as to humanities scholars. The library continues to be a laboratory for the humanities, but not the only laboratory. Humanities scholars run computing laboratories and may work in distributed virtual environments for research and for learning. Humanists need to partner both with librarians and with the information technology planning and policy groups on their campuses. These communities urgently need to think together about the common challenges faced in a time of shrinking budgets for collections, physical space, staffing, and technology services.
For universities, the current economic recession is like no other. Public and private universities alike are re-examining core principles as budgets are slashed by 10% to 30% from one year to the next. Nothing is sacred, and “because it’s beautiful” is not a viable economic argument. The sciences have been remarkably effective at making the argument for their value in economic and political terms, whether to university administrations, legislatures, funding agencies, or the general public. While the humanities will have difficulty making parallel arguments in terms of economic competitiveness and medical advances, they have plenty to offer in terms of cultural understanding, writing and design skills, and critical thinking. Digital scholarship also promotes technical skills, which can be highlighted.
Digital projects require resources in the form of computers, software, staff, and content. Non-digital scholarship also costs money, of course, but more often in the form of travel and subsistence expenses for research in remote archives. Tradeoffs in travel and digitization can be made more explicit. The number of people who will use and benefit from any given project also can be made clearer. Investments in common technical platforms and standards that leverage resources across larger numbers of people and projects are easier to justify.
The digital humanities community has produced some beautiful work and made many advances in technology, design, and standards. Now is the moment to consolidate that knowledge and to articulate the community’s requirements and goals. Go forth and do great things...
I am grateful to the colleagues who provided thoughtful commentary on an earlier draft of this paper, including Murtha Baca, Gregory Britton, and Maureen Whalen of the Getty Trust; Johanna Drucker, Alberto Pepe, Todd Presner, and Katie Shilton of UCLA; Amy Friedlander, Council on Library and Information Resources; Bernard Frischer, University of Virginia; Alexander Parker, Harvard University; and two anonymous reviewers.
Many other people were very generous with their time in response to my inquiries about the past, present, and future of the digital humanities, including (in alphabetical order) William Dutton, Oxford Internet Institute; Neil Fraistat, University of Maryland; Richard Furuta, Texas A&M; Kimberly Garmoe, Anne Gilliland, UCLA; Charles Henry, Council on Library and Information Resources; Jason Hewitt, UCLA; Jieh Hsiang, National Taiwan University; Marina Jirotka, University of Oxford; Matthew Kirschenbaum, University of Maryland; Clifford Lynch, Coalition for Networked Information; Lev Manovich, University of California, San Diego; Ann O’Brien, Loughborough University; Susan Parker, UCLA; Allen Renear, University of Illinois; David Robey, University of Oxford; Ben Shneiderman, University of Maryland; Harold Short and Paul Spence, King’s College, London; Joshua Sternfeld, UCLA; Sarah Thomas, Bodleian Library; Sharon Traweek, UCLA; Anne Trefethen, University of Oxford; John Unsworth, University of Illinois; Sarah Watstein and Robert Winter, UCLA.