Topic Modeling Genre: An Exploration of French Classical and Enlightenment Drama
Christof Schöch, University of Würzburg, Germany
Abstract
[en]
The concept of literary genre is highly complex: not only are different genres
frequently defined on several, and not necessarily the same, levels of description,
but the consideration of genres as cognitive, social, or scholarly constructs with a
rich history further complicates the matter. This contribution focuses on thematic
aspects of genre with a quantitative approach, namely Topic Modeling. Topic Modeling
has proven useful for discovering thematic patterns and trends in large collections
of texts, with a view to classifying or browsing them on the basis of their dominant
themes. It has rarely, if ever, been applied to collections of dramatic texts.
In this contribution, Topic Modeling is used to analyze a collection of French Drama of
the Classical Age and the Enlightenment. The general aim of this contribution is to
discover what semantic types of topics are found in this collection, whether different
dramatic subgenres have distinctive dominant topics and plot-related topic patterns,
and, conversely, to what extent clustering methods based on topic scores per play
produce groupings of texts that agree with more conventional genre distinctions. This
contribution shows that interesting topic patterns can be detected which provide new
insights into the thematic, subgenre-related structure of French drama as well as into the
history of French drama of the Classical Age and the Enlightenment.
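The clustering step described in the abstract — grouping plays by their per-play topic scores — can be sketched in miniature. The play names and topic scores below are invented for illustration and are not taken from the corpus; a simple nearest-neighbour comparison by cosine similarity stands in for a full clustering method.

```python
# Illustrative sketch: group plays by their per-play topic scores using
# cosine similarity. Scores, play names, and topic labels are invented.
import math

topic_scores = {
    "tragedy_1": [0.7, 0.2, 0.1],   # hypothetical scores for three topics
    "tragedy_2": [0.6, 0.3, 0.1],
    "comedy_1":  [0.1, 0.2, 0.7],
}

def cosine(u, v):
    """Cosine similarity between two topic-score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest(play):
    """The play with the most similar topic profile — a minimal stand-in for clustering."""
    others = [p for p in topic_scores if p != play]
    return max(others, key=lambda p: cosine(topic_scores[play], topic_scores[p]))

closest = nearest("tragedy_1")  # on these toy scores, the other tragedy
```

On such toy data the two tragedies group together and the comedy stands apart, which is the kind of agreement (or disagreement) with conventional genre distinctions the study examines at corpus scale.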
Reconstructing a website’s lost past: Methodological issues concerning the history of Unibo.it
Federico Nanni, University of Mannheim
Abstract
[en]
This paper describes how to deal with the scarcity of born-digital primary
sources while retrieving materials on the recent past of an academic
institution. The case study is an analysis of the first 25 years online of the
University of Bologna. The focus of this work is primarily methodological:
several different issues are presented, starting with the fact that the
University of Bologna website has been excluded for thirteen years from the
Internet Archive's Wayback Machine, and possible solutions are proposed and
applied. Moreover, this study aims at highlighting how web materials could give
us new and distinct insights into the recent past of academic institutions,
thereby becoming the starting point for several new studies.
Some principles for making collaborative scholarly editions in digital form
Peter Robinson, University of Saskatchewan
Abstract
[en]
“Textual Communities” is a new system for managing and performing all aspects of
an online collaborative scholarly editing project. It permits mounting of
document images and offers page-by-page transcription and display, with the
facility for project leaders to recruit and manage transcribers and other
contributors, allocating and reviewing transcription work as it is done. Most
distinctively, Textual Communities is built on a comprehensive model of
scholarly editing, enabling both “document” (page-by-page) and “work”
(intellectual structure, or “entity”) views of the texts edited. Accordingly,
multiple texts of a single work, or part of a work (an entity) may be extracted
and compared, using an embedded installation of CollateX. While completely
conformant with Text Encoding Initiative guidelines, Textual Communities goes
beyond TEI and XML in its ability to handle multiple overlapping hierarchies
within texts. This paper will outline the thinking behind the development of
Textual Communities, and show examples of its use by several major projects.
A Historical Geographic Information System (HGIS) of Nubia Based on the William J. Bankes Archive (1815-1822)
Daniele Salvoldi, Dahlem Research School, Freie Universität Berlin
Abstract
[en]
The William J. Bankes Archive, Dorchester, is an impressive collection of
original material concerning the archaeological, anthropological and natural
heritage of Nubia and was amassed in the years 1815-1822. In the last two
hundred years, many geo-human factors caused radical changes in the region. In a
landscape almost untouched for centuries, the signs of the interactions between
the ancient human communities and the natural environment were much clearer in
Bankes’ times than now. The digital humanities offer powerful tools to manage and
visualize large amounts of data, and GIS in particular is an effective form of
relational database in which every item of data has a position on the earth. This
paper presents the methodology and the preliminary results of a research project
that aims at a draft reconstruction of ancient Nubia based on the Bankes
Archive. Archaeological, historical, natural history and ethnographic
information extracted from the documents will be georeferenced in the GIS.
Original maps, landscape views and epigraphic copies will also be made available
on-line.
Mining for characterising patterns in literature using correspondence analysis: an experiment on French novels
Francesca Frontini, Université Paul-Valéry Montpellier 3 - Praxiling UMR 5267 CNRS - UPVM3; Mohamed Amine Boukhaled, Laboratoire d'Informatique de Paris 6 (LIP6 UPMC) / Labex OBVIL; Jean-Gabriel Ganascia, Laboratoire d'Informatique de Paris 6 (LIP6 UPMC) / Labex OBVIL
Abstract
[en]
This paper presents and describes a bottom-up methodology for the detection of
stylistic traits in the syntax of literary texts. The extraction of syntactic
patterns is performed blindly by a sequential pattern mining algorithm, while
the identification of significant and interesting features is performed at a
later stage by using correspondence analysis and by ranking patterns by
contribution.
Exploratory Search Through Visual Analysis of Topic Models
Patrick Jähnichen, Machine Learning Group, Humboldt-Universität zu Berlin; Patrick Oesterling, Image and Signal Processing Group, Leipzig University, Germany; Gerhard Heyer, Natural Language Processing Group, Leipzig University, Germany; Tom Liebmann, Image and Signal Processing Group, Leipzig University, Germany; Gerik Scheuermann, Image and Signal Processing Group, Leipzig University, Germany; Christoph Kuras, Natural Language Processing Group, Leipzig University, Germany
Abstract
[en]
This paper addresses exploratory search in large collections of historical texts.
By way of example, we apply our method to a collection of documents comprising
dossiers of the former East-German Ministry for State Security, and classical
texts. The basis of our approach is topic models, a class of algorithms that
define and infer themes pervading the corpus as probability distributions over
the vocabulary. Our topic-centered visual metaphor supports exploring the
corpus along an intuitive methodology: first, determine a topic of interest;
second, suggest documents that contain the topic in "sufficient" proportion;
and third, browse iteratively through related topics and documents. Our main
focus lies on providing a suitable bird's-eye view of the data to facilitate
an in-depth analysis of the topics it contains.
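The three-step methodology just described can be sketched in miniature. The document names, topic labels, and proportions below are invented toy values, not output of the actual model or corpus, and `documents_for_topic` and `related_topics` are hypothetical helpers illustrating steps two and three.

```python
# Toy sketch of the topic-centered browse loop: pick a topic, list documents
# whose inferred proportion of that topic exceeds a threshold, then rank
# related topics by how strongly they co-occur in those documents.
# The doc-topic proportions below are invented, not model output.
doc_topics = {
    "doc_a": {"surveillance": 0.6, "travel": 0.3, "family": 0.1},
    "doc_b": {"surveillance": 0.5, "border": 0.4, "family": 0.1},
    "doc_c": {"poetry": 0.7, "family": 0.3},
}

def documents_for_topic(topic, threshold=0.4):
    """Step 2: suggest documents containing the topic in sufficient proportion."""
    return sorted(d for d, t in doc_topics.items() if t.get(topic, 0.0) >= threshold)

def related_topics(topic, threshold=0.4):
    """Step 3: rank other topics by their total mass in the suggested documents."""
    mass = {}
    for d in documents_for_topic(topic, threshold):
        for other, p in doc_topics[d].items():
            if other != topic:
                mass[other] = mass.get(other, 0.0) + p
    return sorted(mass, key=mass.get, reverse=True)

docs = documents_for_topic("surveillance")   # documents dominated by the topic
related = related_topics("surveillance")     # candidate topics to browse next
```

Iterating these two steps — follow a related topic, list its documents, repeat — is the browse loop the visual metaphor supports at scale.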
Diachronic trends in Homeric translations
Yuri Bizzoni, University of Gothenburg; Marianne Reboul, Université Paris-Sorbonne; Angelo Del Grosso, Institute for Computational Linguistics A. Zampolli
Abstract
[en]
In this paper we present a tool we developed for translation studies and use it
to diachronically compare various French translations of the Odyssey.
This field of study is part of the more general “Classical Receptions”
studies, which analyse the influence and adaptation of classical texts in
modern and contemporary literature, theatre, cinema, and many other artistic
fields. While Greek texts have been analysed by scholars for more than two
thousand years, research on classical translations is not yet a well-established
subject. In recent years, however, this theme has attracted growing interest in
the academic community.
We developed a program that can align textual sequences (defined as groups of
words delimited by a specified grammatical pivot, in our case proper nouns)
without the need for prior training. We obtained alignments for many different
kinds of translations, including free translations, a problem that, until
recent studies, was generally not addressed by textual aligners. While
other programs have an upper bound for one-to-many alignments (for example,
a maximum of four translated elements aligned to the same original element), this
algorithm allows an indefinite number of alignments, both for the source
sequences and the target ones. The aligner is based on an implementation of
the Needleman-Wunsch algorithm and on a string-based similarity approach to
textual segments. The aligner uses proper names as anchor words, as they
are a relatively stable feature across different translations and tend to be
similar in several languages.
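A minimal sketch of Needleman-Wunsch global alignment, applied here to short sequences of anchor words rather than full textual segments. The scoring parameters and example names are illustrative assumptions, not the authors' implementation (which additionally uses string-based similarity between segments and supports unbounded one-to-many alignments).

```python
# Minimal Needleman-Wunsch global alignment over token sequences.
# Scores and example tokens are illustrative, not the authors' settings.
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    m, n = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = score[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            score[i][j] = max(diag, score[i-1][j] + gap, score[i][j-1] + gap)
    # Trace back to recover the aligned pairs (None marks a gap).
    pairs, i, j = [], m, n
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i-1] == b[j-1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i-1][j-1] + sub:
            pairs.append((a[i-1], b[j-1])); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i-1][j] + gap:
            pairs.append((a[i-1], None)); i -= 1
        else:
            pairs.append((None, b[j-1])); j -= 1
    return score[m][n], pairs[::-1]

# Align two hypothetical sequences of anchor words (proper nouns)
# extracted from two translations.
src = ["Ulysse", "Calypso", "Ithaque"]
tgt = ["Ulysse", "Ithaque"]
best, aligned = needleman_wunsch(src, tgt)
# aligned pairs the shared names and leaves "Calypso" against a gap
```

In the sketch, a name with no counterpart is aligned to a gap rather than forced onto an unrelated element, which is what makes the approach tolerant of free translations that omit or expand material.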
Thanks to the alignments obtained using the program, we can explore translations
in a number of ways. We will illustrate the creation of a graphical interface to
visualize French Homeric translations.
With our tool, it is possible to highlight aligned portions of texts and show
their immediate differences or similarities, both in meaning and in syntactic
distribution.
We will show some resulting syntactic analyses carried out on a small sample of
texts, taken from a corpus of twenty-seven unabridged French translations of the
Odyssey and explore how the study of diachronic translations through algorithms
of computational linguistics can produce interesting results for literary and
linguistic studies.
Comparing Disciplinary Patterns: Exploring the Humanities through the Lens of Scholarly Communication
Daniel Burckhardt, Humboldt-Universität zu Berlin
Abstract
[en]
For the past fifteen years, scholarly communication networks such as H-Soz-Kult –
the German Information Service for Historians – and H-ArtHist – a specialized
discussion and information network for art history based in Germany with an
international reach – have been steadily publishing conference announcements and
reports. Since both services were born digital, starting with the listserv
infrastructure of the Michigan-based H-Net and later supplemented by
database-driven web sites, the archives are easily accessible by electronic
means. The aim of this paper is to demonstrate that the archives of scholarly
communication provide a suitable basis for conducting an assessment of broad
fields such as German historians or German art history, with relatively low
technical effort. For the initial analysis of H-Soz-Kult (first presented at
the Historical Network Research Conference 2014 in Ghent), editorial practices
facilitated the automated extraction of the speakers’ names as a key feature.
But even in cases where no such special markup has been applied, freely
available Web services such as AlchemyAPI (AlchemyAPI Entity Extraction,
http://www.alchemyapi.com/products/alchemylanguage/entity-extraction) provide
methods that can be used to achieve comparable results. (Thanks to Victoria H.
Scott for suggesting analyzing H-ArtHist’s conference announcements in similar
ways.)
Friedrich Kittler's Digital Legacy – PART I - Challenges, Insights and Problem-Solving Approaches in the Editing of Complex Digital Data Collections
Jürgen Enge, FHNW Academy of Art and Design, Basel; Heinz Werner Kramski, German Literature Archive Marbach
Abstract
[en]
Three years after his death Friedrich Kittler’s impact on the Humanities and
Media Studies remains a topic of interest to scholars worldwide. The
intellectual challenges presented by his theoretical work, however, are now
complemented by the practical and archival difficulties of dealing with his
personal digital legacy. How are we to preserve, survey and index the complex
data collection Kittler bequeathed to the German Literature Archive in Marbach
in the shape of old computers and hard drives? How are the Digital Humanities to
handle the archive of one of its most important forefathers? To address these
questions, this paper will first focus on the estate itself and then describe
the design and development of the “Indexer”, a tool
for the initial indexing of technical information. Two especially problematic
aspects are the sheer mass of files (more than 1.5 million) and Kittler's
idiosyncratic organization, both of which serve to make conventional content
evaluation very difficult. Here, the “Indexer” has
proven to be a powerful tool. Finally, a case study using the indexer's web
interface will enable us to tackle the question: When and to what purpose did
Friedrich Kittler acquire a computer?
Friedrich Kittler's Digital Legacy – PART II - Friedrich Kittler and the Digital Humanities: Forerunner, Godfather, Object of Research. An Indexer Model Research
Susanne Holl, Berlin, Germany
Abstract
[en]
Three years after his death Friedrich Kittler’s impact on the Humanities and
Media Studies remains a topic of interest to scholars worldwide. The
intellectual challenges presented by his theoretical work, however, are now
complemented by the practical and archival difficulties of dealing with his
personal digital legacy. How are we to preserve, survey and index the complex
data collection Kittler bequeathed to the German Literature Archive in Marbach
in the shape of old computers and hard drives? How are the Digital Humanities to
handle the archive of one of its most important forefathers? To address these
questions, the presentation will first focus on the estate itself and then
describe the design and development of the "Indexer", a tool for the initial
indexing of technical information. Two especially problematic aspects are the
sheer mass of files (more than 1.5 million) and Kittler's idiosyncratic
organization, both of which serve to make conventional content evaluation very
difficult. Here, the "Indexer" has proven to be a powerful tool. Finally, a case
study using the indexer's web interface will enable us to tackle the question:
When and to what purpose did Friedrich Kittler acquire a computer?
Automated Pattern Analysis in Gesture Research: Similarity Measuring in 3D Motion Capture Models of Communicative Action
Daniel Schüller, Natural Media Lab, Human Technology Centre, RWTH Aachen University; Christian Beecks, University of Münster; Marwan Hassani, Data Management and Exploration Group, RWTH Aachen University; Jennifer Hinnell, Department of Linguistics, University of Alberta; Bela Brenger, Natural Media Lab, Human Technology Centre, RWTH Aachen University; Thomas Seidl, Ludwig Maximilian University of Munich; Irene Mittelberg, Natural Media Lab, Human Technology Centre, RWTH Aachen University
Abstract
[en]
The question of how to model similarity between gestures plays an important role in
current studies in the domain of human communication. Most research into recurrent
patterns in co-verbal gestures – manual communicative movements emerging
spontaneously during conversation – is driven by qualitative analyses relying on
observational comparisons between gestures. Because these kinds of gestures are
not bound to well-formedness conditions, however, we propose a quantitative
approach consisting of a distance-based similarity model for gestures
recorded and represented in motion capture data streams. To this end, we model
gestures by flexible feature representations, namely gesture signatures, which are
then compared via signature-based distance functions such as the Earth Mover's
Distance and the Signature Quadratic Form Distance. Experiments on real
conversational motion capture data demonstrate the suitability of the proposed
approaches in terms of accuracy and efficiency. Our contribution to gesture
similarity research and gesture data analysis allows for new quantitative methods of
identifying patterns of gestural movements in human face-to-face interaction, i.e.,
in complex multimodal data sets.
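To illustrate the "mass transport" idea behind the Earth Mover's Distance, here is a deliberately simplified one-dimensional case: for two equally weighted sample sets of the same size, the Wasserstein-1 distance reduces to matching sorted values. Real gesture signatures are sets of weighted feature centroids in higher-dimensional feature spaces, so this is only a conceptual sketch, not the authors' method; the feature values are invented.

```python
# Simplified 1-D stand-in for a signature-based distance: the Wasserstein-1
# (Earth Mover's) distance between two equally weighted sample sets of the
# same size reduces to pairing off sorted values and averaging the "transport"
# each value needs.
def emd_1d(xs, ys):
    assert len(xs) == len(ys), "equal-size, equal-weight case only"
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

# Two hypothetical streams of a scalar gesture feature (e.g. wrist height).
g1 = [0.1, 0.4, 0.4, 0.9]
g2 = [0.2, 0.5, 0.5, 1.0]
d = emd_1d(g1, g2)  # every value must move by 0.1, so the distance is 0.1
```

The appeal of such transport-based distances for gesture data is that they compare whole distributions of movement features, remaining meaningful even when two gestures differ in timing or exact trajectory.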