Edition Visualization Technology: a simple tool to publish digital editions and digital facsimiles

poster / demo / art installation
Authorship
  1. 1. Raffaele Masotti

    Università di Pisa

  2. 2. Julia Kenny

    Università di Pisa

  3. 3. Chiara Di Pietro

    Università di Pisa

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

The Digital Vercelli Book project1 aims at creating a digital edition of the Vercelli Book manuscript (MS CXVII, Biblioteca Capitolare di Vercelli, Italy), a fundamental document for the Old English language and literature area of studies. After the transcription of all the texts was completed using the TEI schemas2, the researchers lacked a tool that would allow to publish a diplomatic edition of the former together with the digitized manuscript images.

Many tools have been created in order to publish digital editions based on XML-encoded texts and meet the various needs of users. Some software tools (such as TEI Boilerplate) allow to publish TEI XML files directly in modern browsers. Even though this method is very light and user friendly, it is both unsuitable for publishing very large files and limited to XSLT 1.0; the typical output is a single web page for each XML document, which is not appropriate for the publication of image-based editions.

This led us to the design and implementation of EVT (Edition Visualization Technology), a lightweight software for that specific purpose. What started as a project, whose goal was the digital edition of a specific manuscript, has grown well beyond that, to the point of being available and usable as a general purpose publishing tool.

EVT can be used to create image-based web editions with different edition levels starting from a multi level encoded text. It is built on open and standard web technologies, such as HTML, CSS and Javascript, to ensure that it will be working on all the most recent web browsers. During the development of the software we have constantly tested it on different cases of study (for instance the NZTEC corpus), in order to verify that the code was actually compatible with the different kinds of TEI P5 encoding.

The general architecture of the software is modular, so that any component can be easily upgraded or replaced. To overcome the limitations hinted above the data processing is carried out by means of a static transformations with XSLT 2.0. EVT's processing system uses a collection of stylesheets to divide the XML file into smaller portions (each corresponding to individual pages of the manuscript), and for each of these parts it creates as many output files as requested by the file settings (for instance, diplomatic and diplomatic-interpretative). On the image side, the system offers a ready-to-use set of instruments (such as a magnifying lens, a general zoom, hot spots) to fully take advantage of manuscripts’ scans. One of the most important tools available in EVT is the image-text link: if the XML files make use of the <facsimile> element and provide the coordinates that identify the lines in manuscript scans, the EVT building system will automatically create clickable areas on the image, which will be handled thanks to CSS and Javascript interaction.

Fig. 1: Main view

The most important feature is that the code is working as a client-only infrastructure, so that the user may immediately make use of it, without requiring any server software installation. This is apparently a limitation, since a client-server architecture would offer many more features, but thanks to the processing done with the XSLT 2.0 processor in the initial phase of the building process the system is also ready to perform complex operations (such as textual searches) which generally rely on interpreted languages and a server back-end.

What has been described up to this point constitutes the core of our project, that is a complex and modular collection of XSLT scripts coupled with text and image browsing tools. The modules are designed in such a way that the scholar will be able to add his own stylesheets to manage the different levels of the edition, and this will not influence the other parts of the system, f.i. HTML generation. Applying the XSLT scripts usually requires the use of specific editors (such as Oxygen), which may be not the user’s favorite tool and add one more step to the process. This is why we decided to create EVT Interface: a Java user interface that allows to avoid loading the XML files in an editor and makes directly use of the open source versions of the Saxon processor to apply XSLT modules to the TEI XML files.

Our main objective is to create a flexible, modular, stand-alone package which is also freely redistributable.

We are now facing several research questions (UI evolution, search engine, embedded transcription, critical edition) that we hope to address during the next few months.

References
1. Vercelli Book Digitale. vbd.humnet.unipi.it/ (accessed on March 2014).

2. Burnard, L., and S. Bauman (eds.) TEI P5: Guidelines for Electronic Text Encoding and Interchange. 2.6.0.www.tei-c.org/P5/.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

Complete

ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

XML available from https://github.com/elliewix/DHAnalysis (needs to replace plaintext)

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO