The Homer Multitext: Infrastructure and Applications

panel / roundtable
Authorship
  1. William Blackwell

    Furman University

  2. Neel Smith

    College of the Holy Cross


The Homer Multitext (HMT) seeks to create a library of
materials documenting the history of the Homeric tradition,
all of which will be freely available online. To support this
aim, Neel Smith and Christopher Blackwell, in collaboration
with other scholars, librarians, and technologists, have spent
five years defining a set of protocols for a distributed digital
library of “scholarly primitives”. The protocols aim to be as
technologically agnostic as possible and to afford automatic
discovery of services, the materials they serve, and the internal
structures and citation schemes.
In winter 2007, the initial contents of the Homer Multitext
were published online using the first generation of web-based
applications built atop implementations of these protocols.
In this presentation we will describe and demonstrate these
protocols and discuss how they contribute to the vision of the
Homer Multitext.
The TICI Stack
The technical editors of the Center for Hellenic Studies call the
set of technologies that implement these protocols the “TICI
Stack”, with TICI being an acronym for the four categories
of materials offered by the Homer Multitext Library: Texts,
Images, Collections, and Indices.
Texts
The Texts components of the HMT are electronic editions
and translations of ancient works: transcriptions of specific
manuscripts of the Iliad and Odyssey, editions of Homeric
fragments on papyrus, editions of ancient works that include
Homeric quotations, and texts of ancient commentaries on
Homeric poetry. These texts are marked up in TEI-Conformant
XML, using very minimal markup. Editorial information is
included in the markup where appropriate, following the
EpiDoc standard. More complex issues that might be handled
by internal markup (complex scholarly apparatus, for example,
or cross-referencing) are not included; these matters are handled
by means of stand-off markup, in indices or collections.
Access to texts is via implementations of the Canonical Text
Services Protocol (CTS), which is based on the Functional
Requirements for Bibliographic Records (FRBR). CTS extends
FRBR, however, by providing hierarchical access from the
broadest bibliographic level (“textgroup”, or “author”), down
through “work”, “edition/translation”, and into the contents of
the text itself. The CTS protocol can point to any citeable unit
of a text, or range of such units, and even more precisely to a
single character. For example, the CTS URN:
urn:cts:tlg0012.tlg001:chsA:10.1
points to Homer (tlg0012), the Iliad (tlg001), the edition
catalogued as “chsA” (in this case a transcription of the text
that appears on the manuscript Marcianus Graecus Z. 454
[= 822]), Book 10, Line 1. A CTS request, handed this URN,
would return:
Ἄλλοι μὲν παρὰ νηυσίν ἀριστῆες Παναχαιῶν
A more specific URN:
urn:cts:tlg0012.tlg001:chsA:10.1:ρ[1]
points to the second Greek letter “rho” in this edition of
Iliad 10.1 (the bracketed index counts from zero), which appears
in the word ἀριστῆες.
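The structure of such a URN can be illustrated with a minimal Python sketch. It handles only the colon-separated form shown in the two examples above and is not an implementation of the CTS protocol itself; the names CtsUrn, parse_cts_urn, and word_for_subreference are hypothetical, and the zero-based reading of the bracketed index is inferred from the ρ[1] example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Iliad 10.1 as returned for urn:cts:tlg0012.tlg001:chsA:10.1
LINE = "Ἄλλοι μὲν παρὰ νηυσίν ἀριστῆες Παναχαιῶν"


@dataclass
class CtsUrn:
    textgroup: str                   # e.g. tlg0012 (Homer)
    work: str                        # e.g. tlg001 (Iliad)
    edition: str                     # e.g. chsA
    passage: str                     # e.g. 10.1 (book.line)
    sub_char: Optional[str] = None   # character named by the subreference
    sub_index: Optional[int] = None  # which occurrence (assumed zero-based)


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a URN of the form urn:cts:GROUP.WORK:EDITION:PASSAGE[:CHAR[N]]."""
    parts = urn.split(":")
    if len(parts) < 5 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN in the expected form: {urn}")
    textgroup, _, work = parts[2].partition(".")
    sub_char = sub_index = None
    if len(parts) > 5:
        match = re.fullmatch(r"(.)\[(\d+)\]", parts[5])
        if match is None:
            raise ValueError(f"unrecognized subreference: {parts[5]}")
        sub_char, sub_index = match.group(1), int(match.group(2))
    return CtsUrn(textgroup, work, parts[3], parts[4], sub_char, sub_index)


def word_for_subreference(line: str, char: str, index: int) -> str:
    """Return the word that contains the index-th occurrence of char."""
    seen = -1
    for word in line.split():
        for c in word:
            if c == char:
                seen += 1
                if seen == index:
                    return word
    raise ValueError(f"{char}[{index}] does not occur in the line")


urn = parse_cts_urn("urn:cts:tlg0012.tlg001:chsA:10.1:ρ[1]")
word = word_for_subreference(LINE, urn.sub_char, urn.sub_index)
```

Here the first rho falls in παρὰ and the second in ἀριστῆες, which is the word the subreference selects.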
Images
The HMT currently offers over 2000 high resolution images
of the folios of three Homeric manuscripts, captured in the
spring of 2007 at the Biblioteca Nazionale Marciana in Venice.
These images are accessible online through a basic directory listing,
but also through a web-based application, described
below. To relate these images with bibliographic data, technical
metadata, and associated passages of texts, the HMT relies on
Collections and Indices.
Collections
A collection is a group of like objects, in no particular order.
For the CTS, one example is a Collection of images, with each
element in the collection represented by an XML fragment
that records an id, pointing to a digital image file, and metadata
for that image. Another example is a Collection of manuscript
folios. Each member of this Collection includes the name of the
manuscript, the folio’s enumeration and side (“12-recto”), and
one or more CTS URNs that identify what text appears on
the folio. Collections are exposed via the Collection Service,
which allows discovery of the fields of a particular collection,
querying on those fields, and retrieval of XML fragments. A
final example would be a lexicon of Homeric Greek, offered as a
collection of entries.
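The three operations of the Collection Service (discovering fields, querying on them, and retrieving XML fragments) can be sketched in Python over a small in-memory collection of folios. This is only an illustration under stated assumptions: the element names (collection, member, manuscript, folio, text), the second member, and its URN are invented here, and the functions are hypothetical rather than the service's actual interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection of manuscript folios. Element names and the
# second member (including its URN) are invented for illustration.
FOLIOS_XML = """
<collection name="folios">
  <member>
    <manuscript>msA</manuscript>
    <folio>12-recto</folio>
    <text>urn:cts:greekLit:tlg0012.tlg001:1.1</text>
  </member>
  <member>
    <manuscript>msA</manuscript>
    <folio>12-verso</folio>
    <text>urn:cts:greekLit:tlg0012.tlg001:1.26</text>
  </member>
</collection>
"""


def field_names(collection):
    """Discover the fields used by the members of a collection."""
    return sorted({child.tag for member in collection for child in member})


def query(collection, field, value):
    """Return the members whose given field has exactly the given value."""
    return [
        member
        for member in collection
        if any(child.text == value for child in member.findall(field))
    ]


def fragment(member):
    """Serialize one member as an XML fragment."""
    return ET.tostring(member, encoding="unicode").strip()


collection = ET.fromstring(FOLIOS_XML)
fields = field_names(collection)
hits = query(collection, "folio", "12-recto")
```

Because members are unordered, queries select by field value rather than by position.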
Indices
Indices are simple relations between two elements, a reference
and a value. They do much of the work of binding together
TICI applications.
The Manuscript Browser
The first data published by the HMT Library were the images
of manuscripts from Venice. These are exposed via the
Manuscript Browser application, the first published application
built on the TICI Stack.
http://chs75.harvard.edu/manuscripts
This application allows users to browse images based either
on the manuscript and folio-number, or (more usefully) by
asking for a particular manuscript, and then for a particular
book and line of the Iliad.
Users are then taken to a page for that folio. Online interaction
with the high-resolution images is through the Google Maps
API, an AJAX implementation that allows very responsive
panning and zooming, without requiring users to download
large files.
From the default view, users can select any other views of that
page (details, ultraviolet photographs, etc.), and can choose to
download one of four versions of the image they are viewing:
an uncompressed TIFF, or one of three sizes of JPEG.
This application draws on a CTS service, two indices, and
a collection of images. It queries a CTS service to retrieve
bibliographic information about manuscripts whose contents
are online--the CTS protocol is not limited to electronic texts,
but can deliver information about printed editions as well.
When a user requests a particular book and line of the Iliad, it
queries one index whose data looks like this:
<record>
<ref>urn:cts:greekLit:tlg0012.tlg001:1.1</ref>
<value>msA-12r</value>
</record>
Here the <ref> element contains a URN that, in this case,
points to Book 1, Line 1 of the Iliad, and the <value> element
identifies folio 12, recto, of “manuscript A”. This allows the
application to query a collection of manuscript folios and to
determine which images are available for this folio.
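The chain from passage to folio to images can be sketched in Python as follows. The record is the one quoted above; the enclosing index element, the image identifiers, and the function names are assumptions made for illustration, not part of the HMT services.

```python
import xml.etree.ElementTree as ET

# The record quoted in the text, wrapped in a hypothetical <index> element.
# The URN is deliberately left line-wrapped to show whitespace handling.
INDEX_XML = """
<index>
  <record>
    <ref>urn:cts:greekLit:
    tlg0012.tlg001:1.1</ref>
    <value>msA-12r</value>
  </record>
</index>
"""

# Hypothetical image collection keyed by folio identifier; the image
# identifiers are invented for illustration.
IMAGES_BY_FOLIO = {
    "msA-12r": ["msA-12r-natural-light", "msA-12r-ultraviolet"],
}


def load_index(xml_text):
    """Build a reference -> value mapping from index records."""
    index = {}
    for record in ET.fromstring(xml_text).iter("record"):
        # Remove any whitespace introduced by wrapping inside the URN.
        ref = "".join(record.findtext("ref", default="").split())
        index[ref] = record.findtext("value", default="").strip()
    return index


def images_for_passage(urn, index, images):
    """Follow passage -> folio -> images; return [] if a link is missing."""
    folio = index.get(urn)
    if folio is None:
        return []
    return images.get(folio, [])


index = load_index(INDEX_XML)
found = images_for_passage(
    "urn:cts:greekLit:tlg0012.tlg001:1.1", index, IMAGES_BY_FOLIO
)
```

Keeping the index as a bare pair of reference and value means the same lookup works for any two kinds of object that need to be related.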
The display of an image via the Google Maps API requires it to
be rendered into many tiles--the number depends on the size
and resolution of the image, and the degree of zooming desired.
For the images from Venice, we produced over 3,000,000
image tiles. This work was made possible by the Google Maps
Image Cutter software from the UCL Centre for Advanced
Spatial Analysis, who kindly added batch-processing capability
to their application at our request.
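To see how the tile count grows with image size, here is a rough Python estimate. It assumes 256-pixel square tiles and a pyramid in which each zoom level halves the resolution until the image fits in one tile; the Image Cutter's actual tiling may differ, and the 6000 by 8000 pixel dimensions are a made-up example, not the size of the Venice images.

```python
import math


def tile_count(width, height, tile=256):
    """Estimate tiles for a full zoom pyramid over one image.

    Assumes square tiles and halving of resolution per zoom level.
    """
    total = 0
    w, h = width, height
    while True:
        total += math.ceil(w / tile) * math.ceil(h / tile)
        if w <= tile and h <= tile:
            break  # the whole image now fits in a single tile
        w, h = math.ceil(w / 2), math.ceil(h / 2)
    return total


# Hypothetical 6000 x 8000 pixel image.
example = tile_count(6000, 8000)
```

Under these assumptions the example image needs 1025 tiles, about three quarters of them at the highest zoom level, so a few thousand such images would reach the millions of tiles reported above.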
Next Steps
Currently under development, as of November 2007, is the
TICI Reader, a web-based application to bring texts together
with indexed information and collections. This is intended
to coincide with the completion of six editions of the Iliad,
electronic transcriptions of important medieval manuscripts of
the Iliad, and a new electronic edition of the medieval scholarly
notes, the scholia, that appear on these manuscripts.
Availability
All of the work of the HMT is intended to be open and
accessible. The images and texts are licensed under a Creative
Commons License, and all tools are based on open standards
and implemented in open source technologies. Work is
progressing, and more is planned, on translations of as many
of these materials as possible, to ensure that they reach the
widest possible audience.
Bibliography
Canonical Text Services Protocol: http://chs75.harvard.edu/projects/diginc/techpub/cts
Creative Commons: http://creativecommons.org/
EpiDoc: Epigraphic Documents in TEI XML: http://epidoc.sourceforge.net/
Google Maps API: http://www.google.com/apis/maps/
IFLA Study Group on the Functional Requirements for
Bibliographic Records. “Functional Requirements for
Bibliographic Records: final report.” München: K. G. Saur,
1998: http://www.ifla.org/VII/s13/frbr/frbr.pdf
TEI P4: Guidelines for Electronic Text Encoding and Interchange.
Edited by C. M. Sperberg-McQueen and Lou Burnard. The TEI
Consortium: 2001, 2002, 2004.

Conference Info

Complete

ADHO - 2008

Hosted at University of Oulu

Oulu, Finland

June 25, 2008 - June 29, 2008


Conference website: http://www.ekl.oulu.fi/dh2008/

Series: ADHO (3)

Organizers: ADHO
