The German Hamlets: An Advanced Text Technological Application

poster / demo / art installation
Authorship
  1. Benjamin Birkenhake

    University of Bielefeld

  2. Andreas Witt

    Universität Tübingen (University of Tubingen / Tuebingen)

Work text

The highly complex editorial history of Shakespeare’s Hamlet
and its many translations into German form an interesting
corpus for demonstrating some of the major advantages of current
text technologies. In the following we present the
corpus of four English editions and several German translations
of the play, which we have gathered, annotated and
cross-linked in different ways.
Although Shakespeare’s Hamlet is obviously not a unique
hypertext, it is an interesting object for testing advanced hypertext
and text technologies. There is no original edition of Hamlet
that was authorized by Shakespeare during his lifetime. We
only have different print editions, which differ in
status with regard to quality, overall length, content and
story line. The most important among these are the so-called
First Folio, the First Quarto and the Second Quarto editions of
Hamlet. Over the centuries editors have tried to combine these
early editions into the best edition possible. The famous Arden
editions as well as the Moby edition, which is widespread on the
internet, are such compositions.
A comparable but somewhat more complex situation exists
in the field of German translations of the play. The earliest
translation is by Christoph Martin Wieland, from about
1766. Since then at least 18 translations have been published,
accompanied by countless translations by theatre
directors, most of which are not documented. The corpus
contains 8 digitized translations. 2 further translations are
already scanned but not yet digitized, because they are
printed in Fraktur, an old German typeface that cannot yet be
recognized by common OCR programs. The remaining 10
translations are available in print but have not yet been digitized
either. Of the 8 digitized translations we chose 4 for further text
technological use.
What makes the corpus so interesting is the fact that almost
every translator used several of the early English editions
as the basis for a new translation. As a result,
almost every German or English edition of Shakespeare’s
Hamlet is a composition of several sources. The relations the
editions have with their sources and with each other form
a wide network, which could be presented in a hypertext
system.
Another interesting aspect of Shakespeare’s Hamlet is the
outstanding position the play has held within Western culture
for centuries. Hamlet is the single most researched piece of
literature, has been analyzed from various perspectives and is
part of the Western educational canon. This leads to the requirement
that a digital environment should represent the variety of
perspectives on the play. This led us to a corpus of Hamlet
editions in which each text may exist in multiple forms.
The basis for the XML annotations is a set of text files, which are
transformed to XML using regular expressions. The basic XML
format is the TEI 4 drama base tag set. TEI 4 is a major open-source
concept of the Text Encoding Initiative. The drama base tag set
offers almost all tags needed for a general, formal annotation
of a play. In order to provide an easy-to-annotate mechanism
we added some attributes to represent the translation or
origin relation between lines, paragraphs or speeches within
the editions on the one hand and the sources on the other.
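The regular-expression step can be pictured with a minimal sketch. It assumes a hypothetical plain-text layout in which a speaker name in capitals followed by a period opens a speech and every following non-empty line is one verse line; the actual layout of the source files and the expressions used for the corpus are not described here. The function name to_tei_speeches is illustrative, and only the TEI elements sp, speaker and l are produced.

```python
import re
from xml.sax.saxutils import escape

# Assumed input convention (hypothetical): "BERNARDO." on its own line
# starts a speech; each following non-empty line is one verse line.
SPEAKER = re.compile(r"^([A-Z][A-Z ]+)\.\s*$")


def to_tei_speeches(plain_text):
    """Wrap speeches in <sp>, speaker names in <speaker>, lines in <l>."""
    out = []
    in_speech = False
    for raw in plain_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SPEAKER.match(line)
        if match:
            if in_speech:
                out.append("</sp>")  # close the previous speech
            out.append("<sp>")
            name = match.group(1).title()
            out.append("  <speaker>%s</speaker>" % escape(name))
            in_speech = True
        elif in_speech:
            # Text before the first speaker line is ignored in this sketch.
            out.append("  <l>%s</l>" % escape(line))
    if in_speech:
        out.append("</sp>")
    return "\n".join(out)


sample = """BERNARDO.
Who's there?
FRANCISCO.
Nay, answer me: stand, and unfold yourself.
"""
tei = to_tei_speeches(sample)
print(tei)
```

Stage directions, act and scene headings and prose passages would need further patterns; they are left out to keep the sketch short.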
The TEI-annotated documents are used for further annotation
and presentation. The TEI documents were automatically
enriched with further markup using an open-source auto-tagger.
This auto-tagger annotates single words with their
part of speech and base form (lemma). The TEI documents are
also the basis for the XHTML presentation. As the TEI structure
contains all the information necessary for a graphical presentation,
these documents are transformed to XHTML, which is used to
present the corpus. This transformation is performed by several
XSLT stylesheets. In the same way XSL-FO is used to generate
PDF versions of each edition.
[...] or the course of action. Therefore it is useful to provide an
alternative linking mechanism, which focuses not only on
the language and the formal structure, but also on the plot. To
provide this reference the narrative information is annotated
in another layer. This makes it possible to find the same event in
different translations of the play. The narrative annotation layer
basically consists of events, which can be seen as the smallest
elements of the plot.
Obviously, events may start within one line and end several
lines or even speeches later. Since the narrative structure
overlaps with the TEI structure, the two are stored in separate
annotations. Scenes can provide a meaningful unit for basic
parts of the plot. Thus the formal and the narrative annotation
are semantically aligned, in addition to their reference to
identical textual data. This relation can be exploited by creating
links between the concept of a scene and the concept of
specific actions. The respective linking mechanism is located
on a meta level: it operates on the schemas themselves
and not on their instances. The references are generated
mechanically on the meta level, linking different perspectives
together. Readers can explore the relations between events
and scenes. The procedure could also be used to create a
recommendation system as proposed, for example, by Macedo et al.
(2003): the annotation integrates the knowledge of experts
on narrative structures in the play Hamlet and provides this
information to the reader. This leads to a multi-rooted tree
in which each tree represents one level of information, i.e. textual
structure and linguistic, philological or narrative information.
This allows a network of multiple perspectives on one text
to be created, each linked to the others. As a result, hypertext is
no longer based on links between nodes, but offers a reference
mechanism between perspectives.
Figure 1: A multi-rooted tree above a single set of textual data
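The multi-rooted arrangement can be pictured as several annotation layers that all point at character offsets in one shared text, so that an event may begin inside one line and end inside another without breaking either hierarchy. The following is a minimal sketch under that assumption. It links instances by overlapping offsets purely for illustration and is not the schema-level linking mechanism described above; the class Span, the function link_layers and the event labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    layer: str  # annotation layer, e.g. "formal" or "narrative"
    label: str  # identifier of the annotated unit
    start: int  # inclusive character offset into the shared text
    end: int    # exclusive character offset into the shared text


def overlaps(a, b):
    """True if the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


def link_layers(spans, source_layer, target_layer):
    """Map each unit of source_layer to the overlapping units of target_layer."""
    sources = [s for s in spans if s.layer == source_layer]
    targets = [s for s in spans if s.layer == target_layer]
    return {s.label: [t.label for t in targets if overlaps(s, t)]
            for s in sources}


text = "Who's there? Nay, answer me. Long live the king!"
spans = [
    Span("formal", "line-1", 0, 12),
    Span("formal", "line-2", 13, 28),
    Span("formal", "line-3", 29, 48),
    # Illustrative events: the second one starts inside line-2.
    Span("narrative", "challenge", 0, 28),
    Span("narrative", "password", 24, 48),
]
links = link_layers(spans, "narrative", "formal")
print(links)
```

Because every layer refers to the same offsets, a further layer (linguistic or philological, for instance) can be added without touching the existing ones.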
As a first result of these multiple annotations, we obtained a corpus
that is based on XML technology and available via the web. As
a second result we developed methods to cope with multiply
annotated documents, a task that has to be performed
more often with the growing popularity of XML technologies.
In particular, the integration of the narrative annotation layer
should be seen as an example for further parallel annotations. In
detail, the methods described above lead to an environment
that offers different types of users different perspectives on a
single textual object or a corpus. Some of these benefits are
presented in the following:
1. The common TEI annotation allows a structural linking
mechanism between the editions. This allows a user to jump
from the first scene in the second act of one edition to the
same scene in another edition.
2. Alternatively this annotation can be used to present the
user with a part of the play in one or more editions of their
choice for direct comparison.
3. The narrative annotation layer allows several ways to
explore a single text or compare some texts with each
other. In the first case, the annotation of events and
actions provides a way of comparing different editions, especially
translations.
4. Using SVG, an XML-based format for graphics, the
narrative structure of each translation could be visualized,
ignoring the textual basis. This gives an »overview« of the plot
of the current edition.
5. The introduced concept of cross-annotation linking allows
us to offer the user automatically generated links from one
annotation to another.
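The structural linking in item 1 can be sketched as an index that maps act and scene numbers to elements, built once per edition. The sketch assumes that acts and scenes are encoded as numbered divisions (div1 with type="act", div2 with type="scene") carrying n attributes; this is one common TEI convention and is not confirmed for this corpus. The function names and the placeholder line texts are hypothetical.

```python
import xml.etree.ElementTree as ET


def index_scenes(tei_xml):
    """Return {(act, scene): scene element} for one edition."""
    root = ET.fromstring(tei_xml)
    index = {}
    for act in root.iter("div1"):
        if act.get("type") != "act":
            continue
        for scene in act.iter("div2"):
            if scene.get("type") == "scene":
                index[(act.get("n"), scene.get("n"))] = scene
    return index


def first_line(scene):
    """Text of the first <l> element in a scene, or None if there is none."""
    line = scene.find(".//l")
    return line.text if line is not None else None


# Placeholder content only; the line texts are not taken from the corpus.
english = """<text><body>
<div1 type="act" n="2"><div2 type="scene" n="1">
<sp><speaker>Polonius</speaker><l>English line (placeholder)</l></sp>
</div2></div1>
</body></text>"""

german = """<text><body>
<div1 type="act" n="2"><div2 type="scene" n="1">
<sp><speaker>Polonius</speaker><l>Deutsche Zeile (placeholder)</l></sp>
</div2></div1>
</body></text>"""

editions = {"english": index_scenes(english), "german": index_scenes(german)}

# Jump from act 2, scene 1 of the English edition to the German one.
key = ("2", "1")
print(first_line(editions["english"][key]))
print(first_line(editions["german"][key]))
```

The same index also serves item 2: looking up one key in several editions yields the parallel passages for a side-by-side view.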
With this set of different linking concepts we offer users a
new freedom to explore the corpus in a way that fits their
needs. We ensured that every layer of information offers
a way to access information of another layer from a different
perspective. We assume that this method can be transferred
to any kind of multiple annotation.
References
Macedo, A. A., Truong, K. N. and Camacho-Guerrero, J. A.
(2003). Automatically Sharing Web Experiences through a
Hyperdocument Recommender System. In: Proceedings
of the 14th Conference on Hypertext and Hypermedia
(Hypertext03), Nottingham, UK, August 26-30, 2003. Online:
http://www.ht03.org/papers/pdfs/6.pdf [Available; last checked:
6/16/2005]


Conference Info

Complete

ADHO - 2008

Hosted at University of Oulu

Oulu, Finland

June 25, 2008 - June 29, 2008

135 works by 231 authors indexed

Conference website: http://www.ekl.oulu.fi/dh2008/

Series: ADHO (3)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None