Image as Markup: Adding Semantics to Manuscript Images

Hugh Cayless

Authorship

1. Hugh Cayless

New York University

Work text

This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

In their article “Images as the Text: Pictographs and Pictographic
Logic,” Jerome McGann and Johanna Drucker
argue for the semantic significance of the image of text,
using as an example the organization of the text of a
poem by Byron. They conclude:
...a rhetoric of transparency makes it difficult to see beyond
the moves within the text and the image to understand
the metagraphic logic organizing them. In such a
situation, the study of pictographs, which hover at the
borderland of text and image, can be especially useful.
They help us to see that at the next level of abstraction, of
texts and images as information, similar logical mechanisms
are at work. Each instance of text and image is an
incarnation of such a metalogic, but it can be articulated
according to its own rules and principles if it is rendered
explicit. [McGann]
In the publication of manuscript transcriptions, two
modes of presentation are typically recognized: the edition,
in which an editor’s supplements are folded into
the text, and the diplomatic transcription, which attempts
to faithfully reproduce the text on the original support.
With the advent of markup systems like the Text Encoding
Initiative (TEI), it is possible to produce both types
of transcription from the same marked up text. Indeed,
it is possible to go further, and analyze the text in ways
that print transcriptions cannot, and to link transcriptions
(and notes) to images in new ways.
As part of a series of experiments in text and image linking,
beginning in the summer of 2008, the author has
developed a method for generating a Scaleable Vector
Graphics (SVG) representation of the text in an image of
a manuscript. [CaylessSVG] This method employs open
source tools to generate and present the results of the
SVG tracing. Automated analysis of the SVG output of
the process is capable (even using a naïve algorithm) of
detecting lines in the source, and it is not hard to conceive
of ways to detect words and other features in the image.
The output of the tracing is in the form of SVG path
elements, which employ a combination of cubic Bézier
curves [Bézier] and lines to draw shapes. These can be
grouped together (using svg:g elements) to, forexample,
gather the components of a line of text as children of
a single element. The SVG may include a copy of the
original image as background, therefore superimposing
the vector graphic tracings on top of the raster image.
Since SVG is an XML application, this means that the
features it represents are manipulable with Javascript,
offering the possibility of highlighting features, panning
and zooming, adding hyperlinks, etc.
In examining ways one might link between an XML
transcription of the text and an XML overlay of the text,
one quickly runs into problems involving overlapping
hierarchies: paths may include multiple letters or words,
for example, and there may be letters that correspond to
multiple paths. The process of generating the SVG tracing
involves the conversion of the image to a black and
white (1-bit) bitmap, wherein each pixel is either black
or white. This makes it possible for the software to reproduce
the shapes in the original source in vector format,
but it also involves a flattening of the text in the
image into a single space. While it might have been clear
that the stroke of one letter runs over the top of a second
in a color image, that layering is lost in the SVG, and the
two letters are a single shape in the output. This may,
of course, involve the descender of a letter ‘f’ touching
the line below, for example, making line detection more
difficult.
The figure below (derived from a papyrus fragment1)
highlights some of these issues. Notice that the initial
kappa is represented by no less than eight paths,
while part of the downward stroke of the final alpha in κατάξοντα connects to the following word, ἅ.
These features of the traced text mean that marking a
word in the image is inherently difficult. Possible solutions
include modifying the SVG with an editor, so that
the two letters no longer connect, or adding a new element,
perhaps just a line added by an editor, to divide
the two. The former solution involves doing a certain
amount of violence to the image, however, since the two
letters do in fact touch, while the latter introduces a new
issue: the lack of semantics.
The semantics of SVG are almost purely geometric.
It primarily encodes shapes, with additional support
for links, text, embedded images, and animation. This
means there is no inherent way to express the significance
of a grouping or a feature in SVG. If we introduce
the idea of a line that can break paths representing letters
or words, then in order to be able to use this feature to,
for example, point to the whole word κατάξοντα in the
SVG, we would first need a process that could find the
intersection points of the word-dividing line and the path
representing the two alphas, and then split that path into
two derivative paths, each of which would be associated
with a different word. This method would avoid damage
to the actual tracing while allowing the types of reference
that are likely to be useful. The derivative paths
could be placed in the same document and only activated
as needed.
Again, however, there is a need for semantics in the SVG
document. Not only might it be necessary to differentiate
between lines dividing letters and lines dividing
words or lines of text so that a processor knows how to
deal with them and their outputs, but it must also be possible
to distinguish between the derived paths and the
originals, since there is no inherent difference between
one svg:path element and another. They may render in
the same coordinate space, for example, even though
they are in different parts of the document.
Some possibilities for adding semantics to SVG include
embedding metadata (perhaps using RDF) using
the svg:metadata element, or developing a microformat,
perhaps depending on the @class attribute, which is
available on all displayable elements.[Schepers] These
kinds of semantic “hooks” will be absolutely necessary
if linking between the many possible structures in the
transcription and the SVG are to be achieved. Some examples
of the types of functionality that can be enabled
by the technology described here are:
1. linking from notes, such as a description of letter
forms, to actual examples inthe image.
2. The ability to link/zoom to any part of the text in the
image from the transcription.
3. Image search with highlighted results.
4. Marking editorial emendations, such as expanded
abbreviations and other types of editorial addition
or deletion on the image itself.
Put another way, the semantics of the graphical representation
of the text can be made explicit, by means of
embedding and linking information into markup that itself
overlays an image of the text, making for a very rich
presentation and research tool.
Notes
1See http://papyri.info//navigator/full/apis_
michigan_1769
References
Bézier curve - Wikipedia, the free encyclopedia. Retrieved
November 14, 2008, from http:// en.wikipedia.
org/wiki/B%C3%A9zier_curve. [Bézier]
Cayless, H. “Linking Page Images to Transcriptions
with SVG.” Presented at Balisage: The Markup Conference,
12 - 15 August 2008. In Proceedings of Balisage:
The Markup Conference (2008). Retrieved November
14, 2008, from http://www.balisage.net/ Proceedings/
html/2008/Cayless01/Balisage2008-Cayless01.html.
Drucker, J., & McGann, J. “Images as the Text: Pictographs
and Pictorgraphic Logic.” Retrieved October
28, 2008, from http://jefferson.village.virginia.
edu/%7Ejjm2f/old/ pictograph.html. [McGann]
Schepers, D. A. “Reinventing Fire >> Blog Archive >>
SVG Text, Semantics, and Accessibility.” November
7th, 2006 at 5:24 am. Retrieved November 13, 2008,
from http://schepers.cc/? p=11. [Schepers]

Full text license: This text is republished here with permission from the original rights holder.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

Complete

ADHO - 2009

Hosted at University of Maryland, College Park

College Park, Maryland, United States

June 20, 2009 - June 25, 2009

176 works by 303 authors indexed

Conference website: http://web.archive.org/web/20130307234434/http://mith.umd.edu/dh09/

Series: ADHO (4)

Organizers: ADHO

Image as Markup: Adding Semantics to Manuscript Images

1. Hugh Cayless

ADHO - 2009