RolandHT and Reconceiving the Notion of Corpus

Authorship
  1. Vika Zafrin

    Brown University

Work text

RolandHT
For the past several years I have been pursuing the study
of a semi-fictional character's manifestations in different
art forms, throughout Europe and North America, in the last
thousand years. Roland is the protagonist of the French epic
Song of Roland and France's national hero. He first appeared
in written record at the end of the eleventh century, and hasn't
left us since then – although his character has undergone
significant changes. He has easily passed from one culture to
another; but Roland is not merely borrowed – he is mercurial.
He has been used by writers and artists as a container for the
promotion of moral and political exigencies relevant to their
circumstances. Thus, in France he is a staunch protector of the
Christian faith (Song of Roland); in Germany, a religious martyr
(Konrad); in Italy, a lover and a comedic character (Boiardo,
Ariosto); in England, a mystical traveler (Browning); in the
United States, a roadie (Silverstein) and a gunslinger (King).
Nevertheless, the French Roland has easily identifiable
personality traits, such as his infamous pride (which cost him
and twenty thousand of Charlemagne's best men their lives)
and his physical prowess. There are also artifacts, such as his
enchanted sword and his sounding horn that can be heard for
miles. These features have prompted me to make a case for the
existence of a Roland corpus that is held together by recurrent
themes and imagery. In my dissertation, titled RolandHT, I show
these recurrences by encoding them in XML (see note 1). This corpus has
never been considered as a whole. An examination of themes
recurrent in it offers insight into the evolution of art and the
pace of intercultural transmission. This insight is relevant to
the study of literature at the dawn of Western secular writing;
but it also sheds light on our own literary present, in an age of
widely accessible electronic writing, when the processes of
composition and transmission are changing again.
The Roland corpus consists of works in such diverse media as
literature, live and puppet theater, opera, sculpture and stone
carvings, painting, film, contemporary music, and comics. This
presents a problem in my use of the word corpus, which
conforms to its theoretical definition but not to any of its prior
uses in practice. I propose that the electronic medium, and
semantic encoding in particular, allow for the study of corpora
bound not by format or authorship, but by semantic content.
RolandHT is a collection of excerpts from works that feature
Roland, encoded for recurring characters, imagery and themes.
Some annotations are present. Works span most of Europe and
the United States, and date from the late 1090s to 1995.
Most of the corpus objects are text-based – poems, prose, drama,
operatic lyrics. One particularly interesting corpus object –
Greg Roach's computer game The Madness of Roland – is a
blend of text and images. Other image-based corpus objects
include comic-book panels and photographs of medieval
stonework found in France and Croatia. Also included are
excerpts from the 1999 film Beowulf, in QuickTime (.mov)
format. Other objects will be added as time and copyright
restrictions allow.
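
To make the idea of encoding recurrences concrete, here is a minimal sketch in Python rather than in the project's own XSLT and JavaScript. The element and attribute names (corpus, excerpt, theme, artifact, work, name) are illustrative assumptions, not RolandHT's actual schema, and the sample passages are one-line paraphrases rather than text from the encoded works. The sketch gathers tagged features into an index showing which works share a given theme or artifact.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: tag and attribute names are assumptions made for
# illustration, and the passages are paraphrases, not corpus text.
SAMPLE = """<corpus>
  <excerpt work="Song of Roland">
    Roland, out of <theme name="pride">pride</theme>, will not sound his
    <artifact name="horn">horn</artifact>; he trusts his
    <artifact name="sword">sword</artifact> instead.
  </excerpt>
  <excerpt work="Childe Roland to the Dark Tower Came">
    Roland sets the <artifact name="horn">horn</artifact> to his lips.
  </excerpt>
</corpus>"""

FEATURE_TAGS = ("theme", "artifact")


def recurrences(xml_text):
    """Map each (kind, name) feature to the sorted list of works tagged with it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)
    for excerpt in root.iter("excerpt"):
        work = excerpt.get("work")
        for element in excerpt.iter():
            if element.tag in FEATURE_TAGS:
                index[(element.tag, element.get("name"))].add(work)
    return {feature: sorted(works) for feature, works in index.items()}


def shared_features(xml_text):
    """Keep only the features that recur in more than one work."""
    return {f: w for f, w in recurrences(xml_text).items() if len(w) > 1}
```

Calling shared_features(SAMPLE) keeps only the horn, the one feature tagged in both sample works. Recurrences of this kind are what the argument for a Roland corpus rests on.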
Technical Details
The project is processed for web presentation using XSLT,
JavaScript and CSS. The website is a technical
collaboration with Ethan Fremen.
Our approach is standards- and client-based. All features except
bookmarking can be used with static files accessed locally. This
permits RolandHT to be archived on a CD, DVD or other
storage device; access to a networked server is not necessary
to view the work. This is an advantage for the purposes of
submitting a doctoral thesis, but because it is entirely client-side, our
method would not scale well: a substantially larger corpus would
require an XML database in order to remain efficient. As it
stands, the roughly 800KB XML file containing all text data
(but not the separately-stored images or multimedia files) must
be loaded entirely before the project is viewed, rendering it
unwieldy for slow connections if RolandHT is accessed over
the web. However, once it is loaded, subsequent operations are
performed client-side and connection speed no longer matters,
except when fetching the aforementioned images and
multimedia files.
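
The load-once behaviour described above can be illustrated with a small sketch. It is written in Python for brevity, whereas the project itself runs in the browser with XSLT and JavaScript; the class name, markup, placeholder passages and file name are all hypothetical. The text data is parsed a single time, later queries are answered from memory, and media files are fetched only when they are requested.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the single XML file holding all text data.
TEXT_DATA = """<corpus>
  <excerpt work="Song of Roland">Placeholder passage A.</excerpt>
  <excerpt work="The Madness of Roland" image="panel-01.png">Placeholder passage B.</excerpt>
</corpus>"""


class LoadedCorpus:
    """Parse the text data once; serve later requests from memory."""

    def __init__(self, xml_text, fetch_media):
        # The one expensive step: the whole file must be available and parsed.
        self._root = ET.fromstring(xml_text)
        # Media are stored separately and fetched lazily through this callable.
        self._fetch_media = fetch_media
        self._media_cache = {}

    def works(self):
        """List the works present, without touching storage or network again."""
        return [e.get("work") for e in self._root.iter("excerpt")]

    def text_of(self, work):
        """Return the text of the first excerpt from the given work, or None."""
        for e in self._root.iter("excerpt"):
            if e.get("work") == work:
                return "".join(e.itertext()).strip()
        return None

    def image_for(self, work):
        """Fetch a work's image on first request only; None if it has no image."""
        for e in self._root.iter("excerpt"):
            href = e.get("image")
            if e.get("work") == work and href:
                if href not in self._media_cache:
                    self._media_cache[href] = self._fetch_media(href)
                return self._media_cache[href]
        return None


fetched = []  # records every media request made by the sketch


def fake_fetch(href):
    """Stand-in for a network or disk read of a separately stored media file."""
    fetched.append(href)
    return b"image bytes for " + href.encode()


corpus = LoadedCorpus(TEXT_DATA, fake_fetch)
```

In this sketch, listing works or reading text never triggers a fetch. Only image_for does, and only the first time a given file is asked for.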
The notion of corpus
The notion of corpus is used in the study of languages
(corpus linguistics), literature, and objects (architecture,
archaeology). Definitions of corpus provided by several
different dictionaries of English and of literary terms are
variations of "a related 'body' of writings, usually sharing the
same author or subject-matter." (Baldick 52) It is almost never
further defined in the context of specific studies. Instead, corpus
is taken for granted, and is found far more often in headings
than in discussion. When it is used in the text itself, the set of
objects under consideration invariably has an author, time
period and/or geographical area in common. None of these things can
usefully delimit the Roland corpus. However, some past uses
of corpus would justify the use of the word to describe works
about Roland.
Most often the contested word is used in corpus linguistics.
One book in particular defines corpus as "a subset of an ETL
[electronic text library], built according to explicit design
criteria for a specific purpose." (Ghadessy et al. 179) This is
the only non-dictionary definition I have found, and if we
substitute "Western artistic production" for "ETL," it becomes
relevant to the construction of RolandHT. The purpose of
defining the Roland corpus is to shed light on how its protagonist has survived
so unusually long while at the same time undergoing
fundamental changes. However, the subjective nature of the
definition process demands a willingness to alter design criteria
as texts are encoded, and a leap of faith that scholars may be
reluctant to make. Working with a semantically encoded body
of work requires acceptance of XML as a valid format for
making a scholarly argument. This in turn calls for critical
responses that are also at least partly expressed in XML. ("I
disagree with this encoding. How about this alternative
instead?")
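
One way such an exchange of encodings might work in practice is sketched below, in Python and under stated assumptions: the theme element and its name attribute are hypothetical, the passage is a paraphrase, and the comparison is a simple set difference rather than any established scholarly tool. Two scholars encode the same passage differently, and the function reports where their readings diverge.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same paraphrased passage.
ENCODING_A = (
    '<excerpt>Roland <theme name="pride">refuses to call for help</theme>'
    " until the battle is lost.</excerpt>"
)
ENCODING_B = (
    '<excerpt>Roland <theme name="loyalty">refuses to call for help</theme>'
    " until the battle is lost.</excerpt>"
)


def readings(xml_text):
    """Return the plain text and the set of (theme name, tagged span) pairs."""
    root = ET.fromstring(xml_text)
    plain = "".join(root.itertext())
    tagged = {(t.get("name"), "".join(t.itertext())) for t in root.iter("theme")}
    return plain, tagged


def disagreements(first, second):
    """List readings unique to each encoding; both must mark up the same text."""
    text_1, tags_1 = readings(first)
    text_2, tags_2 = readings(second)
    if text_1 != text_2:
        raise ValueError("the encodings do not share the same underlying text")
    return {
        "only_in_first": sorted(tags_1 - tags_2),
        "only_in_second": sorted(tags_2 - tags_1),
    }
```

Here disagreements(ENCODING_A, ENCODING_B) shows that the same span is read as pride in one encoding and as loyalty in the other, which is the kind of disagreement a critical response in XML would need to make explicit.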
The Roland corpus is a flexible one that emerges and evolves in the
course of its analysis. Its unifying subject may be a default
connecting thread by virtue of Roland's name, but since the
actual connecting threads are recurrent themes and imagery,
the way to ascertain a work's status as a corpus object (or not)
is a close reading of it. In this case, micro-results of the close
reading are recorded by XML encoding.
The above describes a practical application of Gregory Ulmer's
theory of heuretics. In his 1994 Heuretics: The Logic of
Invention, Ulmer states that theory is assimilated into the
humanities in two principal ways: by critical interpretation and
by artistic experiment. Heuretics is the latter – it is a heuristic
approach to theory, a reading process that, instead of attempting
to theorize "what might be the meaning of an existing work,"
guides "a generative experiment: Based on a given theory, how
might another text be composed?" (5)
Why not use words that have been used more frequently to
describe Roland, such as myth or legend? Both of these terms
imply homogeneity that the corpus does not possess, and deny
Roland's historicity, however tenuous. Similarly, the terms
closest to corpus – oeuvre and canon – are limiting to the point
of inaccuracy: the Roland corpus has too many authors to be
called an oeuvre and too little institutionalized authority to be a
canon.
Finally, RolandHT is not an archive. It does not strive for
completeness, either within single works (only excerpts are
presented) or within the corpus (which is, to date, representative
but incomplete). More importantly, RolandHT is intended to
be the opposite of a static collection of unchanging documents;
its primary-source contents, encoding and annotations are all
meant to change as research progresses.
Summary
If semantic encoding is to be taken seriously as a research
tool by philologists, art historians and other humanists used
to working in more traditional ways, a common language must
first be built and justified. At Digital Humanities 2007 I hope
to contribute to the effort of extending the influence of digital
research methods into mainstream humanities. I will do this by
expanding the notion of corpus to include bodies of works that,
because of their disparate formats, do not lend themselves to
formal study using traditional methods.
1. The electronic part of the dissertation can be found at
<http://wordsend.org/rht/xml/index_2006.php>.
N.B.: it is a work in progress. New content is being added often.
Please note also that the project can only be viewed using
Firefox/Mozilla or another XSLT-aware browser.
Bibliography
Ariosto, Ludovico. Orlando Furioso; Selections from the
Translation of Sir John Harington. Ed. Rudolf Gottfried.
Bloomington, IN: Indiana University Press, 1963.
Baldick, Chris. The Concise Oxford Dictionary of Literary
Terms. Oxford: Oxford University Press, 2004.
Beowulf. Dir. Graham Baker, Perfs. Christopher Lambert,
Rhona Mitra, Oliver Cotton and Götz Otto. DVD. Panorama
Entertainment, 1999.
Boiardo, Matteo Maria. Orlando Innamorato. Trans. Charles
Stanley Ross. Berkeley, CA: University of California Press,
1989.
Browning, Robert. "Childe Roland to the Dark Tower Came."
English Poetry Full-Text Database. Cambridge, UK:
Chadwyck-Healey Ltd. Accessed 2004-03-26.
Ghadessy, Mohsen, Alex Henry, and Robert L. Roseberry, eds.
Small Corpus Studies and ELT: Theory and Practice.
Amsterdam: J. Benjamins Pub. Co., 2001.
King, Stephen. Dark Tower I-VII. New York: Signet,
1989-2003.
Konrad, Der Pfaffe. Priest Konrad's Song of Roland. Trans. J.
W. Thomas. Columbia, SC: Camden House, 1994.
Roach, Greg. Madness of Roland. CD-ROM. Los Angeles, CA:
HyperBole Studios, 1995.
The Song of Roland. Trans. Dorothy L. Sayers. Harmondsworth,
UK: Penguin Books, 1957.
Silverstein, Shel. "Roland the Roadie And Gertrude the
Groupie." Belly Up!. Perf. Dr. Hook and the Medicine Show.
CBS, 1973.
Ulmer, Gregory L. Heuretics: The Logic of Invention.
Baltimore, MD: Johns Hopkins University Press, 1994.


Conference Info

Complete

ADHO - 2007

Hosted at University of Illinois, Urbana-Champaign

Urbana-Champaign, Illinois, United States

June 2, 2007 - June 8, 2007

106 works by 213 authors indexed

Series: ADHO (2)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None