Mining the Differences between Penninc and Vostaert

paper
Authorship
  1. Karina van Dalen-Oskam

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)

  2. Joris van Zundert

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)

Work text
The Middle Dutch Roman van Walewein (Romance of
Gauvain, ca. 1260) was written by two authors, Penninc
and Vostaert. Only one manuscript containing the complete
text, explicitly dated as copied in the year 1350, is left to us.
Some fragments of another, probably somewhat younger
manuscript contain about 400 lines. The text in the complete
manuscript consists of 11,202 lines of rhyming verse. The
manuscript was written by two clerks. The first seems to have
written lines 1-5,781 and the second lines 5,782-11,202.
The second author, Vostaert, explicitly claims to have added
about 3,300 lines to Penninc's text. Because scholars of Middle
Dutch literature have arrived at other figures, we decided to
try out modern authorship attribution techniques to find out
whether these would point to a specific line in the text where
the text before and the text after contrast most. We used a
lexical richness measure, Udny Yule's Characteristic K, and
Burrows's Delta, which measures differences in the frequencies of
the most frequent words in different parts of the text. We split
the text into largely overlapping parts of 2,000 lines, moving
through the text in order to search for an exact line in the text
where the contrast before and after would be the most
significant. To apply Burrows's Delta while keeping our focus on
one text (or two, in a way), we treated the text as a 'group of
texts' and every 'part' of 2,000 lines as a separate text, to be
compared with the other 'texts'.
Figure 1: Lexical Richness according to Yule's K.
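The sliding-window use of Yule's K described above can be sketched as follows. This is a minimal illustration, not the implementation used for Fig. 1: the whitespace tokenizer, the lowercasing, the step of 100 lines and the function names (yules_k, sliding_windows, k_profile) are all assumptions, since the abstract only states that the 2,000-line parts overlap largely. K is computed in its usual form, 10^4 * (sum of m^2 * V_m - N) / N^2, where N is the number of tokens and V_m the number of word types occurring m times.

```python
from collections import Counter


def yules_k(tokens):
    """Yule's Characteristic K = 10^4 * (sum_m m^2 * V_m - N) / N^2."""
    n = len(tokens)
    if n == 0:
        raise ValueError("cannot compute K for an empty token list")
    freqs = Counter(tokens)
    # Summing f^2 over word types equals summing m^2 * V_m over frequency classes.
    s2 = sum(f * f for f in freqs.values())
    return 1e4 * (s2 - n) / (n * n)


def sliding_windows(lines, size=2000, step=100):
    """Yield (start_index, window) for overlapping windows of `size` lines.

    `step` is an assumed value; a text shorter than `size` yields one
    (shorter) window.
    """
    for start in range(0, max(len(lines) - size, 0) + 1, step):
        yield start, lines[start:start + size]


def k_profile(lines, size=2000, step=100):
    """Return (first_line_number, K) for every window, line numbers 1-based."""
    profile = []
    for start, window in sliding_windows(lines, size, step):
        # Assumed tokenization: lowercase and split on whitespace.
        tokens = [w for line in window for w in line.lower().split()]
        profile.append((start + 1, yules_k(tokens)))
    return profile
```

Calling k_profile on the list of verse lines gives one K value per window; a marked change in the resulting curve would be a candidate for the point of contrast. Higher K means more repetition, that is, lower lexical richness.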
At the conference in Gothenburg in 2004 we were able to show
that both measures identified lines 7,881-2 as the point of
greatest contrast. In Fig. 1 we present the results of Yule's K for
that part of the text and in Fig. 2 the results of our creative use
of Burrows's Delta can be found. It is very intriguing that both
measurements point to the same place in the text. This suggests
that line 7,882 could very well be the place where Vostaert took
over from Penninc.
Figure 2: Differences in frequencies of the 150 most frequent words according
to Burrows's Delta
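Treating each 2,000-line part as a separate 'text' and comparing the parts with Burrows's Delta can be sketched as below. This is a hedged illustration under stated assumptions, not the procedure behind Fig. 2: the function names are hypothetical, the most frequent words are taken from the pooled parts, and the population standard deviation is used for the z-scores (the sample standard deviation would be an equally defensible choice). Delta between two parts is the mean absolute difference of the z-scores of their relative word frequencies.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one token list."""
    counts = Counter(tokens)
    n = len(tokens)
    return [counts[w] / n for w in vocab]


def burrows_delta(texts, n_mfw=150):
    """Return the matrix of Burrows's Delta between all pairs of token lists."""
    if any(len(t) == 0 for t in texts):
        raise ValueError("every text must contain at least one token")
    # Most frequent words of the pooled parts (an assumption of this sketch).
    pooled = Counter(tok for t in texts for tok in t)
    vocab = [w for w, _ in pooled.most_common(n_mfw)]
    freqs = [relative_freqs(t, vocab) for t in texts]

    # z-score each word's relative frequency across the set of parts.
    z = [[0.0] * len(vocab) for _ in texts]
    for j in range(len(vocab)):
        col = [f[j] for f in freqs]
        mu, sd = mean(col), pstdev(col)
        for i in range(len(texts)):
            z[i][j] = (col[i] - mu) / sd if sd > 0 else 0.0

    # Delta = mean absolute difference of z-scores over the vocabulary.
    k = len(vocab)
    return [
        [sum(abs(a - b) for a, b in zip(z[i], z[h])) / k
         for h in range(len(texts))]
        for i in range(len(texts))
    ]
```

With the windows from a sliding pass over the text as input, the entry for a pair of parts lying on either side of a given line gives one measure of the contrast at that line.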
We continue our research by concentrating on a quantitative
analysis of the differences between the two parts of the text.
What are in fact the lexical differences between the text parts
before and after lines 7,881-2? To find out, we made a list of
lemmata (headwords, comprising all spelling variants and
inflections etc. of a word) that occur significantly more often
in the lines before or in the lines after. The top of this list
looks as follows: [etc.]
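The abstract does not say which significance test was used to build this list. As one possible approach, the sketch below ranks lemmata with the log-likelihood (G2) keyness statistic in its two-term form, which is common in corpus linguistics; the function names and the threshold of 3.84 (the chi-square critical value for p < 0.05 at one degree of freedom) are assumptions of this example, and both parts are assumed to be non-empty lists of lemmata.

```python
import math
from collections import Counter


def log_likelihood(a, b, n1, n2):
    """G2 for a lemma occurring a times in n1 tokens and b times in n2 tokens."""
    # Expected counts if the lemma were equally frequent in both parts.
    e1 = n1 * (a + b) / (n1 + n2)
    e2 = n2 * (a + b) / (n1 + n2)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def key_lemmata(before, after, threshold=3.84):
    """Return (lemma, count_before, count_after, G2, side), highest G2 first."""
    c1, c2 = Counter(before), Counter(after)
    n1, n2 = len(before), len(after)
    rows = []
    for lemma in set(c1) | set(c2):
        a, b = c1[lemma], c2[lemma]
        g2 = log_likelihood(a, b, n1, n2)
        if g2 >= threshold:
            # `side` names the part in which the lemma is relatively more frequent.
            side = "before" if a / n1 > b / n2 else "after"
            rows.append((lemma, a, b, g2, side))
    return sorted(rows, key=lambda r: r[3], reverse=True)
```

Here `before` and `after` would be the lemmatized tokens of the lines up to and following the suspected point of transition; lemmatization itself is outside the scope of the sketch.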
Summarizing, Penninc makes significantly more use of the first
and second person of the personal pronoun, in contrast to a
significantly higher use of the third person by Vostaert. Penninc
also uses many more modal verbs. But why? Are there several
reasons for these differences, or can all be explained by only
one or two 'special effects' of the individual authors?
The first hypothesis we will explore is that a difference in the
amount of dialogue between the two parts of the text may give
rise to several of the differences we have found. The paper will
investigate whether this is the case. We will present an analysis
of the vocabulary of both authors differentiating between
dialogue, narrator's text, and 'erlebte Rede' (narrated
monologue). We will also list other possibly differentiating
elements and test whether these play a part in the contrast we
discovered by using Yule's K and Burrows's Delta. This
qualitative phase in the research is meant to yield an overview
of elements contributing to the (quantitative) contrast on the
one hand, and to lead us to a list of key elements in the lexicon
of the two authors on the other. The list of actual differences
will be the input for a new quantitative and qualitative literary
analysis of the character and voice of Penninc and Vostaert.
Furthermore, we will look forward to the next purely
quantitative step we hope to take, in which the results of the
above can help us to establish a formula for authorship
distinction in the genre of Middle Dutch Arthurian Romance,
and help us, so to speak, to leap from the mining to the
modelling of the differences.
Bibliography
Burrows, J. "'Delta': a Measure of Stylistic Difference and a
Guide to Likely Authorship." Literary and Linguistic
Computing 17 (2002): 267-287.
Burrows, J. "Questions of Authorship: Attribution and Beyond."
Computers and the Humanities 37 (2003): 5-32.
Es, G.A. van, ed. De jeeste van Walewein en het schaakbord
van Penninc en Pieter Vostaert. 2 vols. Zwolle, 1957.
Holmes, D.I. "Authorship Attribution." Computers and the
Humanities 28 (1994): 87-106.
Johnson, D.F., and G.H.M. Claassens, eds. Dutch Romances
I: Roman van Walewein. Trans. D.F. Johnson and G.H.M.
Claassens. Cambridge: Cambridge, 2000.
Love, Harold. Attributing Authorship: An Introduction.
Cambridge: Cambridge, 2002.

Conference Info

ACH/ALLC / ACH/ICCH / ALLC/EADH - 2005

Hosted at University of Victoria

Victoria, British Columbia, Canada

June 15, 2005 - June 18, 2005

139 works by 236 authors indexed

Conference website: http://web.archive.org/web/20071215042001/http://web.uvic.ca/hrd/achallc2005/

Series: ACH/ICCH (25), ALLC/EADH (32), ACH/ALLC (17)

Organizers: ACH, ALLC

Tags
  • Keywords: None
  • Language: English
  • Topics: None