From Text to Image to Analysis: Visualization of Chinese Buddhist Canon

paper
Authorship
  1. Lewis Lancaster

    University of California Berkeley

Work text

This presentation is based on software interface
development by a team at the University
of California, Berkeley. The database which
was used for this technology is the digital
version of the Korean Buddhist canon written
in Chinese characters. The tool shown was
built with the help of a two year grant of
support (2007-2009) from the National Science
Foundation. International collaboration has
included the Institute of Tripitaka Koreana in
Seoul, which provided scanned images of rubbings
taken from the original printing blocks at Hae-in
Monastery. The software metadata is based on
the previous publication The Korean Buddhist
Canon: A Descriptive Catalogue (Lancaster,
1979). A digital version of this catalogue was
made by Charles Muller of Tokyo University,
who has made it freely available on the internet
(Muller, 2004).
The project has been a part of
the Electronic Cultural Atlas Initiative (ECAI)
and received support from that group’s Atlas of
Chinese Religions research funded by the Luce
Foundation. This atlas is being constructed in
collaboration with the GIS Center at Academia
Sinica in Taiwan and will provide references to
the place names associated with the production
of the translations and compilations included
in the canon. Continued research on developing
the software is being done in cooperation
with the School of Creative Media and the
Department of Chinese, Translation and Linguistics
at City University of Hong Kong. It is important
to understand that no project of this kind could
possibly be undertaken without these multiple
and widespread collaborations.
In the example being described in this
presentation, we use the software to focus on
the digital version of the 13th-century Korean
printing block edition of the Buddhist canon
(Lancaster, 1996). The canon, represented
on blocks, contains more than 52 million
characters/glyphs carved onto 166,000 surfaces,
each producing a page of text when printed. The
lines on the plates, each containing up to
14 glyphs, number over three million. The
entire set of the canon is divided into 1,514
different texts representing dated translations
and compilations made over a period of seven
centuries. The size of the data, the temporal span
of its composition, and the history of acquisition
in Korea of the hundreds of texts from China,
provide us with a reasonable challenge for the
interface design.
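
The figures quoted above can be cross-checked with a little arithmetic. The following is only a rough sketch: it treats the rounded numbers in the text (52 million glyphs, 166,000 surfaces, 1,514 texts, at most 14 glyphs per line) as exact, and the constant and function names are illustrative assumptions rather than anything taken from the project's software.

```python
# Rounded figures taken from the paragraph above; the canon holds
# "more than" 52 million glyphs, so the results are approximations.
GLYPHS = 52_000_000
SURFACES = 166_000
TEXTS = 1_514
MAX_GLYPHS_PER_LINE = 14


def corpus_scale():
    """Return rough averages implied by the quoted figures."""
    return {
        # average number of glyphs carved on one printing surface
        "glyphs_per_surface": GLYPHS / SURFACES,
        # fewest lines that could hold every glyph if each line were full
        "min_lines": GLYPHS / MAX_GLYPHS_PER_LINE,
        # average length of one text, in glyphs and in surfaces
        "glyphs_per_text": GLYPHS / TEXTS,
        "surfaces_per_text": SURFACES / TEXTS,
    }


if __name__ == "__main__":
    for name, value in corpus_scale().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions a surface carries roughly 313 glyphs and an average text runs to about 110 surfaces, and at least 3.7 million full lines would be needed, which agrees with the statement that the lines number over three million.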
The previous approach to the study of this
canon was the traditional analytical one of
close reading of specific examples of texts
followed by a search through a defined corpus
for additional examples. When confronted with
166,000 pages, such activity had to be limited.
As a result, analysis was made without having a
full picture of the use of target words throughout
the entire collection of texts. That is to say, our
scholarship was often determined and limited
by externalities such as availability, access, and
size of written material. In order to overcome
these problems, scholars tended to seek a
reduced body of material that was deemed to be
important by the weight of academic precedent.
In the current digital age, however, the
limits on “what can be considered” in the
Korean Buddhist canon have been significantly
removed. We can consider all of the texts,
all of the words, and all of the metadata
in every search. Consequently, the practices
of traditional scholarship for the canon have
begun to falter. When the entire canon was
digitized in the last decade of the 20th
century, the process of search and retrieval
of target words and phrases was transformed.
Nonetheless, problems remain for Buddhist
scholars using this digital version. In many
cases, the menu that appears after a search
for a term can contain thousands of references.
Because the references are displayed as a list
of each line where the word occurs, they can
still take long hours to analyze and put into
some form of presentation.
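
To make the problem concrete, a long list of matching lines can be reduced to a few counts before any reading begins. The sketch below is a minimal illustration and not the interface built by the Berkeley team: the function name `summarize_hits`, the text identifiers, and the three sample lines are all hypothetical toy data. It counts how often a target glyph occurs in each text and which glyphs stand immediately next to it.

```python
from collections import Counter


def summarize_hits(lines, target, window=1):
    """Aggregate search hits instead of listing every matching line.

    lines  -- iterable of (text_id, line_string) pairs
    target -- the glyph or phrase being searched for
    window -- how many glyphs on each side count as neighbours
    """
    per_text = Counter()    # occurrences of target in each text
    neighbours = Counter()  # glyphs found within the window around target
    for text_id, line in lines:
        start = line.find(target)
        while start != -1:
            per_text[text_id] += 1
            end = start + len(target)
            left = line[max(0, start - window):start]
            right = line[end:end + window]
            neighbours.update(left + right)  # counts each glyph separately
            start = line.find(target, end)
    return per_text, neighbours


if __name__ == "__main__":
    # Hypothetical sample: identifiers and lines are made up for illustration.
    sample = [
        ("text-A", "如是我聞一時佛在"),
        ("text-A", "佛告須菩提"),
        ("text-B", "爾時佛告舍利弗"),
    ]
    per_text, neighbours = summarize_hits(sample, "佛")
    print(per_text.most_common())
    print(neighbours.most_common())
```

Counts of this kind are only a starting point; they show where a term is concentrated and what accompanies it, but they do not replace a visual display of the results.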
We are in need of new ways to display
search results that will allow scholars to
quickly perceive such things as the patterns
of occurrences, examples of clustering, views
of target words with adjacent companion
words, graphic models of profiles of sequence,


Conference Info

Complete

ADHO - 2010
"Cultural expression, old and new"

Hosted at King's College London

London, England, United Kingdom

July 7, 2010 - July 10, 2010

Conference website: http://dh2010.cch.kcl.ac.uk/

Series: ADHO (5)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None