Aspects of the Interoperability in the Digital Humanities: A Case Study in Buddhist Studies

poster / demo / art installation
Authorship
  1. Kiyonori Nagasaki

    Yamaguchi Prefectural University

  2. A. Charles Muller

    University of Tokyo

  3. Masahiro Shimoda

    University of Tokyo

Work text

1. Introduction
In considering cases of interoperability in the Digital
Humanities, it might be useful to focus on an example
from Buddhist Studies for three reasons. The first is the use
of the multilingual Buddhist canons, which were written
in Pali and Sanskrit and translated into Chinese, Tibetan,
and other languages before the 10th century. Thus, a system had to
be established that could handle resources
composed not only in various families of written
characters but also in various languages. The second is that
they are significant as resources in history, linguistics,
and other academic fields because they include not only
Buddhist ideas but also information related to the ancient
world of which Buddhism was a part. Thus, interoperability
in this field is needed in order to synthesize such
related fields. The third reason to focus on this case study
is that digitization projects in the field of Buddhist Studies
are in progress worldwide, with both academic and
proselytizing purposes. Moreover, many projects house
sub-projects with differing aims: one might aim
merely to retrieve information on a local computer, while
another might aim to publish critical editions
of digitized canonical texts on the Web. Discussion
of this need for interoperability has already begun in
the organization known as the Electronic Buddhist Text
Initiative [EBTI]. We discuss some aspects of this need
through the case study that follows.
2. A Case Study: SAT and the DDB
In this section, we discuss the interoperability between
two different projects in which we are engaged:
the DDB (Digital Dictionary of Buddhism, http://www.buddhism-dict.net/ddb/)
and the SAT (SAT Daizōkyō Text Database Committee, http://21dzk.l.u-tokyo.ac.jp/SAT/).
The DDB is a web-serviced lexicon that includes
over 45,000 entries. The SAT project has digitized a
scholarly edition of the Chinese Buddhist canon, consisting of approximately 150 million Chinese characters
in eighty-five volumes. The edition was compiled and edited by
Japanese scholars in the Taishō Era and has been treated
as the de facto standard text in the field of Buddhist studies
since then.
2.1. The DDB: The Digital Dictionary of Buddhism
The DDB was developed for the study of Buddhist texts
written in classical Chinese and other East Asian languages
that include Chinese character-based terminology.
This project was initiated in 1986 by Charles Muller,
a specialist in East Asian Buddhism. In 1995, with the
advent of the Internet, Muller converted his data set into
HTML format and placed it on the Web. Between
1996 and 2000, the storage format of the dictionaries
was changed to SGML, and then to XML. In 2001,
with the help of humanities computing guru Michael
Beddow, the dictionaries were reset on the web in XML
format with a search engine, and this structure remains
in place down to the present day. The DDB features Buddhist
terms, texts, schools, temples, and persons. Entries
range in scope from short glossary-type entries to full-length
encyclopedic articles. Now supported by more than sixty
collaborators with specialist expertise in a wide range
of areas in Buddhist studies, the expansion rate of the
DDB has been exponential in recent years.
A distinctive feature of the DDB is its use of XML
attributes to credit contributors for their work at the
level of entry sub-areas (XML "nodes") rather than only
at the level of full entries, as seen in standard printed
works. Furthermore, the relatively accessible XML tag
structure (based loosely on the TEI model) has made it
possible to integrate and interlink the DDB with other
lexicons (such as EDict), and external text databases,
such as the SAT text database.
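The node-level attribution described above can be illustrated with a minimal sketch. The element names (entry, hdwd, sense), the resp attribute, and the contributor identifiers are hypothetical and chosen only for illustration; they are not the DDB's actual schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical entry: element names, the "resp" attribute, and the
# contributor ids are illustrative assumptions, not the DDB's real markup.
SAMPLE = """
<entry id="e1">
  <hdwd>placeholder headword</hdwd>
  <sense resp="contributor-a">First sense (placeholder text).</sense>
  <sense resp="contributor-b">Second sense (placeholder text).</sense>
  <sense resp="contributor-a">Additional note (placeholder text).</sense>
</entry>
"""


def credits_by_contributor(xml_text):
    """Count how many sub-entry nodes are credited to each contributor."""
    root = ET.fromstring(xml_text.strip())
    counts = defaultdict(int)
    for node in root.iter():
        resp = node.get("resp")  # None when the node carries no credit
        if resp is not None:
            counts[resp] += 1
    return dict(counts)


credits = credits_by_contributor(SAMPLE)
```

Because the credit is attached to each node rather than to the entry as a whole, a single entry can acknowledge several contributors, each for the part they wrote.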
2.2. SAT: The SAT Daizōkyō Text Database Committee
SAT is managed by the SAT Daizōkyō Text Database
Committee (directed at present by Masahiro Shimoda).
The database depends on an XML-like legacy scheme
which was designed in 1998 and superficially represents
the pages of the edition. The textual corpus was digitized
and corrected mainly by about 200 young Buddhist
scholars over a period of about ten years. While descriptive
markup has not yet been fully implemented, the
locations of passages in the source texts, such as page,
paragraph, and line, were precisely recorded so that the
traditional citation practices of Buddhist studies could be
followed transparently. The database has been available on the Web
since April 2008.
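Recording the physical location of every line makes it possible to address a passage by its position in the printed edition. The following is a minimal sketch of such a location record and of selecting a range of lines. The field layout, the single-letter segment label, and the reference format "12:0345b07" are assumptions made for illustration and do not reproduce the SAT scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Location:
    """A physical position in the printed edition (field layout is assumed)."""
    volume: int
    page: int
    segment: str  # assumed single-letter label for the paragraph on a page
    line: int


def parse_location(ref):
    """Parse a hypothetical reference such as '12:0345b07' into a Location."""
    volume, rest = ref.split(":")
    return Location(int(volume), int(rest[:4]), rest[4], int(rest[5:]))


def clip(lines, start, end):
    """Return the text of all lines located between start and end, inclusive."""
    return [text for loc, text in sorted(lines.items()) if start <= loc <= end]


# Placeholder corpus: four consecutive lines spanning two segments of one page.
corpus = {
    parse_location("12:0345a28"): "line A",
    parse_location("12:0345a29"): "line B",
    parse_location("12:0345b01"): "line C",
    parse_location("12:0345b02"): "line D",
}
passage = clip(corpus, parse_location("12:0345a29"), parse_location("12:0345b01"))
```

Ordering the fields as volume, page, segment, and line lets ordinary comparison follow the reading order of the edition, so a range is simply everything between two locations.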
2.3. The Interoperation between the DDB and the SAT
When the SAT Web service was started, it provided some
interoperation with other related projects. The most important
service provided is the search function for the
entries of the DDB. The function adopts AJAX so that
users can retrieve items transparently on their Web
browsers. If users select a portion of the text with the
mouse, they can view all terms in the selection that are contained
in the DDB. The service is convenient for those
who are interested in the text, especially beginners in
Buddhist studies. In addition, the SAT Web service distributes
some APIs. One of them is a reference service
based on physical location. It provides a function to
clip an arbitrary part of the texts by means of specifying
the location or the range in a certain URI. The DDB and
some other Web services adopt this API. The important
merit of this interoperation is not only the usability, but
also the management structure that allows each long-term
project to be sustained independently. Managing a large
digital project is so difficult that it may sometimes
be disrupted, crash, or even disappear. Although we
may wish to expand our services, they often become
unwieldy, and the project may even end up abandoned.
As a way of enhancing Web services to support researchers,
interoperation with other projects is therefore a
workable alternative in the field of Buddhist studies.
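The term lookup on a selected portion of text can be sketched as a scan for every dictionary headword that occurs in the selection. This is only an illustration of the lookup logic, not the implementation of the SAT service: the function name, the max_len parameter, and the small headword set are assumptions, and the set is not drawn from the DDB's data.

```python
def find_terms(selection, headwords, max_len=8):
    """Return (offset, term) pairs for every headword occurring in the selection.

    Every substring of up to max_len characters is checked against the
    headword set, so overlapping and nested terms are all reported.
    """
    hits = []
    for start in range(len(selection)):
        longest_end = min(len(selection), start + max_len)
        for end in range(start + 1, longest_end + 1):
            candidate = selection[start:end]
            if candidate in headwords:
                hits.append((start, candidate))
    return hits


# A small illustrative headword set (assumed for this sketch).
headwords = {"般若", "波羅蜜", "般若波羅蜜", "菩薩"}
hits = find_terms("菩薩行深般若波羅蜜多時", headwords)
```

Reporting nested terms as well as the longest match suits readers who are new to the texts, since both a compound and its components may have entries worth consulting.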
3. Other Examples of Interoperation
SAT interoperates with some other projects such as
CHISE and INBUDS. CHISE is a character ontology, focusing
mainly on Chinese characters, that is distributed under the GNU
GPL. SAT adopted CHISE to serve as a thesaurus of Chinese
characters in order to support its retrieval system.
INBUDS is a bibliographical database for the study of
Indian philosophy and Buddhism. It is maintained by
the Japanese Association of Indian and Buddhist Studies
and includes 60,000 records collected over
twenty years. SAT implemented interoperation with
INBUDS so that users could refer to related academic
resources. Some of the resources in INBUDS
are distributed as digital data.
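The role of a character thesaurus in retrieval can be sketched as query expansion over variant characters. The variant table below is a tiny hand-written assumption, not CHISE data, and the function names are hypothetical; the sketch shows only the general technique of matching a query in any of its variant spellings.

```python
from itertools import product

# Hand-written variant groups for illustration only (not taken from CHISE).
VARIANTS = [{"佛", "仏"}, {"經", "経"}]


def variant_forms(query, variant_groups):
    """Expand a query into every spelling obtained by swapping variant characters."""
    lookup = {}
    for group in variant_groups:
        for ch in group:
            lookup[ch] = sorted(group)
    # Each character contributes either its variant group or itself alone.
    choices = [lookup.get(ch, [ch]) for ch in query]
    return {"".join(combo) for combo in product(*choices)}


def search(query, lines, variant_groups):
    """Return the lines that contain the query in any of its variant spellings."""
    forms = variant_forms(query, variant_groups)
    return [line for line in lines if any(form in line for form in forms)]


lines = ["佛說阿彌陀經", "仏説阿弥陀経", "大智度論"]
matches = search("經", lines, VARIANTS)
```

Expanding the query rather than normalizing the stored text leaves the digitized edition untouched, which matters when the text must stay faithful to the printed source.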
4. Some Problems in Interoperability
As discussed previously, in the field of Buddhist
studies, interoperability is quite effective in some ways.
On the other hand, all-too-common problems of interoperability
are also found in the field. One of the important
issues is that of organizational sustainability. If one
of the participating organizations stops distributing its
resources, the interoperation ends. However, it
can be argued from another perspective that this is actually an advantage, because it makes partners readily aware
of systematic changes, allowing managers to adapt as
necessary, for example, when one needs to salvage distributed
data. Indeed, SAT has recently begun to support
INBUDS because, after the interoperation was established,
it became clear that the programmers of INBUDS had
been facing problems with managing their data for
some time. Thus, especially in the case of a personal
project, it is more secure to establish interoperation.
5. Conclusion
As with “the fourth generation collections” in the field
of Western classics, digitization projects for Buddhist
studies are gradually shifting toward a next
generation that puts emphasis on interoperability. Interoperability
not only exposes the problems inherent in
the activities of the digitization of Buddhist studies, but
also shows the ways to solve those problems. The same
model will eventually hold true for the Humanities in
general.
References
Muller, A. Charles. (2008). EBTI After 15 and CBETA
after 10 Years: Joint International Conference on Digital
Buddhist Studies, Chair's Report.
http://buddhism-dict.net/ebti/ebti2008report.html
(accessed 12 November 2008).
Muller, A. Charles. (2008). The Digital Dictionary of
Buddhism [DDB]: Present Status and Future Developments,
The Ninth Annual Symposium for Scholars
Resident in Japan, March 2008.
http://www.acmuller.net/articles/ddb-nichibunken-200803.html
(accessed 12 November 2008).
Crane, G. (2008). Fourth Generation Collections: TEI,
FRBR, and Canonical Text Services, TEI Member’s
Meeting 2008, Nov 2008.
http://www.cch.kcl.ac.uk/cocoon/tei2008/programme/abstracts/abstract-160.html
(accessed 12 November 2008).
Rehm, G. and Witt, A. (2008). Aspects of Sustainability
in Digital Humanities, Digital Humanities 2008, June
2008: 21-29.
Nagasaki, K. and Shimoda, M. (2008). Outline of the
Activities of the SAT Project, Joint International Conference
on Digital Buddhist Studies, at Dharma Drum
Buddhist College, February 2008: 22-23.
Nagasaki, K. (2008). A Collaboration System for the
Philology of the Buddhist Study, Digital Humanities
2008: 262-263.
Morioka, T. (2006). Character processing based on
character ontology, IPSJ Technical Report, 2006-CH-072: 25-32.
Eide, Ø., Ore, C. and Holmen, J. (2008). Sustainability
in Cultural Heritage Management, Digital Humanities
2008: 22-23.


Conference Info

Complete

ADHO - 2009

Hosted at University of Maryland, College Park

College Park, Maryland, United States

June 20, 2009 - June 25, 2009

176 works by 303 authors indexed

Series: ADHO (4)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None