The AXE Tool Suite: Tagging Across Time and Space

Authorship
  1. Doug Reside

    Maryland Institute for Technology in the Humanities (MITH) - University of Maryland, College Park

Work text

Since the codification of the Text Encoding Initiative
standards in the mid-1990s, the process of creating digital
editions and archives has largely been one of "marking up"
existing texts in XML. Originally, this was often done "by
hand": scholars would add XML tags to existing text documents
using a text editor, a tedious process often fraught with errors.
Over the last decade several tools have been produced which
make this process somewhat more efficient and accurate, though
most still require more than a beginner's familiarity with XML
encoding, and few are open source. Moreover, many digital
humanities projects have, of late, become far more
multimedial, relying on image, video, and audio files as well
as text. Existing markup tools have only begun to work with
these non-textual artifacts. As digital archives continue to grow,
the markup tools used to encode them must become more
flexible and easier to use. The Ajax XML Encoder (AXE),
developed at the Maryland Institute for Technology in the
Humanities (MITH), is intended to be the tool suite that meets
this need.
AXE is a free tool suite that will allow users with limited
technical skills to deeply tag text, images, video, and audio files
for inclusion in digital archives. The program combines and
extends the functionality of proprietary online tools such as
YouTube and Flickr in a (mostly) open-source [1], web-based
platform for scholarly use. The tagging tool is designed in
AJAX and Macromedia Flash and generates and uses a MySQL
database. Like the mythological Ajax's axe, MITH's AXE
provides users with enough power and flexibility to accomplish
their tasks without a great deal of assistance from higher powers
(the technical or professorial "gods" graduate student workers
must often invoke when using earlier encoding tools).
Users of this tool are divided into two classes: managing
editors and editor-users. Managing editors are permitted to
define the sorts of tags all users can use and are able to
remove items from the database. All users, however, are able
to add new multimedia content to the database, tag it, and
search for preexisting content in the database.
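To make this division of labor concrete, the following TypeScript
sketch models the two user classes and their permissions. The role
names, action names, and the isPermitted helper are illustrative
assumptions; this abstract does not describe AXE's actual data model.

    // Illustrative sketch only: the roles, actions, and fields below
    // are assumptions, not AXE's published data model.
    type Role = "managing-editor" | "editor-user";

    interface User {
      id: number;
      name: string;
      role: Role;
    }

    // Actions described above: all users may add, tag, and search content;
    // only managing editors may define tag vocabularies or remove items.
    type Action =
      | "add-content"
      | "tag-content"
      | "search"
      | "define-tags"
      | "remove-item";

    function isPermitted(user: User, action: Action): boolean {
      const editorOnly: Action[] = ["define-tags", "remove-item"];
      if (editorOnly.includes(action)) {
        return user.role === "managing-editor";
      }
      return true; // adding, tagging, and searching are open to all users
    }

    // Example: an editor-user may tag content but may not remove items.
    const contributor: User = { id: 42, name: "editor-user-42", role: "editor-user" };
    console.log(isPermitted(contributor, "tag-content")); // true
    console.log(isPermitted(contributor, "remove-item")); // false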
Tagging is naturally accomplished in different ways depending
on the medium. For sound files, an MP3 file is played through
a Flash plug-in. When the user pauses the sound file once, the
start time is recorded; when the user pauses the file again,
the end time is recorded, and the user is presented with a
web-based form for tagging the selection. If appropriate, a
textual transcript of the sound with corresponding times can
be made.
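As a rough illustration of this pause-to-mark interaction, the sketch
below records start and end times from successive pause events. It
uses the standard HTMLAudioElement rather than AXE's Flash plug-in,
and the element id and callback are hypothetical.

    // Sketch of the pause-to-mark interaction using the standard
    // HTMLAudioElement; AXE itself plays MP3s through a Flash plug-in.
    interface AudioSegment {
      start: number; // seconds into the file
      end: number;
      tags: string[];
    }

    function watchForSegments(
      audio: HTMLAudioElement,
      onSegmentSelected: (segment: AudioSegment) => void
    ): void {
      let pendingStart: number | null = null;

      audio.addEventListener("pause", () => {
        if (pendingStart === null) {
          // First pause: remember where the selection begins.
          pendingStart = audio.currentTime;
        } else {
          // Second pause: the selection is complete; hand it to the tagging form.
          onSegmentSelected({ start: pendingStart, end: audio.currentTime, tags: [] });
          pendingStart = null;
        }
      });
    }

    // Usage: the element id and the logging callback are hypothetical.
    const player = document.querySelector<HTMLAudioElement>("#axe-audio");
    if (player) {
      watchForSegments(player, (segment) => {
        console.log(`Tag selection from ${segment.start}s to ${segment.end}s`);
      });
    }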
Providing a tool for image tagging was more difficult.
Although popular websites like MySpace and Facebook provide
some models for folksonomic image tagging, the tag-spaces are
usually limited to regular polygons (like squares) and do not
store the resulting data in a form that can be easily shared.
A better model for image tagging is the ARCHway Project's
"Image Tagger," described in the June 2006 issue of the
International Journal on Digital Libraries by its designers
Dekhtyar, Iacob, Jaromczyk, Kiernan, Moore, and Porter. The
program is hindered, though, by the clumsy
Java-based interface that requires users to install and learn
specialized software. Our program uses an AJAX website which
allows the user to add points to an image map (represented on
the image itself via a 1 pixel div element with a red border)
[see figure 1]. Once the image map is drawn, the user can
describe it with tags defined by the editor (or editors) of the
project. The entire image can also be described in this way.
Figure 1
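The point-based image map described above might be approximated
along these lines: each click on the image drops a 1-pixel,
red-bordered div at the clicked location and records its coordinates.
The function names and styling are assumptions made for illustration,
not AXE's actual markup.

    // Sketch of point-based image tagging: each click drops a 1-pixel,
    // red-bordered div over the image and records the coordinates. The
    // image's parent element is assumed to be positioned relatively.
    interface Point {
      x: number;
      y: number;
    }

    function enablePointTagging(
      image: HTMLImageElement,
      onMapComplete: (points: Point[]) => void
    ): () => void {
      const points: Point[] = [];

      const handleClick = (event: MouseEvent) => {
        const rect = image.getBoundingClientRect();
        const point = { x: event.clientX - rect.left, y: event.clientY - rect.top };
        points.push(point);

        // Visual marker for the clicked point.
        const marker = document.createElement("div");
        marker.style.cssText =
          "position:absolute;width:1px;height:1px;border:1px solid red;" +
          `left:${point.x}px;top:${point.y}px;`;
        image.parentElement?.appendChild(marker);
      };

      image.addEventListener("click", handleClick);

      // The returned function ends the drawing session and reports the map.
      return () => {
        image.removeEventListener("click", handleClick);
        onMapComplete(points);
      };
    }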
Video tagging, as one might expect, uses a combination of the
image and audio tagging methods. The video is first converted
to Flash video format using FFmpeg. The audio portion of the
movie is handled in exactly the same way as stand-alone audio.
The visual portion of the movie is handled with a combination
of the techniques used for images and sound. The user can tag,
for instance, the start and end times of a particular segment and
add metadata to this portion. AXE even allows users to tag
images within the frames of movie files. If the user wishes,
for instance, to tag a tree in the background of a shot, the
user first marks the time period in which the appropriate
image appears.
A user-defined number of the frames from this segment are
then stacked and rendered translucent. With some adjustments,
the user can then click a series of coordinates which define the
selection over the space of the segment. This process is rendered
through JavaScript and Macromedia Flash.
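A minimal sketch of the translucent frame stack follows, assuming the
frames for the chosen segment have already been rendered to still
images (for example with an FFmpeg command along the lines of
ffmpeg -i movie.flv -vframes 10 frame%03d.png); the container element
and frame URLs are placeholders.

    // Sketch of the translucent frame stack: still images extracted from
    // the selected segment are layered so the user can click coordinates
    // that hold across the whole segment.
    function stackFrames(container: HTMLElement, frameUrls: string[]): void {
      container.style.position = "relative";

      frameUrls.forEach((url, index) => {
        const frame = document.createElement("img");
        frame.src = url;
        // Each frame is rendered translucent so motion across the segment
        // remains visible through the stack.
        frame.style.cssText =
          "position:absolute;left:0;top:0;opacity:0.2;" + `z-index:${index};`;
        container.appendChild(frame);
      });
    }

    // Usage with placeholder frame URLs; the real frames would come from
    // the segment the user marked.
    const stage = document.querySelector<HTMLElement>("#frame-stack");
    if (stage) {
      stackFrames(stage, ["frame001.png", "frame002.png", "frame003.png"]);
    }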
Once all the tags have been created, they are parsed into a
MySQL database (the XML is generated first so that users can
work offline and feed the XML into the database later).
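The offline-first workflow might look roughly like the sketch below:
annotations are serialized to XML so that an editor can keep working
without a connection, and each annotation later becomes a
parameterized MySQL insert. The element, table, and column names are
invented; this abstract does not specify AXE's schema.

    // Sketch of the offline-first annotation workflow; element, table, and
    // column names are invented for illustration.
    interface Annotation {
      objectId: number; // the tagged multimedia object
      tag: string;
      start?: number;   // optional time bounds for audio/video selections
      end?: number;
    }

    // Serialize annotations so an editor can keep working without a connection.
    function toXml(annotations: Annotation[]): string {
      const rows = annotations
        .map(
          (a) =>
            `  <annotation object="${a.objectId}" tag="${a.tag}"` +
            (a.start !== undefined && a.end !== undefined
              ? ` start="${a.start}" end="${a.end}"`
              : "") +
            " />"
        )
        .join("\n");
      return `<annotations>\n${rows}\n</annotations>`;
    }

    // Later, each annotation becomes a parameterized insert against MySQL.
    function toInsert(a: Annotation): {
      sql: string;
      params: (string | number | null)[];
    } {
      return {
        sql: "INSERT INTO annotations (object_id, tag, start_time, end_time) VALUES (?, ?, ?, ?)",
        params: [a.objectId, a.tag, a.start ?? null, a.end ?? null],
      };
    }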
This database can be searched through traditional keyword or
tag-cloud searches, but the interface also allows guided browsing.
When the user views a multimedia object, the user is also
presented with a series of "related" objects (after the
e-commerce model in which customers are presented with a
series of products related to one they are currently viewing). If
the user selects a sub-element in the document (perhaps
something like a tree in the background of a photograph), the
"related" documents will change to reflect relationships centered
on the user's choice. The user can also set global limits on
which relations are presented (e.g., related documents must be
dated after the current document). New tags and
documents from remote sites can also be added to the database
by editor-users. Later users can decide whose tags they trust
and can block tags added by anonymous or unreliable taggers.
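To suggest how such guided browsing could be implemented, the sketch
below ranks candidate objects by the number of trusted tags they share
with the current object, then applies the user's global limits (the
date rule mentioned above) and tagger blocks. All field names are
assumptions made for illustration.

    // Sketch of guided browsing: rank candidates by trusted shared tags and
    // apply global limits. Field names are assumptions, not AXE's schema.
    interface ArchiveObject {
      id: number;
      date: string; // ISO date, e.g. "1976-06-16"
      tags: { label: string; taggerId: number }[];
    }

    interface GlobalLimits {
      onlyAfter?: string;        // only suggest documents dated after this date
      blockedTaggers?: number[]; // taggers whose contributions the user ignores
    }

    function relatedObjects(
      current: ArchiveObject,
      candidates: ArchiveObject[],
      limits: GlobalLimits
    ): ArchiveObject[] {
      const blocked = new Set(limits.blockedTaggers ?? []);
      const after = limits.onlyAfter;
      const currentTags = new Set(
        current.tags.filter((t) => !blocked.has(t.taggerId)).map((t) => t.label)
      );

      return candidates
        .filter((c) => c.id !== current.id)
        .filter((c) => after === undefined || c.date > after)
        .map((c) => ({
          object: c,
          // Score by how many trusted tags are shared with the current object.
          shared: c.tags.filter(
            (t) => !blocked.has(t.taggerId) && currentTags.has(t.label)
          ).length,
        }))
        .filter((entry) => entry.shared > 0)
        .sort((a, b) => b.shared - a.shared)
        .map((entry) => entry.object);
    }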
AXE is currently being used in a project headed by Angel David
Nieves to create a multi-perspective history of a 1976 student
uprising in Soweto, South Africa. Dr. Nieves has an enormous
wealth of multimedia primary documents surrounding the event
and hopes that this mass of materials will provide new ways
of telling the story. We needed, however, to process and
present this material in an unbiased way that allows for
multiple interpretations of the events. AXE is being used to
create an interface which, as much as possible, leaves the
arrangement and interpretation of the materials up to the
individual users.
AXE may eventually prove essential in the creation of digital
library archives. Although academic libraries are now fairly
adept at the digital preservation of textual material, few libraries
provide searchable digital archives of sound and video. As the
average available bandwidth of both users and institutions
increases, software will be needed to allow the cataloging and
access of this material. AXE seeks to fill this need.
[1] Although all code written at MITH will be published on the
web, the use of Adobe's Flash prevents the project from being
truly open source.


Conference Info

Complete

ADHO - 2007

Hosted at University of Illinois, Urbana-Champaign

Urbana-Champaign, Illinois, United States

June 2, 2007 - June 8, 2007

106 works by 213 authors indexed

Series: ADHO (2)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None