Conceptual Modeling of Similarities and Duplication in Large-Scale Digital Libraries

poster / demo / art installation
  1. Maggie Ryan

    University of Denver

  2. Lindsay Gypin

    University of Denver

  3. Krystyna K. Matusiak

    University of Denver

  4. Benjamin M. Schmidt

    New York University

  5. Peter Organisciak

    University of Denver

Work text
Large-scale digital libraries, such as the HathiTrust Digital Library (HTDL) and the Internet Archive, have emerged consortially, collecting works from institutions around the world. This has led to uneven duplication: some works recur many times in the collections, while others exist in only one copy. The Massive Text Lab at the University of Denver is researching levels of ‘sameness’ and duplication of works within these digital libraries through massive-scale analysis. We will discuss applications to modern cataloging standards and provide an overview of the problem and intricacies of duplication, the solutions the project is pursuing, and the value that our work provides in framing material relationships for future humanities scholarship.

Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus; an online conference was held. Data for this conference were initially prepared and cleaned by May Ning.

Series: ADHO (15)

Organizers: ADHO