Data Downloads

We offer two categories of data download: a simple CSV that contains one row per work in this database, which is useful for some simple analyses; and a full data dump that contains multiple CSVs representing our data in its full complexity.

Twice yearly, both of these data downloads will be deposited in Carnegie Mellon University's institutional repository. Those versioned deposits are available at https://doi.org/10.1184/R1/12987959; if you use one, cite it with its own DOI and datestamp. To cite the most recently updated version of the dataset, use the suggested citation below.

How to cite these data downloads:
Weingart, S.B., Eichmann-Kalwara, N., Lincoln, M., et al. "DH Conferences Data Extract" in The Index of Digital Humanities Conferences. Carnegie Mellon University, 2020. Data last updated 2024-03-19. https://dh-abstracts.library.cmu.edu. https://doi.org/10.34666/k1de-j489

Warning: We strongly discourage using Microsoft Excel to open these files, as that software is easily confused by the multiline values and UTF-8 characters contained in some columns. We recommend R, Python, OpenOffice, or Google Sheets.
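For readers taking the Python route, here is a minimal sketch of reading a file with multiline, UTF-8 values using only the standard library. The sample text and its column subset are invented for illustration; the file name in the closing comment is hypothetical, since the name of the CSV inside the zip is not stated here.

```python
import csv
import io


def read_rows(handle):
    """Parse an open CSV text stream into a list of dicts, one per record."""
    return list(csv.DictReader(handle))


# Invented sample standing in for the real file: a single record whose
# full_text value spans two lines and whose title has non-ASCII characters.
sample = (
    "work_id,work_title,full_text\r\n"
    '1,"Édition numérique","First line\nSecond line"\r\n'
)

# newline="" leaves line endings untouched so the csv module can keep
# line breaks that sit inside quoted values.
rows = read_rows(io.StringIO(sample, newline=""))
print(len(rows))                         # one record, not two
print(rows[0]["full_text"].splitlines())

# With a real file (hypothetical name), the same options apply:
# with open("simple_download.csv", newline="", encoding="utf-8") as handle:
#     rows = read_rows(handle)
```

If a full_text value is longer than the csv module's default field size limit, csv.field_size_limit() can be used to raise that limit before reading.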

Simple CSV

Download single CSV (zipped) (Last updated: 2024-03-19 2:22 a.m. EDT)

This file contains a simplified version of our database arranged with one row per "work" (be it a keynote, a paper, a panel session, etc.). Each row also includes associated conference information (name, location, and date), author names, and keyword/topic tags.

This CSV does not contain more complex related information such as changing author names or detailed affiliation information. For that, you will want to look at the full relational database available below.

Simple data dictionary

Required fields marked with an asterisk (*)
Name Description
*work_id Unique ID number
*conference_label Conference label
conference_short_title Location-based short title of the conference
conference_theme_title Thematic conference title
*conference_year
conference_organizers Conference organizing groups (separated by a semicolon)
conference_series Conference series (separated by a semicolon)
conference_hosting_institutions Hosting institutions (separated by a semicolon)
conference_city
conference_state
conference_country
conference_url URL for the conference program or abstracts
work_title Work title
work_url Direct URL for the work abstract if it exists
*work_authors Named authors (separated by a semicolon)
*work_type Type of work (e.g. keynote, multipaper session, poster)
full_text The full text of the work, when indexed and licensed for republication. If the full text of the work has not yet been indexed into this system, OR if the text is not licensed for republication, this field will be blank.
full_text_type Work full text will either be available as plain text (txt) or as TEI (xml). When full text is not available to be shared, this field will be blank.
full_text_license License of this full text, when it is known.
parent_work_id ID of a multipaper session or panel session that this work belongs to
keywords Author-submitted keywords (separated by a semicolon)
languages Language(s) of the abstract (separated by a semicolon)
topics Topics from a conference-specific controlled vocabulary (separated by a semicolon)
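
Several fields in the dictionary above pack multiple values into one cell, separated by a semicolon. A minimal sketch of unpacking such a cell in Python follows. The helper name split_multi and the sample row are invented for illustration, and the sketch assumes that no individual value contains a semicolon itself.

```python
def split_multi(value):
    """Split a semicolon-separated cell into a list of trimmed, non-empty items."""
    return [item.strip() for item in (value or "").split(";") if item.strip()]


# Invented row using column names from the simple data dictionary.
row = {
    "work_authors": "Ada Example; Ben Sample",
    "keywords": "",
    "topics": "text mining;maps",
}

print(split_multi(row["work_authors"]))
print(split_multi(row["keywords"]))
print(split_multi(row["topics"]))
```

Trimming whitespace means the helper behaves the same whether or not the separator is followed by a space.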

Full Data

Download multiple CSVs (zipped) (Last updated: 2024-03-19 2:22 a.m. EDT)

This download contains one CSV for each of the core tables in our database, and can be used to do more complex analyses such as tracking institutional affiliations across many different years of conferences.
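
As a starting point for such analyses, the sketch below loads every CSV in a zip archive into memory with the Python standard library. The function name load_tables is invented, and the archive here is built in memory with made-up rows so the example runs on its own; with the real download you would pass the path of the zip file you saved instead.

```python
import csv
import io
import zipfile


def load_tables(zip_source):
    """Return {member name: list of row dicts} for every .csv in the archive."""
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Decode as UTF-8 and leave line endings alone so quoted
                # multiline values survive.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


# Build a tiny stand-in archive in memory (invented rows).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("works.csv", "id,conference,title\r\n1,10,Example title\r\n")
    archive.writestr("authors.csv", "id\r\n7\r\n")
buffer.seek(0)

tables = load_tables(buffer)
print(sorted(tables))
print(tables["works.csv"][0]["title"])
```

Every value comes back as a string, including IDs, so foreign keys can be matched against id columns by plain string comparison.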

Full data dictionary

Required fields marked with an asterisk (*)
works.csv

A record for a single work such as a paper, keynote, or session.

Name Type Description
*id AutoField
*conference ForeignKey The conference where this abstract was submitted/published. (Related model: Conference)
*title CharField Abstract title
work_type ForeignKey Abstracts may belong to one type that has been defined by editors based on a survey of all the abstracts in this collection, e.g. "poster", "workshop", "long paper". (Related model: WorkType)
full_text TextField Full text content of the abstract, including references, but excluding authorship information.
full_text_type CharField Format of the full text (currently either plain text, or XML)
full_text_license ForeignKey License of this full text, when known (Related model: License)
url CharField URL where the full text of this specific abstract can be freely accessed
parent_session ForeignKey If this work was part of a multi-paper organized session, this is the entry for the parent session (Related model: Work)
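
The parent_session field makes it possible to rebuild sessions from individual works. The sketch below groups child works under their parent session. It assumes the CSV headers match the field names listed above and that a work with no parent has a blank parent_session cell; the sample rows are invented.

```python
from collections import defaultdict


def children_by_session(works):
    """Map each parent session ID to the IDs of the works presented in it."""
    groups = defaultdict(list)
    for work in works:
        parent = work.get("parent_session")
        if parent:  # a blank value means the work is not part of a session
            groups[parent].append(work["id"])
    return dict(groups)


# Invented rows shaped like works.csv (only the columns used here).
works = [
    {"id": "1", "title": "Panel", "parent_session": ""},
    {"id": "2", "title": "Paper A", "parent_session": "1"},
    {"id": "3", "title": "Paper B", "parent_session": "1"},
    {"id": "4", "title": "Poster", "parent_session": ""},
]

print(children_by_session(works))
```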

authors.csv

A person who has authored at least one abstract in this database. All attributes of the author are established in the context of a given work, so authors have no inherent/immutable attributes beyond this unique identifier.

Name Type Description
*id AutoField

conferences.csv

A scholarly event with organized presentations, such as a conference, symposium, or workshop.

Name Type Description
*id AutoField
*year PositiveIntegerField Year the conference was held
short_title CharField A location-based short title for the conference
notes TextField Further descriptive information
url CharField Public URL for the conference and/or conference program
theme_title CharField Optional thematic title (e.g. 'Big Tent Digital Humanities')
start_date DateField YYYY-MM-DD
end_date DateField YYYY-MM-DD
city CharField City where the conference took place
state_province_region CharField State, province, or region where the conference was held
country ForeignKey (Related model: Country)
references TextField Citations to conference proceedings
contributors TextField Individuals or organizations who contributed data about this conference
attendance TextField Summary information about conference attendance, with source links
*entry_status CharField Have all the abstracts for this conference been entered?
*program_available BooleanField Is a program for this conference available in some format for editors to input?
*abstracts_available BooleanField Are the abstracts for this conference available in some format for editors to input?
search_text CharField Any searchable text that should lead to this conference
*label CharField General label for this object

conference_organizer.csv

Many-to-many relationships between conferences and their (potentially) multiple organizers.

Name Type Description
*id AutoField
*organizer ForeignKey (Related model: Organizer)
*conference ForeignKey (Related model: Conference)

conference_hosting_institution.csv

Many-to-many relationships between conferences and the (potentially) multiple institutions that host them.

Name Type Description
*id AutoField
*conference ForeignKey (Related model: Conference)
*institution ForeignKey (Related model: Institution)

conference_series.csv

A formalized series of multiple events, such as an annual conference or recurring symposium.

Name Type Description
*id AutoField
*title CharField Full name
*abbreviation CharField Display abbreviation
notes TextField Discursive notes, generally concerning the history of this series

conference_series_membership.csv

Many-to-many relationships between conferences and the (potentially) multiple series they belong to.

Name Type Description
*id AutoField
*series ForeignKey (Related model: ConferenceSeries)
*conference ForeignKey (Related model: Conference)
number IntegerField Order of this conference within this series.
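
As an illustration of how this membership table can be used, the sketch below lists the conferences in one series in their recorded order. It assumes the CSV headers match the field names above and that a missing number appears as a blank cell; the function name and sample rows are invented.

```python
def conferences_in_series(memberships, series_id):
    """Return conference IDs for one series, ordered by their number in the series."""

    def sort_key(membership):
        number = (membership.get("number") or "").strip()
        # number is optional: unnumbered memberships sort after numbered ones.
        return (number == "", int(number) if number else 0)

    rows = [m for m in memberships if m["series"] == series_id]
    return [m["conference"] for m in sorted(rows, key=sort_key)]


# Invented rows shaped like conference_series_membership.csv.
memberships = [
    {"id": "1", "series": "3", "conference": "12", "number": "2"},
    {"id": "2", "series": "3", "conference": "11", "number": "1"},
    {"id": "3", "series": "3", "conference": "15", "number": ""},
    {"id": "4", "series": "4", "conference": "12", "number": "7"},
]

print(conferences_in_series(memberships, "3"))
```

Because one conference can belong to several series, the same conference ID may appear in the results for more than one series.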

organizers.csv

An organizer of academic events, such as a scholarly association or academic center.

Name Type Description
*id AutoField
*name CharField
*abbreviation CharField
notes TextField
url CharField

authorships.csv

Each authorship describes the relationship of an author to a given work, establishing the authors' attributes as they gave them in the official program where the work was presented.

Name Type Description
*id AutoField
*author ForeignKey The author (Related model: Author)
*work ForeignKey The work authored. (Related model: Work)
*authorship_order PositiveSmallIntegerField Authorship order (1-based indexing)
*appellation ForeignKey The appellation given by the author when they submitted this particular work. (Related model: Appellation)

authorship_affiliation.csv

Many-to-many relationships between authorships and the (potentially) multiple affiliations given by authors for a given work.

Name Type Description
*id AutoField
*authorship ForeignKey (Related model: Authorship)
*affiliation ForeignKey (Related model: Affiliation)

appellations.csv

A name belonging to an author

Name Type Description
*id AutoField
first_name CharField Given name and/or first and middle initials
last_name CharField Family name
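
Because names are attached to authorships rather than to authors, one author ID can be linked to several appellations. The sketch below collects the name variants recorded for each author. It assumes the CSV headers match the field names in authorships.csv and appellations.csv; the function name and sample rows are invented.

```python
def names_by_author(authorships, appellations):
    """Map each author ID to the set of names they have appeared under."""
    appellation_by_id = {row["id"]: row for row in appellations}
    names = {}
    for authorship in authorships:
        name = appellation_by_id[authorship["appellation"]]
        # Both name parts are optional, so skip whichever is blank.
        parts = [name.get("first_name", ""), name.get("last_name", "")]
        full_name = " ".join(part for part in parts if part)
        names.setdefault(authorship["author"], set()).add(full_name)
    return names


# Invented rows: author 7 appears under two different appellations.
authorships = [
    {"id": "1", "author": "7", "work": "100", "authorship_order": "1", "appellation": "20"},
    {"id": "2", "author": "7", "work": "101", "authorship_order": "2", "appellation": "21"},
    {"id": "3", "author": "8", "work": "101", "authorship_order": "1", "appellation": "22"},
]
appellations = [
    {"id": "20", "first_name": "A.", "last_name": "Example"},
    {"id": "21", "first_name": "Ada", "last_name": "Example"},
    {"id": "22", "first_name": "Ben", "last_name": "Sample"},
]

names = names_by_author(authorships, appellations)
print(sorted(names["7"]))
```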

institutions.csv

Institutions such as universities or research centers, with which authors may be affiliated.

Name Type Description
*id AutoField
*name CharField Institution name
city CharField City where the institution is located
state_province_region CharField State, province, or region where the institution is located
country ForeignKey Country where the institution is located (Related model: Country)

affiliations.csv

A sub-unit of an Institution, such as a center, department, library, etc.

Name Type Description
*id AutoField
department CharField The name of a department, center, or other subdivision of a larger institution
*institution ForeignKey The parent institution for this affiliation (Related model: Institution)
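
Tracking institutional affiliations across conference years means following the chain authorship_affiliation → authorships → works → conferences on one side, and authorship_affiliation → affiliations → institutions on the other. The sketch below counts authorships per institution and year. It assumes the CSV headers match the field names in this data dictionary and that each table has already been loaded as a list of row dicts; the function names, dictionary keys, and all sample rows are invented.

```python
from collections import Counter


def index_by_id(rows):
    """Index a table's rows by their id column for quick foreign-key lookups."""
    return {row["id"]: row for row in rows}


def institution_counts_by_year(tables):
    """Count authorships per (institution name, conference year)."""
    works = index_by_id(tables["works"])
    conferences = index_by_id(tables["conferences"])
    authorships = index_by_id(tables["authorships"])
    affiliations = index_by_id(tables["affiliations"])
    institutions = index_by_id(tables["institutions"])

    counts = Counter()
    for link in tables["authorship_affiliation"]:
        authorship = authorships[link["authorship"]]
        work = works[authorship["work"]]
        year = conferences[work["conference"]]["year"]
        affiliation = affiliations[link["affiliation"]]
        institution = institutions[affiliation["institution"]]
        counts[(institution["name"], year)] += 1
    return counts


# Invented rows, limited to the columns the join needs.
tables = {
    "conferences": [{"id": "1", "year": "2019"}, {"id": "2", "year": "2020"}],
    "works": [{"id": "10", "conference": "1"}, {"id": "11", "conference": "2"}],
    "authorships": [
        {"id": "100", "author": "7", "work": "10"},
        {"id": "101", "author": "7", "work": "11"},
        {"id": "102", "author": "8", "work": "11"},
    ],
    "institutions": [
        {"id": "900", "name": "Example University"},
        {"id": "901", "name": "Sample College"},
    ],
    "affiliations": [
        {"id": "50", "department": "Library", "institution": "900"},
        {"id": "51", "department": "", "institution": "901"},
    ],
    "authorship_affiliation": [
        {"id": "1", "authorship": "100", "affiliation": "50"},
        {"id": "2", "authorship": "101", "affiliation": "51"},
        {"id": "3", "authorship": "102", "affiliation": "51"},
    ],
}

counts = institution_counts_by_year(tables)
for (name, year), total in sorted(counts.items()):
    print(year, name, total)
```

This counts authorships rather than distinct people, so an author with two works at the same conference contributes twice; counting distinct author IDs instead would require collecting a set per institution and year.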

countries.csv

A controlled vocabulary of countries

Name Type Description
*id AutoField
*tgn_id CharField Canonical ID in the Getty Thesaurus of Geographic Names
*pref_name CharField Preferred label for the country sourced from the Getty TGN

keywords.csv

Author-supplied keywords describing the content of a work

Name Type Description
*id AutoField
*title CharField

works_keywords.csv

Many-to-many relationships between works and keywords

Name Type Description
*id AutoField
*work ForeignKey (Related model: Work)
*keyword ForeignKey (Related model: Keyword)

topics.csv

Conference-specific controlled vocabulary of topics

Name Type Description
*id AutoField
*title CharField

works_topics.csv

Many-to-many relationships between works and topics

Name Type Description
*id AutoField
*work ForeignKey (Related model: Work)
*topic ForeignKey (Related model: Topic)

languages.csv

Languages in which works are written

Name Type Description
*id AutoField
*title CharField
code CharField

works_languages.csv

Many-to-many relationships between works and languages

Name Type Description
*id AutoField
*work ForeignKey (Related model: Work)
*language ForeignKey (Related model: Language)

work_types.csv

Controlled vocabulary of work types, such as 'paper' or 'keynote'

Name Type Description
*id AutoField
*title CharField
*is_parent BooleanField Works of this type are considered multi-paper panels/sessions and may contain 'child' abstracts

licenses.csv

Licenses that may be applicable to full texts

Name Type Description
*id AutoField
*title CharField Full title of the license
*full_text TextField Full text of the license
*display_abbreviation CharField A short, identifiable abbreviation of the license
url CharField URL with a full description of the license
*default BooleanField Make this license the default license applied to any work whose conference has been set to show all full texts.