Digital Humanities Tech Symposium at DH2025 - Agenda
From July 14-18, 2025, DH2025 will be held at NOVA University in Lisbon, Portugal. DHTech will hold a mini-conference at DH2025, the Digital Humanities Tech Symposium. Typical DH conference presentations focus on the research, with only a slight nod to the technical details; we want to flip that format and dive more deeply into the technical aspects of the work, while still keeping it in the context of the research and domain specifics.
Agenda - Monday, July 14th, 1:30pm to 6:30pm
Session 1 - Moderator: tbd | ||
---|---|---|
1:30-1:40pm | DHTech Steering Committee Introduction | |
1:40-2:00pm | Andreas Wagner (remote): "TEI2Zenodo". TEI2Zenodo acts as a server that accepts TEI files and uploads them to the Zenodo data repository. It is meant to be part of a CI/CD pipeline, but can also be used in other ways. It goes beyond the existing GitHub-Zenodo integration by depositing individual files rather than copies of whole Git repositories. The presentation will describe the handling of the DOIs created during the Zenodo upload process, ways of using the server besides CI/CD, and the need for further development: cleaning up code and adding important further functions. | |
2:00-2:20pm | Timo Frühwirth: "tei-rdfa: A Python Utility for Extracting RDFa Data from TEI-XML Documents". The tei-rdfa Python package extracts RDF data embedded in TEI-XML documents via RDFa. Because it handles native TEI (Text Encoding Initiative) namespace declarations made through elements, this utility aims to fill a gap left by existing RDFa parsers. The tool presentation will demonstrate the package's key features and error-handling capabilities for DH researchers working with TEI+RDFa. | |
2:20-2:40pm | Gregor Middell: "Turning an XML Database Inside Out". The presentation of the DWDS' dictionary writing system, which serves as the backend of a German online dictionary accessed by 2-3 million users each month, will highlight the architectural choices and challenges encountered during its required refactoring. | |
2:40-2:50pm | Break |
Session 2 - Moderator: tbd | ||
---|---|---|
2:50-3:10pm | Robert Casties: "There and back again - how to preserve your data during migrations". Our data often needs to be migrated: from a foreign format into the database, from one database system into another, or from a dying system into an archive format. What can we do to make sure that no data is lost in the process? I will present some approaches from hard-won experience, from end-to-end statistics to bookkeeping conversions to full round-trip migration and comparison. | |
3:10-3:30pm | Benjamin Kiessling: "When Automatic Text Recognition doesn't work and how to fix it". Automatic Text Recognition is widely used in the Digital Humanities, but certain materials and scholarly practices are not well served by current methods. A look at the principal technical causes of these deficiencies, and at how current research trends in machine learning exacerbate them, will be followed by a short presentation of a text recognition tool that aims to address them. | |
3:30-3:50pm | Coffee Break |
Session 3 - Moderator: tbd | ||
---|---|---|
3:50-4:10pm | Rebecca Koeser: "Undate in Action". Undate is an ambitious, in-progress effort to develop a pragmatic Python package for computation and analysis of temporal information in humanistic and cultural data, with a particular emphasis on uncertain, incomplete, or imprecise dates and with support for multiple calendars. Undate draws on and improves implementations and data modeling from digital humanities projects at multiple institutions. This tool presentation will use an interactive code notebook to demonstrate the current functionality and capabilities of the library. The demonstration will introduce Undate and UndateInterval objects and show how they can be initialized directly with numbers or strings for dates with unknown digits, or by parsing dates written out in a supported calendar. It will also show how they can be used for comparison and calculations, including sorting, comparing precision, determining whether one date or date interval falls within or overlaps another, and calculating durations of dates and intervals. | |
4:10-4:30pm | Paul Girard: "Historical data visual exploration meets static web technologies". In this talk I will present how we created a visual exploration website to publish the [REG⋅ARTS dataset](https://regarts.huma-num.fr/) using static web technologies. The REG⋅ARTS dataset gathers the transcriptions of student registrations from the École des beaux arts de Paris between 1813 and 1968. To publish it, we designed a static website that still offers state-of-the-art exploration features such as a faceted search engine, projections on historical maps, and network visualisation, without using any server or external APIs. | |
4:30-4:50pm | Olivia Wikle: "From Metadata to Static Site: A Technical Demonstration of CollectionBuilder for Digital Exhibits". This tool demonstration will introduce CollectionBuilder (https://collectionbuilder.github.io/), an open-source framework built on Jekyll for generating static, metadata-driven digital exhibits. It will walk through the technical workflow of creating a basic site by integrating CSV metadata, digital asset files, YAML configuration, and Markdown content, then illustrate customization options such as swapping the default image viewer for a IIIF viewer. The session will touch on the framework’s modular code structure, use of embedded open-source libraries for interactivity, and approaches to local development, deployment, and long-term maintenance. | |
4:50-5:10pm | Moritz Mähr, Moritz Twente: "One Template to Rule Them All: Interactive Research Data Documentation with Quarto". We introduce the Open Research Data Template, a GitHub-based framework designed to streamline the publication and reuse of open research data through executable, interactive documentation using Quarto. By integrating narrative, metadata, and multi-programming-language code (Python, R, Julia, ObservableJS) into cohesive websites, the template lowers barriers to meaningful reuse and sustainable archiving of research workflows. We will demonstrate the template's structure, automation pipeline, and real-world applications through projects such as DigiHistCH24, Stadt.Geschichte.Basel, DHBern, and Decoding Inequality 2025. | |
5:10-5:20pm | Break |
Session 4 - Moderator: tbd | ||
---|---|---|
5:20-5:40pm | Jamie Folsom: "Extending Recogito Studio with Plugins". Recogito Studio is a new open source platform for annotation of TEI-XML text, IIIF images and manifests, and PDFs. While the software is focused on real-time collaboration, user and document management, and import and export of documents and annotations in standard formats, some adopters have needs that go beyond those core features. This talk is an introduction to the Recogito Studio plugin framework and software development kit, which makes it easy for developers to add new functionality to the software without modifying the core codebase. | |
5:40-6:00pm | Jose Hernandez: "The QuantumRandomWalks package and its use for quantum link prediction in historical citation networks". This presentation will walk users through using the QuantumRandomWalks package for quantum link prediction on historical citation networks. It will provide a humanities-friendly introduction to Qiskit and its features for developers who may want to build upon our work. | |
6:00-6:20pm | Tibor Kálmán: "Clouds for Crowds - Implementing federated AAI for the Digital Humanities". With the increase in data-driven research, Research Infrastructures such as DARIAH need to ensure secure access to the data, tools and workflows they offer. This presentation aims to highlight the necessity and advantages of implementing federated identity management and authorisation, describe the technological background of such an AAI solution in the humanities, and encourage the DHTech community to adopt the AARC Blueprint Architecture, which is supported by a Compendium being developed in the context of the AARC-TREE project. | |
6:20-6:30pm | DHTech Steering Committee Goodbye and Thank You |