Digital Humanities Tech Symposium - Agenda

by DHTech · Jun 4, 2025 · Event date: Mon Jul 14, 2025 at 1:30 PM , Lisbon, Portugal · announcement conference whatshappening event

Digital Humanities Tech Symposium at DH2025 - Agenda

From July 14-18, 2025, DH2025 will be held at NOVA University in Lisbon, Portugal. DHTech will hold a mini-conference at DH2025, the Digital Humanities Tech Symposium. Typical DH conference presentations are focussed on the research with a slight nod to the technical details; we want to flip that format and dive more deeply into the technical aspects of the work, while still keeping it in context of the research and domain specifics.

Agenda - Monday, July 14th, 1:30pm to 6:30pm

Session 1 - Moderator: Robert Casties
1:30-1:40pm	DHTech Steering Committee Introduction
1:40-2:00pm	Andreas Wagner (remote) TEI2Zenodo TEI2Zenodo acts as a server that accepts TEI files and uploads them to the Zenodo data repository. It is meant to be part of a CI/CD pipeline, but can also be used in other ways. It goes beyond the already existing GitHub-Zenodo integration by arranging for individual files to be deposits instead of copies of whole git repositories. The presentation will describe the handling of DOI identifiers that are being created in the Zenodo upload process, ways of using the server besides CI/CD, and need for further development: cleaning up code and adding important further functions.
2:00-2:20pm	Timo Frühwirth tei-rdfa: A Python Utility for Extracting RDFa Data from TEI-XML Documents The tei-rdfa Python package extracts RDF data embedded in TEI-XML documents via RDFa. Handling native TEI (Text Encoding Initiative) namespace declaration through elements, this utility aims to fill a gap left by existing RDFa parsers. The tool presentation will demonstrate the package's key features and error handling capabilities for DH researchers working with TEI+RDFa.
2:20-2:40pm	Gregor Middell Turning an XML Database Inside Out The presentation of the DWDS' dictionary writing system, which serves as the backend of a German online dictionary accessed by 2-3 million users each month, will highlight the architectural choices and challenges encountered during its required refactoring.
2:40-2:50pm	Break

Session 2 - Moderator: Julia Damerow
2:50-3:10pm	Robert Casties There and back again - how to preserve your data during migrations Our data often needs to be migrated - from a foreign format into the database, from one database system into another, or from a dying system into an archive format. What can we do to make sure that no data is lost in the processs? I will present some approaches from hard-won experience, from end-to-end statistics to bookkeeping conversions to full round-trip migration and comparison.
3:10-3:30pm	Benjamin Kiessling When Automatic Text Recognition doesn't work and how to fix it Automatic Text Recognition is widely used in the Digital Humanities but certain materials and scholarly practices are not well served by current methods. A gander through the principal technical causes of these deficiencies and how current research trends in the Machine Learning exacerbate them will be completed by a short presentation of a text recognition tool that aims to address them.
3:30pm-3:50pm	Coffee Break

Session 3 - Moderator: Jose Hernandez
3:50-4:10pm	Rebecca Koeser Undate in Action Undate is an ambitious, in-progress effort to develop a pragmatic Python package for computation and analysis of temporal information in humanistic and cultural data, with a particular emphasis on uncertain, incomplete, or imprecise dates and with support for multiple calendars. Undate draws on and improves implementations and data modeling from digital humanities projects from multiple different institutions. We propose a “Tool Presentation” of Undate, using an interactive code notebook to demonstrate current functionality and capabilities of this library. The demonstration would introduce Undate and UndateInterval objects, and show how they can be initialized directly with numbers or strings for dates with unknown digits, or by parsing dates written out in a supported calendar, and can be used for comparison and calculations, including sorting, comparing precision, determining whether one date or date interval falls within or overlaps another, and calculating durations of dates and intervals.
4:10-4:30pm	Paul Girard Historical data visual exploration meets static web technologies In this talk I will present how we created a visual exploration website to publish the [REG⋅ARTS dataset](https://regarts.huma-num.fr/) by using static web technologies. The REG⋅ARTS datasets gathers the transcriptions of students registrations from the École des beaux arts de Paris between 1813 and 1968. To publish it we designed a static website which still offers state of the art exploration features such as a faceted search engine, projections on historical maps and network visualisation without using any server nor external APIs.
4:30-4:50pm	Olivia Wikle From Metadata to Static Site: A Technical Demonstration of CollectionBuilder for Digital Exhibits This tool demonstration will introduce CollectionBuilder (https://collectionbuilder.github.io/), an open-source framework built on Jekyll for generating static, metadata-driven digital exhibits. It will walk through the technical workflow of creating a basic site by integrating CSV metadata, digital asset files, YAML configuration, and Markdown content, then illustrate customization options such as swapping the default image viewer for a IIIF viewer. The session will touch on the framework’s modular code structure, use of embedded open-source libraries for interactivity, and approaches to local development, deployment, and long-term maintenance.
4:50-5:10pm	Moritz Mähr, Moritz Twente One Template to Rule Them All: Interactive Research Data Documentation with Quarto We introduce the Open Research Data Template, a GitHub-based framework designed to streamline the publication and reuse of open research data through executable, interactive documentation using Quarto. By integrating narrative, metadata, and multi-programming-language code (Python, R, Julia, ObservableJS) into cohesive websites, the template lowers barriers to meaningful reuse and sustainable archiving of research workflows. We will demonstrate the template's structure, automation pipeline, and real-world applications through projects such as DigiHistCH24, Stadt.Geschichte.Basel, DHBern, and Decoding Inequality 2025.
5:10pm-5:20pm	Break

Session 4 - Moderator: Jeffrey Tharsen
5:20-5:40pm	Jamie Folsom Extending Recogito Studio with Plugins Recogito Studio is a new open source platform for annotation of TEI-XML Text, IIIF images and manifests and PDFs. While the software is focused on real-time collaboration, user and document management, and import and export of documents and annotations in standard formats, some adopters have needs that go beyond those core features. This talk is an introduction to the Recogito Studio plugin framework and software development kit, which makes it easy for developers to add new functionality to the software without modifying the core codebase.
5:40-6:00	Jose Hernandez The QuantumRandomWalks package and its use for quantum link prediction in historical citation networks This presentation will walk users through using the QuantumRandomWalks package for quantum link prediction on historical citation networks. It will provide a humanities-friendly intro to Qiskit and its features for developers that may want to build upon our work.
6:00-6:20pm	Tibor Kálmán Clouds for Crowds - Implementing federated AAI for the Digital Humanities With the increase in data-driven research, Research Infrastructures such as the DARIAH need to ensure secure access to the data, tools and workflows they offer. This presentation aims to highlight the necessity and advantages of implementing federated identity management and authorisation; describes the technological background of such an AAI solution in the humanities and motivates the DH-Tech community to adopt the AARC Blueprint Architecture supported by a Compendium being developed in the context of the AARC-TREE project.
6:20pm-6:30pm	DHTech Steering Committee Goodbye and Thank You