DHTech April Meetup

Rebecca will talk about her work on version four of the Princeton Geniza Project (PGP). Over the course of a two-year research partnership with Marina Rustow and the Princeton Geniza Lab, they de-siloed data (metadata and transcription text); improved researcher workflow; designed and built a new search interface; implemented a new tool (annotorious-tahqiq) for creating and editing transcriptions, designed for right-to-left (RTL) languages from the start; and incorporated IIIF images from a variety of institutions. They also have preliminary dataset exports, planned for use in an eventual dataset publication. Technical challenges include working with data from a long-running project (PGP dates back to the 1980s); mixed scripts and bidirectional text (Hebrew, Arabic, Judaeo-Arabic); dates from different historical calendars; displaying text and image together when both are optional; and the transcription workflow and data format.

You can register for the meetup here.