metadata
title: README
emoji: π
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
short_description: Phrase-level segmentation and alignment for medieval texts.
ProMeTEXT
ProMeTEXT, the Centre for PROcessing MEdieval TEXTs, develops datasets, models, and tools for the computational study of medieval and historical texts.
Our work focuses on phrase-level segmentation, multilingual alignment, and the processing of medieval textual traditions across Romance languages, Latin, and Middle English.
Resources
- Aquilign: a multilingual aligner for historical and philological corpora.
- Aquilign Multilingual Segmenter: a Hugging Face model for phrase-level segmentation of historical texts.
- Aquilign Explorer: a demo app showcasing multilingual alignment workflows.
- Multilingual Segmentation Dataset: gold-standard segmentation data for medieval prose.
- Parallel Alignment Corpora: multilingual aligned corpora used for fine-tuning LaBSE and evaluating multilingual alignment across historical textual traditions.
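
To give a rough idea of what embedding-based alignment involves, here is a minimal sketch in Python. It is not Aquilign's actual algorithm or API: the function names `cosine` and `greedy_align`, the `threshold` parameter, and the toy 3-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from a sentence-embedding model such as LaBSE, computed over phrase-level segments.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_align(src_vecs, tgt_vecs, threshold=0.5):
    """Pair each source segment with its most similar target segment.

    Hypothetical helper: a greedy one-directional match, kept only when
    the similarity reaches `threshold`. Real aligners typically also
    handle one-to-many matches and enforce ordering constraints.
    """
    if not tgt_vecs:
        return []
    pairs = []
    for i, u in enumerate(src_vecs):
        scores = [cosine(u, v) for v in tgt_vecs]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return pairs


# Toy vectors standing in for real segment embeddings (assumption).
src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
tgt = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

alignment = greedy_align(src, tgt)
for i, j, score in alignment:
    print(f"source segment {i} <-> target segment {j} (similarity {score:.3f})")
```

Here the first source segment pairs with the second target segment and vice versa, while the third target segment stays unaligned because no source vector resembles it.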