AI & ML interests

Historical Media Analysis and Enrichment

Recent Activity

adrienjourne updated a model about 13 hours ago
impresso-project/ner-hipe2020-hist-base
adrienjourne published a model about 13 hours ago
impresso-project/ner-hipe2020-hist-base
adrienjourne updated a model about 15 hours ago
impresso-project/ner-hipe2020-hist-medium

Organization Card

Impresso - Media Monitoring of the Past is an interdisciplinary research project using machine learning to transform how historical media are processed, enriched, explored, and studied across modalities, languages, time periods, and national borders.

We develop the 🚀 Impresso Web App and the 🔬 Impresso Datalab, providing access to a large multilingual corpus of historical newspapers and radio broadcasts.

🤖 Models and 📚 datasets

  • 🤖 Impresso models for historical multilingual documents, including language identification, OCR quality assessment, topic inference, NER, and NEL.
  • 📚 Impresso datasets curated from digitized historical media sources for ML development and evaluation. Upcoming releases include NER and NEL benchmarks from the HIPE evaluation campaign, an image type classification dataset, and more.

🏛️ Partners and funding

Impresso gratefully acknowledges the continued support of its cultural heritage partners, as well as funding from the SNSF (Grant Nos. CRSII5_173719 and CRSII5_213585) and the FNR (Grant No. 17498891).