Catherine Arnett

catherinearnett

AI & ML interests

multilingual NLP, tokenization

Recent Activity

published a dataset 26 days ago
catherinearnett/bilingual-tokenizer-training-data
liked a dataset about 1 month ago
commoncrawl/CommonLID

Organizations

Blog-explorers
Language and Cognition Lab (UCSD)
Common Crawl Foundation
Beetles