KletterMix: Climbing Toward High-Quality German Pretraining Data Paper • 2606.03773 • Published 19 days ago • 21
Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 10 days ago • 166
Evaluation-Suite Collection Multilingual Evaluation Suite supporting 21 European Languages • 15 items • Updated Jan 8