Small Languages, Big Models: A Study of Continual Training on Languages of Norway
Abstract
A new three-stage continual training approach enhances the performance and inference efficiency of an 11.4-billion-parameter generative language model for Norwegian and Northern Sámi.
Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian and even more so for truly low-resource languages like Northern Sámi. To address this issue, we present a novel three-stage continual training approach that substantially improves both downstream performance and inference efficiency for the target languages. Based on our findings, we train, evaluate, and openly release a new generative language model for Norwegian Bokmål, Nynorsk, and Northern Sámi with 11.4 billion parameters: NorMistral-11B.
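The abstract does not spell out the three stages of the recipe, but as a rough illustration of what continual training means in practice, the sketch below resumes causal-language-model training of an existing checkpoint on target-language text with the Hugging Face transformers Trainer. The checkpoint id, corpus file, and all hyperparameters are placeholders for illustration, not the paper's actual configuration.

```python
# Minimal, generic sketch of continual pre-training: load an existing
# causal LM and keep training it on new target-language text.
# NOTE: model id, data file, and hyperparameters are placeholders; this
# does not reproduce the paper's three-stage recipe or data mixture.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "your-org/your-base-model"  # placeholder Hub id for a causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Placeholder corpus: Norwegian / Northern Sámi text, one document per line.
raw = load_dataset("text", data_files={"train": "target_language_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="continual-ckpt",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,  # emulate a large effective batch size
    learning_rate=1e-5,              # low LR to limit catastrophic forgetting
    warmup_steps=100,
    max_steps=1000,
    bf16=True,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train,
    # mlm=False configures the collator for next-token (causal) prediction.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In real continual-training setups, the learning rate is typically kept well below the original pre-training peak and the target-language data is often mixed with some of the original distribution to reduce forgetting; the specific mixture and schedule here are assumptions, not the paper's.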