BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs Paper • 2604.02045 • Published Apr 2 • 38 • 5
A Causal Language Modeling Detour Improves Encoder Continued Pretraining Paper • 2605.12438 • Published 13 days ago • 7 • 3
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models Paper • 1511.09249 • Published Nov 30, 2015 • 1 • 1
Decoding Text Spans for Efficient and Accurate Named-Entity Recognition Paper • 2604.20447 • Published Apr 22 • 2 • 2
TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation Paper • 2603.08182 • Published Mar 9 • 1 • 2
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning Paper • 2602.11149 • Published Feb 11 • 18 • 5
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published Jan 29 • 12 • 5
Bolmo: Byteifying the Next Generation of Language Models Paper • 2512.15586 • Published Dec 17, 2025 • 18 • 3
Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining Paper • 2511.21613 • Published Nov 26, 2025 • 2 • 1
Gaperon: A Peppered English-French Generative Language Model Suite Paper • 2510.25771 • Published Oct 29, 2025 • 17 • 2
Mask and You Shall Receive: Optimizing Masked Language Modeling For Pretraining BabyLMs Paper • 2510.20475 • Published Oct 23, 2025 • 1 • 2