tomg-group-umd's Collections: Retrofitting Recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384
smcleish/Recurrent-Llama-3.2-train-recurrence-32
Text Generation
• 1B • Updated • 581
• 1
smcleish/Recurrent-Llama-3.2-train-recurrence-16
Text Generation
• 1B • Updated • 29
smcleish/Recurrent-Llama-3.2-train-recurrence-8
Text Generation
• 1B • Updated • 424
smcleish/Recurrent-Llama-3.2-train-recurrence-4
Text Generation
• 1B • Updated • 83
smcleish/Recurrent-TinyLlama-3T-train-recurrence-32
Text Generation
• 0.8B • Updated • 201
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-16
Text Generation
• 0.8B • Updated • 4
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-8
Text Generation
• 0.8B • Updated • 5
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4
Text Generation
• 0.8B • Updated • 4
smcleish/Recurrent-OLMo-2-0425-train-recurrence-32
Text Generation
• 1B • Updated • 247
• 2
smcleish/Recurrent-OLMo-2-0425-train-recurrence-16
Text Generation
• 1B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-train-recurrence-8
Text Generation
• 1B • Updated • 33
smcleish/Recurrent-OLMo-2-0425-train-recurrence-4
Text Generation
• 1B • Updated • 6
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-single-phase
Text Generation
• 0.8B • Updated • 3
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-two-phase
Text Generation
• 0.8B • Updated • 7
smcleish/Recurrent-Llama-3.2-untrained
Text Generation
• 1B • Updated • 84
smcleish/Recurrent-TinyLlama-3T-untrained
Text Generation
• 0.8B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-untrained
Text Generation
• 1B • Updated • 4
smcleish/Recurrent-Llama-3.2-2-4-2-untrained
Text Generation
• 1B • Updated • 9
• 1
smcleish/retrofitting-llama-fineweb-edu-tokenized
Viewer
• Updated • 332M • 382