rubricreward/mR3-Qwen3-14B-tgt-prompt-tgt-thinking-translated • Text Generation • 15B • 3
rubricreward/mR3-Qwen3-8B-tgt-prompt-tgt-thinking-translated • Text Generation • 8B • 3
rubricreward/mR3-Qwen3-4B-tgt-prompt-tgt-thinking-translated • Text Generation • 4B • 6
rubricreward/mR3-Qwen3-14B-tgt-prompt-tgt-thinking • Text Generation • 15B • 1
rubricreward/mR3-Qwen3-8B-tgt-prompt-tgt-thinking • Text Generation • 8B • 2
rubricreward/mR3-Qwen3-4B-tgt-prompt-tgt-thinking • Text Generation • 4B • 7
rubricreward/mR3-Qwen3-4B-tgt-prompt-en-thinking • Text Generation • 4B • 5
rubricreward/mR3-Qwen3-8B-tgt-prompt-en-thinking • Text Generation • 8B • 5
rubricreward/mR3-Qwen3-14B-tgt-prompt-en-thinking • Text Generation • 15B • 3
rubricreward/mR3-Qwen3-14B-en-prompt-en-thinking • Text Generation • 15B • 29 • 1
rubricreward/mR3-Qwen3-4B-en-prompt-en-thinking • Text Generation • 4B • 72 • 2
rubricreward/mR3-Qwen3-8B-en-prompt-en-thinking • Text Generation • 8B • 25
rubricreward/mR3-gpt-oss-20b-en-prompt-en-thinking • 335k • 1
rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-14B-LoRA-4k • Text Generation
rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-8B-14k • Text Generation • 4
rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-4B-14k • Text Generation
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-LoRA-4k • 15B
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-LoRA-14k • 15B
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-14k • Text Generation • 15B • 5
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-4k • Text Generation • 15B • 2
rubricreward/R3-Phi-4-reasoning-plus-LoRA-14k • 15B • 2
rubricreward/R3-Qwen3-14B-LoRA-14k • 15B
rubricreward/R3-Qwen3-8B-LoRA-14k • Text Generation • 8B • 2 • 2
rubricreward/R3-Qwen3-4B-LoRA-14k • 4B • 1
rubricreward/R3-Qwen2.5-7B-LoRA-4k • 8B • 1
rubricreward/R3-Qwen2.5-7B-LoRA-14k • 8B • 1
rubricreward/R3-Qwen2.5-7B-14k • 8B • 1
rubricreward/R3-Qwen2.5-7B-4k • 8B • 1
rubricreward/R3-Qwen3-14B-LoRA-Random-Filter1
rubricreward/R3-Qwen3-14B-LoRA-Preference-Only-v1.1 • 15B • 1