Active filters: rl
Text Generation • 4B • Updated • 107 • 2
mradermacher/X-Coder-RL-Qwen3-8B-GGUF • 8B • Updated • 368 • 1
mradermacher/X-Coder-RL-Qwen3-8B-i1-GGUF • 8B • Updated • 1.41k • 2
mradermacher/Clado-BrowserOS-Action-GGUF • Reinforcement Learning • 4B • Updated • 725 • 2
mradermacher/Clado-BrowserOS-Action-i1-GGUF • Reinforcement Learning • 4B • Updated • 4.37k • 2
mradermacher/StarPO-4B-GGUF • Reinforcement Learning • 4B • Updated • 894 • 1
mradermacher/StarPO-4B-i1-GGUF • Reinforcement Learning • 4B • Updated • 3.98k • 1
Reinforcement Learning • Updated
d-byrne/snake-v1_training_state • Updated
InstaDeepAI/jumanji-benchmark-a2c-BinPack-v2 • Updated
InstaDeepAI/jumanji-benchmark-a2c-CVRP-v1
ContextualAI/archangel_sft_pythia1-4b • Text Generation • 1B • Updated • 5
ContextualAI/archangel_sft_pythia2-8b • Text Generation • 3B • Updated • 4 • 1
ContextualAI/archangel_sft_pythia6-9b • Text Generation • 7B • Updated
ContextualAI/archangel_sft_pythia12-0b • Text Generation • 12B • Updated • 7
ContextualAI/archangel_sft_llama7b • Text Generation • 7B • Updated • 6 • 1
ContextualAI/archangel_sft_llama13b • Text Generation • 13B • Updated • 3
ContextualAI/archangel_sft_llama30b • Text Generation • 33B • Updated • 3
ContextualAI/archangel_slic_llama30b • Text Generation • 33B • Updated • 5
ContextualAI/archangel_slic_pythia1-4b • Text Generation • 1B • Updated • 1
ContextualAI/archangel_slic_pythia2-8b • Text Generation • 3B • Updated • 5
ContextualAI/archangel_slic_pythia6-9b • Text Generation • 7B • Updated • 4
ContextualAI/archangel_slic_pythia12-0b • Text Generation • 12B • Updated • 1
ContextualAI/archangel_slic_llama7b • Text Generation • 7B • Updated • 5 • 1
ContextualAI/archangel_slic_llama13b • Text Generation • 13B • Updated • 4
ContextualAI/archangel_dpo_pythia1-4b • Text Generation • 1B • Updated • 2
ContextualAI/archangel_dpo_pythia2-8b • Text Generation • 3B • Updated • 1
ContextualAI/archangel_dpo_pythia6-9b • Text Generation • 7B • Updated • 1
ContextualAI/archangel_dpo_pythia12-0b • Text Generation • 12B • Updated • 2
ContextualAI/archangel_dpo_llama7b • Text Generation • 7B • Updated • 2