Models: 1,932

Active filters: vllm

coolroman/affine-1-5FHPjm5fA4AGPGuYdE3jVg7u2Av5KP23G8ELDaNgF8MiNB2p

24B • Updated 2 days ago • 52

coolroman/affine-M1-5DywXVktuAoubVGfgiG64fZAZ449oMqxMMWpyQFd4K86jvwr

125B • Updated 2 days ago • 42

alexleun/Mistral-Nemo-Instruct-2407-Q4_K_M-GGUF

12B • Updated 2 days ago • 23

alexleun/Mistral-Nemo-Base-2407-Q4_K_M-GGUF

12B • Updated 2 days ago • 35

Daerun/Ministral-3-3B-Reasoning-2512-Q4_K_M-GGUF

3B • Updated 2 days ago • 21

MuXodious/Mistral-Nemo-Instruct-2407-absolute-heresy

12B • Updated 1 day ago • 37

JongYeop/Llama-3.1-8B-Instruct-INT8-W8A8

8B • Updated 2 days ago • 21

JongYeop/Llama-3.1-8B-Instruct-INT8-W8A8-Dynamic-Per-Token

8B • Updated 1 day ago • 12

JongYeop/Llama-3.1-8B-Instruct-FP8-W8A8-Dynamic-Per-Token

8B • Updated 1 day ago • 13

akaashrp/Ministral-3-3B-Instruct-2512-BF16-q4f16_1-MLC

4B • Updated about 21 hours ago • 24

akaashrp/Ministral-3-3B-Base-2512-q4f16_1-MLC

4B • Updated about 17 hours ago • 25

Tomhn/Voxtral-Mini-4B-Realtime-2602

Updated about 4 hours ago