Inference Optimization

community

Recent Activity

nm-research  updated a model about 5 hours ago
inference-optimization/gpt-oss-120b-from-qwen235b-then-self-ckpt3-speculator.eagle3
nm-research  published a model about 5 hours ago
inference-optimization/gpt-oss-120b-from-qwen235b-then-self-ckpt3-speculator.eagle3
krishnateja95  updated a model 3 days ago
inference-optimization/Qwen3-30B-A3B-Instruct-2507_7.0_bits_mode_heuristic

Team members: Alexandre Marques, Megan Flynn, Dipika, Krishna Teja Chitty-Venkata, Helen Zhao, Fynn Schmitt-Ulms, Neural Magic Research, Chibueze Ukachi, Eldar Kurtić, Rahul Tuli, Kyle Sayers, Brian Dellabetta, Linghao Kong, Michael Goin, Reed Meyerson

inference-optimization's models (216)

inference-optimization/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor

71B • Updated Dec 4, 2025

inference-optimization/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head

71B • Updated Dec 4, 2025

inference-optimization/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor

71B • Updated Dec 4, 2025

inference-optimization/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head

71B • Updated Dec 4, 2025

inference-optimization/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor

8B • Updated Dec 4, 2025 • 35

inference-optimization/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor

8B • Updated Dec 4, 2025