A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency Paper • 2505.01658 • Published May 3, 2025 • 39 • 5
Maintain the unmaintainable 📚 • Running • 79 • Explore the complex relationships between 400+ machine learning models
The Ultra-Scale Playbook 🌌 • Running • 3.74k • The ultimate guide to training LLMs on large GPU clusters
FilledVaccum/Llama-3.1-Sherkala-8B-Chat-Quantized-8-Bits Text Generation • 8B • Updated Oct 4, 2025 • 3