Follow the White Rabbit: Using Embeddings So You Never Get Lost in Translation dacorvo • Feb 23 • 8
How to deploy and fine-tune DeepSeek models on AWS +1 pagezyhf, jeffboudier, dacorvo • Jan 30, 2025 • 55
Memory-efficient Diffusion Transformers with Quanto and Diffusers sayakpaul, dacorvo • Jul 30, 2024 • 68
Quanto: a PyTorch quantization backend for Optimum +1 dacorvo, ybelkada, marcsun13 • Mar 18, 2024 • 45
Hugging Face Text Generation Inference available for AWS Inferentia2 philschmid, dacorvo • Feb 1, 2024 • 5
Make your llama generation time fly with AWS Inferentia2 dacorvo • Nov 7, 2023 • 1