AURA: Always-On Understanding and Real-Time Assistance via Video Streams • Paper • 2604.04184 • Published 6 days ago • 43
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices • Paper • 2411.10640 • Published Nov 16, 2024 • 46
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models • Paper • 2402.14800 • Published Feb 22, 2024 • 3
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models • Paper • 2405.16057 • Published May 25, 2024