Models of the Paper LogitRouter: a novel Attention variant for reducing Myopic Routing in Mixture of Experts
Felipe Rodríguez Bórquez
feliperodriguezborquez
AI & ML interests
Architectures, pre-training, post-training
Recent Activity
liked a model 5 days ago: latam-gpt/Wayra-Perplexity-Estimator-55M
liked a model 5 days ago: latam-gpt/Llama-3.1-70B-LatamGPT-SFT-1.0
published a model 6 months ago: feliperodriguezborquez/OLMoE-0924-my-V1
Organizations
None yet