Domain-Specific Data Synthesis for LLMs via Minimal Sufficient Representation Learning Paper • 2605.30039 • Published 14 days ago • 18
ML-Embed: Inclusive and Efficient Embeddings for a Multilingual World Paper • 2605.15081 • Published 29 days ago • 11
Beyond Retrieval: A Multitask Benchmark and Model for Code Search Paper • 2605.04615 • Published May 6 • 23
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark Paper • 2603.26017 • Published Mar 27 • 31
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Paper • 2603.19223 • Published Mar 19 • 34
C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling Paper • 2512.21332 • Published Dec 24, 2025 • 17
CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects Paper • 2509.14856 • Published Sep 18, 2025 • 2