Pre-Generation Hallucination Detection in Large Language Models via Soft-Target Attention Probing • Paper • 2606.21917 • Published 15 days ago
Multimodal Evaluation of Russian-language Architectures • Paper • 2511.15552 • Published Nov 19, 2025 • 79
3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark • Paper • 2504.13861 • Published Mar 26, 2025 • 3