Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 9 days ago • 169
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 16 days ago • 145
Monitoring the Internal Monologue: Probe Trajectories Reveal Reasoning Dynamics Paper • 2605.18549 • Published 12 days ago • 2
DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models Paper • 2605.15055 • Published 16 days ago • 19
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 23 days ago • 231
ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue Paper • 2605.01371 • Published 28 days ago • 6
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242
CLEAR: Unlocking Generative Potential for Degraded Image Understanding in Unified Multimodal Models Paper • 2604.04780 • Published Apr 6 • 10
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published Apr 9 • 101
SciLT: Long-Tailed Classification in Scientific Image Domains Paper • 2604.03687 • Published Apr 4 • 8