蔡正舟

conctsai

AI & ML interests

None yet

Recent Activity

authored a paper 23 days ago

Learning to Self-Verify Makes Language Models Better Reasoners

authored a paper 23 days ago

Look Before You Leap: Autonomous Exploration for LLM Agents

authored a paper 23 days ago

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Organizations

None yet

upvoted a paper 24 days ago

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Paper • 2605.27141 • Published 26 days ago • 19

upvoted 6 papers about 1 month ago

HodgeCover: Higher-Order Topological Coverage Drives Compression of Sparse Mixture-of-Experts

Paper • 2605.13997 • Published May 13 • 5

Look Before You Leap: Autonomous Exploration for LLM Agents

Paper • 2605.16143 • Published May 15 • 9

Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards

Paper • 2605.14539 • Published May 14 • 7

Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation

Paper • 2605.11739 • Published May 13 • 59

Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

Paper • 2605.02290 • Published May 4 • 42

Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR

Paper • 2605.15726 • Published May 15 • 34

upvoted a paper about 2 months ago

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 101