chenzehao

chhao

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 2 days ago

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Paper • 2602.23008 • Published 4 days ago • 33

upvoted 2 papers 5 days ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published 7 days ago • 500

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published 19 days ago • 215

upvoted a paper 7 days ago

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published 21 days ago • 252

upvoted 2 papers 15 days ago

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Paper • 2602.07274 • Published 24 days ago • 205

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published 20 days ago • 236

upvoted 3 papers 20 days ago

Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization

Paper • 2506.17252 • Published Jun 8, 2025 • 2

Real-Time Aligned Reward Model beyond Semantics

Paper • 2601.22664 • Published Jan 30 • 13

Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published 22 days ago • 272