hangyu guo

Rosiness

AI & ML interests

Natural Language Processing

Recent Activity

upvoted a paper 3 days ago

DeepSearch-World: Self-Distillation for Deep Search Agents in a Verifiable Environment

Paper • 2607.07820 • Published 17 days ago • 88

upvoted 2 papers 11 days ago

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published Jun 23 • 153

The Mirage of Optimizing Training Policies: Monotonic Inference Policies as the Real Objective for LLM Reinforcement Learning

Paper • 2606.29526 • Published 27 days ago • 171

upvoted a paper about 2 months ago

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published Jun 5 • 75

upvoted 3 papers 2 months ago

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Paper • 2605.20025 • Published May 19 • 191

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published May 13 • 166

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

Paper • 2605.10832 • Published May 11 • 23

upvoted a paper 3 months ago

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Paper • 2604.18292 • Published Apr 20 • 89

updated a dataset 3 months ago

MM-R1-HH/envs_supply

Preview • Updated Apr 20 • 25

published a dataset 3 months ago

MM-R1-HH/envs_supply

Preview • Updated Apr 20 • 25

upvoted 4 papers 3 months ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 168

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published Apr 13 • 69

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Paper • 2604.07429 • Published Apr 8 • 123

Towards Long-horizon Agentic Multimodal Search

Paper • 2604.12890 • Published Apr 14 • 20

upvoted 6 papers 4 months ago

Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

Paper • 2603.25158 • Published Mar 26 • 56

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Paper • 2603.22117 • Published Mar 23 • 29

WorldCache: Content-Aware Caching for Accelerated Video World Models

Paper • 2603.22286 • Published Mar 23 • 5

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 110

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 141

XSkill: Continual Learning from Experience and Skills in Multimodal Agents

Paper • 2603.12056 • Published 24 days ago • 34