Stoney Kang

sikang99

AI & ML interests

Remote Control based on Vision

Recent Activity

upvoted a paper about 2 hours ago

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

upvoted a paper about 2 hours ago

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts

upvoted a paper about 2 hours ago

NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation

upvoted 4 papers about 2 hours ago

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

Paper • 2606.02031 • Published 3 days ago • 12

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts

Paper • 2606.02404 • Published 3 days ago • 50

NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation

Paper • 2606.03159 • Published 1 day ago • 7

PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps

Paper • 2606.01788 • Published 3 days ago • 7

upvoted 4 papers 1 day ago

ESPO: Early-Stopping Proximal Policy Optimization

Paper • 2605.29860 • Published 7 days ago • 15

SurGe: Improved Surface Geometry in Point Maps

Paper • 2605.31577 • Published 6 days ago • 4

VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies

Paper • 2605.30011 • Published 7 days ago • 8

Count Anything

Paper • 2605.30846 • Published 6 days ago • 8

upvoted 4 papers 2 days ago

Task-Focused Memorization for Multimodal Agents

Paper • 2605.31075 • Published 6 days ago • 29

WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction

Paper • 2605.29341 • Published 7 days ago • 14

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Paper • 2605.30350 • Published 7 days ago • 10

VLM3: Vision Language Models Are Native 3D Learners

Paper • 2605.30561 • Published 7 days ago • 20

upvoted 3 papers 4 days ago

PANDO: Efficient Multimodal AI Agents via Online Skill Distillation

Paper • 2605.24785 • Published 9 days ago • 9

When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems

Paper • 2605.30102 • Published 7 days ago • 13

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Paper • 2605.29801 • Published 7 days ago • 139

upvoted 3 papers 5 days ago

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Paper • 2605.30280 • Published 7 days ago • 132

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

Paper • 2605.30263 • Published 7 days ago • 54

AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation

Paper • 2605.28655 • Published 8 days ago • 11

upvoted 2 papers 6 days ago

Your Agents Are Aging Too: Agent Lifespan Engineering for Deployed Systems

Paper • 2605.26302 • Published 10 days ago • 31

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 8 days ago • 70
