sumail

sumailmao · chongqichuizi875

AI & ML interests

None yet

Recent Activity

authored a paper about 16 hours ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 3 days ago • 41

commented on a paper 1 day ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 3 days ago • 41

updated a collection 1 day ago

Flow-DPPO: GenEval2

Flow-DPPO-trained LoRA adapters (single- and multi-reward) for SD3.5 and FLUX.2-klein-9B optimized on GenEval2. • 5 items • Updated 1 day ago

upvoted a paper 2 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 3 days ago • 41

upvoted a paper 10 months ago

Understanding Tool-Integrated Reasoning

Paper • 2508.19201 • Published Aug 26, 2025 • 32