Ma Qianyi

alexandergi

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago

Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning

liked a dataset 16 days ago

princeton-nlp/SWE-bench_Verified

liked a model 18 days ago

chilkersion/Z1os

Organizations

None yet

upvoted a paper 16 days ago

Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning

Paper • 2606.01682 • Published 19 days ago • 7

upvoted a paper 23 days ago

Seeing the Needle in the Haystack: Towards Weakly-Supervised Log Instance Anomaly Localization via Counterfactual Perturbation

Paper • 2605.10988 • Published May 9 • 3

upvoted a paper 27 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published May 20 • 206

upvoted a paper about 1 month ago

Recovering Hidden Reward in Diffusion-Based Policies

Paper • 2605.00623 • Published May 1 • 4

upvoted 5 papers 2 months ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 247

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 264

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 327

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 507

ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation

Paper • 2604.03922 • Published Apr 5 • 53

upvoted 5 papers 3 months ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 352

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published Mar 30 • 343

Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments

Paper • 2603.23638 • Published Mar 24 • 11

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published Mar 17 • 312

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 211

upvoted 3 papers 4 months ago

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 198

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published Mar 3 • 87

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 525