Ashish Talreja
talrejaa8
AI & ML interests
None yet
Organizations
None yet
Rl/GRPO
- AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs
  Paper • 2507.05687 • Published • 30
- Perception-Aware Policy Optimization for Multimodal Reasoning
  Paper • 2507.06448 • Published • 48
- Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny
  Paper • 2507.16331 • Published • 22
- Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
  Paper • 2511.04570 • Published • 216
LoRA
models
0
None public yet
datasets
0
None public yet