airlsyn

11 75 207

AI & ML interests

AI & RL

Recent Activity

upvoted a paper 3 days ago

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published May 31 • 48

liked 2 models 4 days ago

openbmb/MiniCPM-RobotManip

Robotics • 2B • Updated 1 day ago • 408 • 160

openbmb/MiniCPM-RobotTrack

Robotics • 0.4B • Updated 4 days ago • 306 • 115

liked a model 8 days ago

thinkingmachines/Inkling

Image-Text-to-Text • 952B • Updated 3 days ago • 24.7k • 1.49k

liked a model 19 days ago

RedHatAI/GLM-5.2-speculator.dspark-preview

Text Generation • 4B • Updated 16 days ago • 1.41k • 59

liked a model 20 days ago

AliesTaha/fable-traces

Text Generation • 4B • Updated 19 days ago • 5.48k • 209

upvoted a paper 20 days ago

CausalMix: Data Mixture as Causal Inference for Language Model Training

Paper • 2607.01104 • Published 23 days ago • 21

liked a dataset 22 days ago

a-m-team/AM-Thinking-v1-RL-Dataset

Viewer • Updated May 21, 2025 • 54.8k • 166 • 19

upvoted an article 23 days ago

GLM-5.2: Built for Long-Horizon Tasks

Article • zai-org • Jun 17 • 135

liked 2 datasets 24 days ago

Team-ACE/ToolACE

Viewer • Updated Sep 4, 2024 • 11.3k • 9.42k • 191

OpenCoder-LLM/opc-sft-stage1

Viewer • Updated Nov 24, 2024 • 4.22M • 1.59k • 75

liked a dataset 25 days ago

ByteDance-Seed/Code-Contests-Plus

Viewer • Updated Nov 6, 2025 • 49.2k • 6.94k • 68

liked a dataset 27 days ago

amd/ReasonLite-Dataset

Viewer • Updated Jan 22 • 6.16M • 687 • 16

upvoted a collection 29 days ago

Tmax

Collection

Data and models associated with "Tmax: A simple recipe for terminal agents". paper: https://arxiv.org/abs/2606.23321 • 23 items • Updated about 1 month ago • 18

upvoted an article about 1 month ago

Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand

Article • qgallouedec • Dec 4, 2025 • 73

liked a dataset about 1 month ago

nvidia/Nemotron-Pretraining-Code-v3

Viewer • Updated Jun 4 • 146M • 2.83k • 59

liked a model about 1 month ago

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4

Text Generation • 335B • Updated 29 days ago • 297k • 256

liked a dataset about 1 month ago

nvidia/Nemotron-SFT-OpenCode-v1

Preview • Updated Mar 23 • 2.76k • 57

upvoted a collection about 1 month ago

Nemotron-Post-Training-v3

Collection

Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 7 days ago • 181

liked a Space about 2 months ago

The ultimate guide to RL environments: building and scaling them in the LLM era

📝

201

Building and scaling RL environments for LLM training