Masoud Hashemi

masoudhashemi

AI & ML interests

None yet

Recent Activity

liked a dataset 4 days ago

inclusionAI/AReaL-tau2-data

liked a Space 27 days ago

aminediroHF/trainer-generator-bf16-mismatch

liked a Space about 1 month ago

AdithyaSK/rl-environments-guide

upvoted an article about 2 months ago

Article

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 160

upvoted 2 papers 2 months ago

Terminal Agents Suffice for Enterprise Automation

Paper • 2604.00073 • Published Mar 31 • 96

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published Mar 25 • 98

upvoted an article 2 months ago

Article

A New Framework for Evaluating Voice Agents (EVA)

ServiceNow-AI • Mar 24 • 95

upvoted a paper 3 months ago

EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings

Paper • 2603.13594 • Published Mar 13 • 149

upvoted an article 6 months ago

Article

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

ServiceNow-AI • Dec 9, 2025 • 84

upvoted a paper 8 months ago

Apriel-Nemotron-15B-Thinker

Paper • 2508.10948 • Published Aug 13, 2025 • 6

upvoted an article 8 months ago

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

A-Mahla, merve, sergiopaniego, reach-vb, lewtun • Sep 23, 2025 • 138

upvoted a collection 8 months ago

Apriel-1.5-15B-Thinker

Collection

3 items • Updated Oct 2, 2025 • 76

upvoted an article 9 months ago

Article

Gaia2 and ARE: Empowering the community to study agents

clefourrier, gregmialz, mlcu, mortimerp9, XciD, tfrere, evijit, RomainFroger, dheeraj7596, CarolinePascal, upiter • Sep 22, 2025 • 135

upvoted a paper 9 months ago

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs

Paper • 2509.08031 • Published Sep 9, 2025 • 21

upvoted a paper 10 months ago

Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21, 2025 • 68

upvoted an article 11 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 778

upvoted a collection 12 months ago

MiniMax-M1

Collection

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated Apr 15 • 119

upvoted a collection about 1 year ago

General-Reasoner

Collection

Advancing LLMs' general reasoning capabilities • 9 items • Updated Oct 12, 2025 • 6

upvoted 2 articles about 1 year ago

Article

Selective fine-tuning of Language Models with Spectrum

anakin87 • Sep 3, 2024 • 36

Article

Open R1: Update #3

open-r1 • Mar 11, 2025 • 297

upvoted an article over 1 year ago

Article

The N Implementation Details of RLHF with PPO

vwxyzjn, tianlinliu0121, lvwerra • Oct 24, 2023 • 72

upvoted an article almost 2 years ago

Article

BigCodeBench: The Next Generation of HumanEval

terryyz, ganler, SivilTaram, huybery, Muennighoff, dpfried, harmdevries, lvwerra, clefourrier • Jun 18, 2024 • 54

upvoted a collection about 2 years ago

[lecture artifacts] aligning open language models

Collection

artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17, 2024 • 58
