Leo PRO

leideng

https://leideng.github.io/

AI & ML interests

Efficient AI, Sparse Attention

Recent Activity

liked a model about 15 hours ago

updated a collection 1 day ago

authored a paper 1 day ago

Extending Context Window of Large Language Models via Semantic Compression

Organizations

None yet

upvoted 2 articles 3 days ago

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

LinkedIn • Jan 27 • 77

Article

Nanochat-Ascend: Training Karpathy's Nanochat on Ascend NPU (Part 1)

leideng • 4 days ago • 1

upvoted a paper 3 days ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 190

upvoted an article 18 days ago

Article

TRL v1.0: Post-Training Library Built to Move with the Field

qgallouedec, stevhliu, pcuenq, sergiopaniego +2 • Mar 31 • 54

upvoted a collection 27 days ago

Open Coding Agents

13 items • Updated Mar 5 • 53

upvoted an article 2 months ago

Article

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

dvgodoy • Feb 11, 2025 • 123

upvoted a collection 2 months ago

Gemma 4

15 items • Updated 6 days ago • 937

upvoted a collection 3 months ago

Transformers.js V4 demos

A collection of demos built with Transformers.js V4 • 24 items • Updated Apr 16 • 59

upvoted 2 papers 3 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26, 2025 • 78

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 105

upvoted 2 collections 3 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.67k

Olmo Hybrid

6 items • Updated Mar 5 • 27

upvoted a collection 4 months ago

NVIDIA Nemotron v3

Open, Production-ready Enterprise Models • 23 items • Updated 2 days ago • 319

upvoted an article 4 months ago

Article

~Don't~ Repeat Yourself

patrickvonplaten • Apr 5, 2022 • 55

upvoted a collection over 1 year ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 309