Salman Rahman PRO

salmannyu

https://salmanrahman.net/

AI & ML interests

Natural Language Processing, Deep Learning, Scalable Oversight, and Language Model Evaluation

Recent Activity

authored a paper 7 days ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 28 days ago • 25

upvoted a collection 20 days ago

rlvr-weak-supervision

Collection

Models from "When Can LLMs Learn to Reason with Weak Supervision?" — Llama-3.2-3B with continual pre-training and Thinking SFT. • 3 items • Updated 27 days ago • 2

upvoted a paper 27 days ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 28 days ago • 25

submitted a paper to Daily Papers 27 days ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 28 days ago • 25

updated a collection 27 days ago

rlvr-weak-supervision

Collection

Models from "When Can LLMs Learn to Reason with Weak Supervision?" — Llama-3.2-3B with continual pre-training and Thinking SFT. • 3 items • Updated 27 days ago • 2

updated a model 27 days ago

pavelslab-nyu/Llama-3.2-3B-ThinkSFT

3B • Updated 27 days ago • 27

published a model 27 days ago

pavelslab-nyu/Llama-3.2-3B-ThinkSFT

3B • Updated 27 days ago • 27

updated a model 27 days ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math-ThinkSFT

3B • Updated 27 days ago • 26

published a model 27 days ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math-ThinkSFT

3B • Updated 27 days ago • 26

updated a model 27 days ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math

3B • Updated 27 days ago • 19

published a model 27 days ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math

3B • Updated 27 days ago • 19

upvoted a paper about 1 month ago

CoDaS: AI Co-Data-Scientist for Biomarker Discovery via Wearable Sensors

Paper • 2604.14615 • Published Apr 16 • 8

updated a model about 1 month ago

salmannyu/llama_base_thinking_sft_noisy_reward_0_9

Updated Apr 15

published a model about 1 month ago

salmannyu/llama_base_thinking_sft_noisy_reward_0_9

Updated Apr 15

updated a model about 1 month ago

salmannyu/llama_base_thinking_sft_majority_vote_math_1024_sample_8k

Updated Apr 12

published a model about 1 month ago

salmannyu/llama_base_thinking_sft_majority_vote_math_1024_sample_8k

Updated Apr 12

updated a model about 2 months ago

salmannyu/mid_train_llama_52b_thinking_data_effect_math_8_sample

Updated Mar 30

published a model about 2 months ago

salmannyu/mid_train_llama_52b_thinking_data_effect_math_8_sample

Updated Mar 30
