Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
published a dataset about 23 hours ago: mehuldamani/multi-answer-sft-target-dataset
published a model 2 days ago: mehuldamani/sfted_rlvr_multi__veryHardDataset_moreThinking
updated a dataset 2 days ago: mehuldamani/multi-answer-sft-target-dataset
Organizations
None yet