AI & ML interests
Machine learning, RLHF
Organizations
weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition
• 5k • 8
weqweasdas/ultrafeedback_binarized_processed
• 61.1k • 2
weqweasdas/qwen7b_prompt_difficult
• 15.7k • 7
weqweasdas/qwen7b_openr1_with_scores_sub
• 57.7k • 4
weqweasdas/qwen7b_openr1_with_scores_filtered_0375
• 24.3k • 6
weqweasdas/qwen7b_openr1_with_scores
• 75k • 4
weqweasdas/from_default_filtered_openr1_with_scores_filtered_05_and_filtered_allwrong
• 25k • 8
• 1.68k • 43
weqweasdas/dapo_with_scores
• 13k • 5
weqweasdas/dapo_and_openr1_can_be_evaluated_by_daporm_deduplicate_with_scores
• 34.1k • 2
weqweasdas/dapo_and_openr1_can_be_evaluated_by_daporm_deduplicate
• 34.1k • 5
weqweasdas/test_rm_from_default_filtered_openr_math_verify_scores_and_dapo_scores
• 93.7k • 8
weqweasdas/test_rm_from_default_filtered_openr_math_verify_scores
• 93.7k • 7
weqweasdas/from_default_filtered_openr1_with_scores_filtered_0125_but_not_all_wrong
• 13.3k • 38
weqweasdas/from_default_filtered_openr1_with_scores
• 75k • 6
weqweasdas/from_default_filtered_openr1_with_scores_filtered_025
• 45.5k • 2
weqweasdas/from_default_filtered_openr1_with_scores_filtered_0125
• 37.8k • 4
weqweasdas/from_default_filtered_openr1_with_scores_filtered_05
• 56.2k • 4
weqweasdas/from_default_filtered_openr1
• 75k • 79
weqweasdas/aime_hmmt_brumo_cmimc_amc23
• 230 • 57
weqweasdas/aime_hmmt_brumo_cmimc
• 190 • 2
weqweasdas/filtered_openr1
• 145k • 3
weqweasdas/numina_prompt_non_dedu
• 312k • 3
• 66 • 2
• 66 • 5
weqweasdas/qwen7b_self_rewarding_sft_with_score_passn
• 500 • 2
weqweasdas/qwen7b_base_with_score_passn
• 500 • 1
weqweasdas/qwen7b_grpo_ver2_step80_with_score_passn_second_64
• 1k • 4
weqweasdas/qwen7b_grpo_ver2_step300_with_score_passn
• 1k • 3
weqweasdas/qwen7b_grpo_ver2_step200_with_score_passn
• 1k • 2