RULER Datasets
Nathan Habib
AI & ML interests
Evals
Recent Activity
New activity, 2 days ago: jdopensource/JoyAI-LLM-Flash: Add evaluation results for GPQA-Diamond, MMLU-Pro
Liked a model, 2 days ago: jdopensource/JoyAI-LLM-Flash
New activity, 2 days ago: Qwen/Qwen3.5-397B-A17B: Add evaluation results for HLE, MMLU-Pro