Ryan Marten

ryanmarten

https://ryanmarten.com

AI & ML interests

None yet

Recent Activity

New activity in harborframework/parity-experiments 7 days ago

Add parity experiments for harveyai/lab

#250 opened 7 days ago by ryanmarten

liked a dataset 16 days ago

open-thoughts/AgentTrove

Viewer • Updated 8 days ago • 1.7M • 9.56k • 138

updated a dataset 21 days ago

harborframework/terminal-bench-2.0

Benchmark • Updated 21 days ago • 7.3k • 31

published a dataset 25 days ago

harborframework/terminal-bench-3.0-lfs

Updated 25 days ago • 46

New activity in harborframework/parity-experiments 3 months ago

SpreadsheetBench adapter parity (claude-code + Haiku 4.5, 400 tasks × 3 trials)

#106 opened 3 months ago by ryanmarten

New activity in harborframework/terminal-bench-2.0 3 months ago

Define 'harbor' as eval framework 🎉

#3 opened 3 months ago by burtenshaw

Add an eval yaml to integrate this benchmark into Community Evals.

#1 opened 3 months ago by burtenshaw

published a dataset 3 months ago

harborframework/terminal-bench-2.0

Benchmark • Updated 21 days ago • 7.3k • 31

liked a dataset 3 months ago

zai-org/terminal-bench-2-verified

Updated 7 days ago • 2.44k • 73

liked a dataset 5 months ago

open-thoughts/OpenThoughts-Agent-v1-SFT

Viewer • Updated Jan 27 • 15.2k • 2.58k • 91

updated a Space 5 months ago

README

🦀

liked a dataset 6 months ago

jupyter-agent/jupyter-agent-dataset

Viewer • Updated Sep 10, 2025 • 95.8k • 2.03k • 166

updated 2 datasets 9 months ago

ryanmarten/OpenThoughts-1k-sample

Viewer • Updated Aug 31, 2025 • 2k • 622k • 13

open-thoughts/OpenThoughts-114k

Viewer • Updated Aug 31, 2025 • 228k • 123k • 839

published a dataset 9 months ago

ryanmarten/OpenThoughts-1k-sample

Viewer • Updated Aug 31, 2025 • 2k • 622k • 13

liked a dataset 9 months ago

SWE-bench/SWE-smith-trajectories

Viewer • Updated Jul 19, 2025 • 76k • 3.5k • 60

liked a Space 11 months ago

OpenThoughts Benchmark Explorer

📊

Explore benchmark correlations and model performance

liked a model 11 months ago

open-thoughts/OpenThinker3-7B

Text Generation • 8B • Updated Jun 9, 2025 • 6.7k • 135

updated 2 collections 11 months ago

Reasoning Models

Collection

53 items • Updated Jun 8, 2025 • 1

Reasoning Datasets

Collection

50 items • Updated Jun 8, 2025 • 11
