Sebastian Dziadzio

sbdzdz

https://sebastiandziadzio.com/

AI & ML interests

Computer vision, continual learning, compositionality.

Recent Activity

upvoted a paper 3 days ago

QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents

Paper • 2606.32034 • Published 5 days ago • 10

updated a dataset 7 months ago

sbdzdz/pbe-world

Viewer • Updated Nov 26, 2025 • 100 • 10

published a dataset 7 months ago

sbdzdz/pbe-world

Viewer • Updated Nov 26, 2025 • 100 • 10

upvoted a paper over 1 year ago

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published Feb 6, 2025 • 32

authored a paper over 1 year ago

ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities

Paper • 2412.06745 • Published Dec 9, 2024 • 6

New activity in open-llm-leaderboard/open_llm_leaderboard about 2 years ago

Detailed results are inconsistent

#734 opened about 2 years ago by
