AI & ML interests
LLM agents, evaluation & reasoning
Article: Launching Agent Leaderboard v2: The Enterprise-Grade Benchmark for AI Agents
Article (published about 1 year ago): Agent Leaderboard: Evaluating AI Agents in Multi-Domain Scenarios