TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Publication
Under Review at NeurIPS 2025