SWE-bench is a standardized benchmark, with accompanying leaderboards, for evaluating AI agents on real-world software engineering tasks.