https://www.swebench.com/

Title: SWE-bench Leaderboards
Fetched At: November 17, 2025

# Leaderboards

- SWE-bench **Bash Only** uses the SWE-bench Verified dataset with the mini-SWE-agent environment for all models \[Post\].
- SWE-bench **Lite** is a subset curated for less costly evaluation \[Post\].
- SWE-bench **Verified** is a human-filtered subset \[Post\].
- SWE-bench **Multimodal** features issues with visual elements \[Post\].

Each entry reports the **% Resolved** metric: the percentage of instances solved, out of 2294 for Full, 500 for Verified and Bash Only, 300 for Lite, and 517 for Multimodal.
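As a rough illustration of the metric, the sketch below computes % Resolved from a count of solved instances. The instance totals come from the text above; the names `DATASET_SIZES` and `percent_resolved` are hypothetical and are not part of any SWE-bench tooling.

```python
# Instance counts per leaderboard, as quoted in the text above.
DATASET_SIZES = {
    "Full": 2294,
    "Verified": 500,
    "Bash Only": 500,
    "Lite": 300,
    "Multimodal": 517,
}


def percent_resolved(resolved: int, split: str) -> float:
    """Return the share of instances solved, as a percentage.

    `resolved` is the number of instances counted as solved; `split` is
    one of the keys of DATASET_SIZES. Hypothetical helper for
    illustration only.
    """
    total = DATASET_SIZES[split]
    if not 0 <= resolved <= total:
        raise ValueError(f"resolved must be between 0 and {total}")
    return 100.0 * resolved / total


if __name__ == "__main__":
    # Made-up count, purely to show the arithmetic.
    print(f"{percent_resolved(250, 'Verified'):.2f}%")
```

Because the denominators differ, the same number of solved instances yields different scores on different leaderboards, so percentages are only directly comparable within one split.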


## News

- \[07/2025\] mini-SWE-agent achieves up to 65% on SWE-bench Verified in 100 lines of Python code. \[Link\]
- \[05/2025\] Our new paper SWE-smith is out! Train your own models for software engineering agents. \[Link\]
- \[03/2025\] SWE-agent 1.0 is the open source SOTA on SWE-bench Lite! \[Link\]
- \[10/2024\] Introducing **SWE-bench Multimodal**! \[Link\]
- \[08/2024\] SWE-bench x OpenAI = **SWE-bench Verified** \[Report\]
- \[06/2024\] **Docker**-ized SWE-bench for easier evaluation \[Report\]
- \[03/2024\] Check out **SWE-agent** (12.47% on SWE-bench) \[Link\]
- \[03/2024\] Released **SWE-bench Lite** \[Report\]

## Acknowledgements

We thank the following institutions for their generous support: Open Philanthropy, AWS, Modal, Andreessen Horowitz, OpenAI, and Anthropic.

© 2025 SWE-bench Team. All rights reserved.
