Command line basics

This tutorial walks you through running SWE-agent from the command line.

  • Please read our hello world tutorial before proceeding.
  • This tutorial focuses on using SWE-agent as a tool to solve individual issues. Benchmarking SWE-agent is covered separately. Finally, we have a different tutorial for using SWE-agent for coding challenges.

A few examples

Before we start with a more structured explanation of the command line options, here are a few examples that you might find immediately useful:

Fix a GitHub issue
python run.py \
  --agent.model.name=gpt4 \
  --agent.model.per_instance_cost_limit=2.00 \  # (1)!
  --env.repo.github_url=https://github.com/SWE-agent/test-repo \
  --problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1
Work on a GitHub repo with a custom problem statement
python run.py \
  ...
  --env.repo.github_url=https://github.com/SWE-agent/test-repo \
  --problem_statement.text="Hey, can you fix all the bugs?"
Fix a bug in a local repository using a custom docker image
git clone https://github.com/SWE-agent/test-repo.git
python run.py \
  --agent.model.name=claude-3.5 \  # (1)!
  --env.repo.path=test-repo \
  --problem_statement.path=test-repo/problem_statements/1.md \
  --env.deployment.image=python:3.12  # (2)!
  1. Make sure to add anthropic keys to the environment for this one!
  2. This points to the dockerhub image of the same name

For the next example, we will use a cloud-based execution environment instead of using local docker containers. For this, you first need to set up a modal account, then run:

Deployment on modal (cloud-based execution)
python run.py \
  ...
  --env.deployment.type=modal \
  --env.deployment.image=python:3.12

All options

Run python run.py --help to see all available options for run.py. This tutorial will only cover a subset of options.

Configuration files

All configuration options can be specified either in one or more .yaml files, or as command line arguments. For example, our first command can be written as

python run.py --config my_run.yaml
my_run.yaml
agent:
  model:
    name: gpt4
    per_instance_cost_limit: 2.00
env:
  repo:
    github_url: https://github.com/SWE-agent/test-repo
problem_statement:
  github_url: https://github.com/SWE-agent/test-repo/issues/1
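
The correspondence between dotted command line options and the nested YAML keys can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not SWE-agent's actual parser; the helper name `dotted_to_nested` is hypothetical.

```python
from typing import Any


def dotted_to_nested(options: dict[str, Any]) -> dict[str, Any]:
    """Expand keys like 'agent.model.name' into nested dictionaries."""
    nested: dict[str, Any] = {}
    for dotted_key, value in options.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# The options of the first example, written as dotted keys.
cli_options = {
    "agent.model.name": "gpt4",
    "agent.model.per_instance_cost_limit": 2.00,
    "env.repo.github_url": "https://github.com/SWE-agent/test-repo",
    "problem_statement.github_url": "https://github.com/SWE-agent/test-repo/issues/1",
}

config = dotted_to_nested(cli_options)
```

The resulting `config` dictionary has the same shape as the contents of my_run.yaml above.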

But we can also split it up into multiple files and additional command line options:

python run.py --config agent.yaml --config env.yaml \
    --problem_statement.text="Hey, can you fix all the bugs?"
agent.yaml
agent:
  model:
    name: gpt4
    per_instance_cost_limit: 2.00
env.yaml
env:
  repo:
    github_url: https://github.com/SWE-agent/test-repo
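
Conceptually, the config files and the command line options end up in one merged configuration dictionary. The following Python sketch shows one way such a merge could work. It is not SWE-agent's implementation: the helper `deep_merge` is hypothetical, and the rule that later sources override earlier ones is an assumption (in this example the sources do not overlap, so the order does not matter).

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which `override` is merged recursively into `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: the later source wins on conflicts.
            merged[key] = value
    return merged


# Contents of agent.yaml and env.yaml, plus the extra command line option.
agent_yaml = {"agent": {"model": {"name": "gpt4", "per_instance_cost_limit": 2.00}}}
env_yaml = {"env": {"repo": {"github_url": "https://github.com/SWE-agent/test-repo"}}}
cli = {"problem_statement": {"text": "Hey, can you fix all the bugs?"}}

merged: dict[str, Any] = {}
for source in (agent_yaml, env_yaml, cli):
    merged = deep_merge(merged, source)
```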

The default config file is config/default.yaml. Let's take a look at it:

Example: default config default.yaml
agent:
  templates:
    system_template: |-
      SETTING: You are an autonomous programmer, and you're working directly in the command line with a special interface.

      The special interface consists of a file editor that shows you {{WINDOW}} lines of a file at a time.
      In addition to typical bash commands, you can also use specific commands to help you navigate and edit files.
      To call a command, you need to invoke it with a function call/tool call.

      Please note that THE EDIT COMMAND REQUIRES PROPER INDENTATION.

      For example, if you are looking at this file:

      def fct():
          print("Hello world")

      and you want to edit the file to read:

      def fct():
          print("Hello")
          print("world")

      you search string should be `Hello world` and your replace string should be `"Hello"\n    print("world")`
      (note the extra spaces before the print statement!).

      You could also get the same result by search for `    print("Hello world")` and replace with `    print("Hello")\n    print("world")`.

      RESPONSE FORMAT:
      Your shell prompt is formatted as follows:
      (Open file: <path>)
      (Current directory: <cwd>)
      bash-$

      First, you should _always_ include a general thought about what you're going to do next.
      Then, for every response, you must include exactly _ONE_ tool call/function call.

      Remember, you should always include a _SINGLE_ tool call/function call and then wait for a response from the shell before continuing with more discussion and commands. Everything you include in the DISCUSSION section will be saved for future reference.
      If you'd like to issue two commands at once, PLEASE DO NOT DO THAT! Please instead first submit just the first tool call, and then after receiving a response you'll be able to issue the second .
      Note that the environment does NOT support interactive session commands (e.g. python, vim), so please do not invoke them.
    instance_template: |-
      We're currently solving the following issue within our repository. Here's the issue text:
      ISSUE:
      {{problem_statement}}

      INSTRUCTIONS:
      Now, you're going to solve this issue on your own. Your terminal session has started and you're in the repository's root directory. You can use any bash commands or the special interface to help you. Edit all the files you need to and run any checks or tests that you want.
      Remember, YOU SHOULD ALWAYS INCLUDE EXACTLY ONE TOOL CALL/FUNCTION CALL PER RESPONSE.
      When you're satisfied with all of the changes you've made, you can submit your changes to the code base by simply running the submit command.
      Note however that you cannot use any interactive session commands (e.g. python, vim) in this environment, but you can write scripts and run them. E.g. you can write a python script and then run it with the python command.

      NOTE ABOUT THE EDIT COMMAND: Indentation really matters! When editing a file, make sure to insert appropriate indentation before each line!

      IMPORTANT TIPS:
      1. Always start by trying to replicate the bug that the issues discusses.
        If the issue includes code for reproducing the bug, we recommend that you re-implement that in your environment, and run it to make sure you can reproduce the bug.
        Then start trying to fix it.
        When you think you've fixed the bug, re-run the bug reproduction script to make sure that the bug has indeed been fixed.

        If the bug reproduction script does not print anything when it successfully runs, we recommend adding a print("Script completed successfully, no errors.") command at the end of the file,
        so that you can be sure that the script indeed ran fine all the way through.

      2. If you run a command and it doesn't work, try running a different command. A command that did not work once will not work the second time unless you modify it!

      3. If you open a file and need to get to an area around a specific line that is not in the first 100 lines, say line 583, don't just use the scroll_down command multiple times. Instead, use the goto 583 command. It's much quicker.

      4. If the bug reproduction script requires inputting/reading a specific file, such as buggy-input.png, and you'd like to understand how to input that file, conduct a search in the existing repo code, to see whether someone else has already done that. Do this by running the command: find_file "buggy-input.png" If that doesn't work, use the linux 'find' command.

      5. Always make sure to look at the currently open file and the current working directory (which appears right after the currently open file). The currently open file might be in a different directory than the working directory! Note that some commands, such as 'create', open files, so they might change the current open file.

      6. When editing files, it is easy to accidentally to write code with incorrect indentation or make other mistakes. Always check the code after you issue an edit to make sure that it reflects what you wanted to accomplish. If it didn't, issue another command to fix it.

      7. It may be necessary to install the repository from source before you can run code. Please think about how to install the environment from the repository directory if you need to do so.

      (Open file: {{open_file}})
      (Current directory: {{working_dir}})
      bash-$
    next_step_template: |-
      {{observation}}
      (Open file: {{open_file}})
      (Current directory: {{working_dir}})
      bash-$
    next_step_no_output_template: |-
      Your command ran successfully and did not produce any output.
      (Open file: {{open_file}})
      (Current directory: {{working_dir}})
      bash-$
    demonstration_template: |
      Here is a demonstration of how to correctly accomplish this task.
      It is included to show you how to correctly use the interface.
      You do not need to follow exactly what is done in the demonstration.
      --- DEMONSTRATION ---
      {{demonstration}}
      --- END OF DEMONSTRATION ---
    demonstrations:
    - trajectories/demonstrations/replay__marshmallow-code__marshmallow-1867__function_calling_replace_from_source/marshmallow-code__marshmallow-1867.traj
    put_demos_in_history: true
  tools:
    env_variables:
      WINDOW: 100
      OVERLAP: 2
    bundles:
      - path: tools/registry
      - path: tools/defaults
      - path: tools/search
      - path: tools/edit_replace
      - path: tools/submit
    enable_bash_tool: true
    parse_function:
      type: function_calling
  history_processors:
    - type: last_n_observations
      n: 5

As you can see, this is where all the templates are defined!

This file is loaded by default, but only when no --config options are specified. So to make sure that we get the default templates in the examples above that use --config, we should have added

--config config/default.yaml

in addition to all the other --config options for the two examples above.

Problem statements and union types

Operating in batch mode: Running on SWE-bench and other benchmark sets

If you want to run SWE-agent in batch mode on SWE-bench or another whole evaluation set, see benchmarking. This tutorial focuses on using SWE-agent on individual issues.

We've already seen a few examples of how to specify the problem to solve, namely

--problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1
--problem_statement.path=/path/to/problem.md
--problem_statement.text="..."

Each of these types of problems can have specific configuration options.

To understand how this works, we'll need to understand union types. Running sweagent run builds up a configuration object that essentially looks like this:

agent: AgentConfig
env: EnvironmentConfig
problem_statement: TextProblemStatement | GithubIssue | FileProblemStatement  # (1)!
  1. This is a union type, meaning that the problem statement can be one of the three types.

Each of these configuration objects has its own set of options.

So how do we know which configuration object to initialize? It's simple: Each of these types has a different set of required options (e.g., github_url is required for GithubIssue, but not for TextProblemStatement). SWE-agent will automatically select the correct configuration object based on the command line options you provide.

However, you can also explicitly specify the type of problem statement you want to use by adding a --problem_statement.type option.

Union type errors

If you have ever run a SWE-agent command and got a very long error message about various configuration options not working, union types are the reason. When everything works, we try to initialize each type in turn until we find one that works with your inputs (for example, stopping at TextProblemStatement if you provided --problem_statement.text). However, if none of them work, we throw an error that tells you why we could not initialize any of the types (so it will tell you that github_url is required for GithubIssue, even though you might not have tried to work on a GitHub issue at all).
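
The "try each type until one works, otherwise report every failure" behavior can be sketched with plain dataclasses. This is a simplified illustration and not how SWE-agent or Pydantic implement it: the classes below are stand-ins that carry only the one field mentioned in this tutorial, and `resolve_problem_statement` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TextProblemStatement:
    text: str


@dataclass
class GithubIssue:
    github_url: str


@dataclass
class FileProblemStatement:
    path: str


CANDIDATES = (TextProblemStatement, GithubIssue, FileProblemStatement)


def resolve_problem_statement(options: dict[str, Any]):
    """Return the first candidate type that accepts exactly these options."""
    errors = []
    for candidate in CANDIDATES:
        try:
            # Missing or unexpected fields raise TypeError for dataclasses.
            return candidate(**options)
        except TypeError as exc:
            errors.append(f"{candidate.__name__}: {exc}")
    # Nothing matched: report why *every* candidate failed.
    raise ValueError("No problem statement type matched:\n" + "\n".join(errors))


statement = resolve_problem_statement({"text": "Hey, can you fix all the bugs?"})
```

Calling `resolve_problem_statement({"path": "test", "github_url": "asdf"})` fails for all three stand-in types, so the raised error lists one failure per type. That is why a single wrong option can produce a long error message.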

Example union type errors

This is the output of running

sweagent run --problem_statement.path="test" --problem_statement.github_url="asdf"

╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Validation error                                                                                                  │
│                                                                                                                   │
│ The following errors are raised by Pydantic, trying to instantiate the configuration based on                     │
│ the merged configuration dictionary (see above).                                                                  │
│                                                                                                                   │
│ Every new indented block corresponds to a different error from Pydantic.                                          │
│ The first line of each block is the attribute that failed validation, the following lines are the error messages. │
│                                                                                                                   │
│ If you see many lines of errors, there are probably different ways to instantiate the same object (a union type). │
│ For example, there are different deployments with different options each. Pydantic is then trying                 │
│ one after the other and reporting the failures for each of them.                                                  │
│                                                                                                                   │
│ 8 validation errors for RunSingleConfig                                                                           │
│ agent.model                                                                                                       │
│   Field required }, input_type=dict]                                                                              │
│     For further information visit https://errors.pydantic.dev/2.9/v/missing                                       │
│ problem_statement.TextProblemStatement.text                                                                       │
│   Field required                                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/missing                                       │
│ problem_statement.TextProblemStatement.path                                                                       │
│   Extra inputs are not permitted                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden                               │
│ problem_statement.TextProblemStatement.github_url                                                                 │
│   Extra inputs are not permitted                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden                               │
│ problem_statement.GithubIssue.path                                                                                │
│   Extra inputs are not permitted                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden                               │
│ problem_statement.EmptyProblemStatement.path                                                                      │
│   Extra inputs are not permitted                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden                               │
│ problem_statement.EmptyProblemStatement.github_url                                                                │
│   Extra inputs are not permitted                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden                               │
│ problem_statement.FileProblemStatement.github_url                                                                 │
│   Extra inputs are not permitted                                                                                  │
│     For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden                               │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

If you want to read more about how this works, check out the pydantic docs.

Specifying the repository

The repository can be specified in a few different ways:

--env.repo.github_url=https://github.com/SWE-agent/test-repo
--env.repo.path=/path/to/repo

Again, these are union types, and each type has its own set of options.

Configuring the environment

We mainly recommend building a docker image with all the dependencies you need and then using it with --env.deployment.image. You can also execute additional commands before starting the agent with --env.post_startup_commands, which takes a list of commands, e.g.,

sweagent run \
    --agent.model.name=claude-3-5-sonnet-20241022 \
    --env.post_startup_commands='["pip install flake8"]' \  # (1)!
    ...
  1. Note the list syntax that is passed as a string using single ticks '. This is particularly important for zsh where [, ] have special meaning.
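
If you build the command from a script, quoting the list by hand is easy to get wrong. The sketch below uses Python's standard `json` and `shlex` modules to produce the quoted argument. It assumes that the option value is a JSON-style list, as in the example above; how SWE-agent parses the value internally is not shown here.

```python
import json
import shlex

post_startup_commands = ["pip install flake8"]

# Serialize the list, then quote it so that the shell passes it through unchanged.
value = json.dumps(post_startup_commands)
argument = "--env.post_startup_commands=" + shlex.quote(value)

# What the program receives after the shell has removed the quotes.
received = shlex.split(argument)[0]
```

Here `argument` is the text to put on the command line, and `received` is the single argument that reaches the program once the shell has processed the single quotes.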

Here's an example of a custom docker environment (it's also available in the repo as docker/tiny_test.Dockerfile):

tiny_test.Dockerfile
FROM python:3.11.10-bullseye  # (1)!

ARG DEBIAN_FRONTEND=noninteractive  # (2)!
ENV TZ=Etc/UTC  # (3)!

WORKDIR /

# SWE-ReX will always attempt to install its server into your docker container
# however, this takes a couple of seconds. If we already provide it in the image,
# this is much faster.
RUN pip install pipx
RUN pipx install swe-rex  # (4)!
RUN pipx ensurepath  # (5)!

RUN pip install flake8  # (6)!

SHELL ["/bin/bash", "-c"]
# This is where pipx installs things
ENV PATH="$PATH:/root/.local/bin/"  # (7)!
  1. This is the base image.
  2. This is to avoid any interactive prompts from the package manager.
  3. Again, this avoids interactive prompts
  4. SWE-ReX is our execution backend. We start a small server within the container, which receives commands from the agent and executes them.
  5. This ensures that the path where pipx installs things is in the $PATH variable.
  6. This is to install flake8, which is used by some of our edit tools.
  7. Unfortunately, step 5 sometimes still doesn't properly add the SWE-ReX server to the $PATH variable. So we do it here again.

Taking actions

  • You can use --actions.apply_patch_locally to have SWE-agent apply successful solution attempts to local files.
  • Alternatively, when running on a GitHub issue, you can have the agent automatically open a PR if the issue has been solved by supplying the --actions.open_pr flag. Please use this feature responsibly (on your own repositories or after careful consideration).

All action options

See RunSingleActionConfig for all action options.

Alternatively, you can always retrieve the patch that was generated by SWE-agent. Watch out for the following message in the log:

╭──────────────────────────── 🎉 Submission successful 🎉 ────────────────────────────╮
│ SWE-agent has produced a patch that it believes will solve the issue you submitted! │
│ Use the code snippet below to inspect or apply it!                                  │
╰─────────────────────────────────────────────────────────────────────────────────────╯

And follow the instructions below it:

 # The patch has been saved to your local filesystem at:
 PATCH_FILE_PATH='/Users/.../patches/05917d.patch'
 # Inspect it:
 cat "${PATCH_FILE_PATH}"
 # Apply it to a local repository:
 cd <your local repo root>
 git apply "${PATCH_FILE_PATH}"
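
If you apply patches from a script, it can help to do a dry run first. The following Python sketch wraps the same `git apply` call and precedes it with `git apply --check`, which verifies that the patch applies cleanly without changing any files. The function names are hypothetical, and the patch path should be absolute because the commands run inside the repository directory.

```python
import subprocess
from pathlib import Path


def build_git_apply_commands(patch_file: Path) -> list[list[str]]:
    """Return a dry-run check followed by the actual apply command."""
    return [
        ["git", "apply", "--check", str(patch_file)],
        ["git", "apply", str(patch_file)],
    ]


def apply_patch(repo_root: Path, patch_file: Path) -> None:
    """Apply the patch inside repo_root, stopping at the first failing command."""
    for command in build_git_apply_commands(patch_file):
        # check=True raises CalledProcessError if git exits with an error.
        subprocess.run(command, cwd=repo_root, check=True)
```

For example, `apply_patch(Path("<your local repo root>"), Path(PATCH_FILE_PATH))` mirrors the shell snippet above, where `PATCH_FILE_PATH` is the path printed in the log.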