History processor configuration
History processors can filter the history/trajectory to query the model. For example, a very simple history processor would be one that strips away old observations to reduce context when querying the model.
You can set them as follows:
agent:
history_processors:
- type: last_n_observations
n: 5
sweagent.agent.history_processors.DefaultHistoryProcessor
pydantic-model
Bases: BaseModel
Config:
extra
:forbid
Fields:
-
type
(Literal['default']
)
type
pydantic-field
type: Literal['default'] = 'default'
Do not change. Used for (de)serialization.
sweagent.agent.history_processors.LastNObservations
pydantic-model
Bases: BaseModel
Elide all but the last n observations or remove tagged observations.
This is our most classic history processor, used in the original paper to elide but the last 5 observations. Elided observations are replaced by "Old environment output: (n lines omitted)".
Typical configuration:
agent:
history_processors:
- type: last_n_observations
n: 5
as for example in use in the SWE-agent 0.7 config at https://github.com/SWE-agent/SWE-agent/blob/main/config/sweagent_0_7/07.yaml
For most use cases, you only need to set n
.
Note that using this history processor will break prompt caching (as the
history of every query will change every time due to the elided observations).
There are some workarounds possible with the polling
parameter.
However, most SotA models can now fit a lot of context, so generally this history processor is not always needed anymore.
Config:
extra
:forbid
Fields:
-
n
(int
) -
polling
(int
) -
always_remove_output_for_tags
(set[str]
) -
always_keep_output_for_tags
(set[str]
) -
type
(Literal['last_n_observations']
)
Validators:
-
validate_n
→n
n
pydantic-field
n: int
Number of observations to keep.
polling
pydantic-field
polling: int = 1
How many steps to keep between updating the number of observations to keep.
This is useful for caching, as we want to remove more and more messages, but every
time we change the history, we need to cache everything again.
Effectively, we will now keep between n
and n+polling
observations.
always_remove_output_for_tags
pydantic-field
always_remove_output_for_tags: set[str] = {'remove_output'}
Any observation with a tags
field containing one of these strings will be elided,
even if it is one of the last n observations.
always_keep_output_for_tags
pydantic-field
always_keep_output_for_tags: set[str] = {'keep_output'}
Any observation with a tags
field containing one of these strings will be kept,
even if it is not one of the last n observations.
type
pydantic-field
type: Literal['last_n_observations'] = 'last_n_observations'
Do not change. Used for (de)serialization.
validate_n
pydantic-validator
validate_n(n: int) -> int
Source code in sweagent/agent/history_processors.py
132 133 134 135 136 137 |
|
sweagent.agent.history_processors.TagToolCallObservations
pydantic-model
Bases: BaseModel
Adds tags to history items for specific tool calls.
Config:
extra
:forbid
Fields:
-
type
(Literal['tag_tool_call_observations']
) -
tags
(set[str]
) -
function_names
(set[str]
)
type
pydantic-field
type: Literal['tag_tool_call_observations'] = 'tag_tool_call_observations'
Do not change. Used for (de)serialization.
tags
pydantic-field
tags: set[str] = {'keep_output'}
Add the following tag to all observations matching the search criteria.
function_names
pydantic-field
function_names: set[str]
Only consider observations made by tools with these names.
sweagent.agent.history_processors.CacheControlHistoryProcessor
pydantic-model
Bases: BaseModel
This history processor adds manual cache control marks to the history. Use this when running with anthropic claude.
Config:
extra
:forbid
Fields:
-
type
(Literal['cache_control']
) -
last_n_messages
(int
) -
last_n_messages_offset
(int
) -
tagged_roles
(list[str]
)
type
pydantic-field
type: Literal['cache_control'] = 'cache_control'
Do not change. Used for (de)serialization.
last_n_messages
pydantic-field
last_n_messages: int = 2
Add cache control to the last n user messages (and clear it for anything else). In most cases this should be set to 2 (caching for multi-turn conversations). When resampling and running concurrent instances, you want to set it to 1. If set to <= 0, any set cache control will be removed from all messages.
last_n_messages_offset
pydantic-field
last_n_messages_offset: int = 0
E.g., set to 1 to start cache control after the second to last user message. This can be useful in rare cases, when you want to modify the last message after we've got the completion and you want to avoid cache mismatch.
tagged_roles
pydantic-field
tagged_roles: list[str] = ['user', 'tool']
Only add cache control to messages with these roles.
sweagent.agent.history_processors.RemoveRegex
pydantic-model
Bases: BaseModel
This history processor can remove arbitrary content from history items
Config:
extra
:forbid
Fields:
remove
pydantic-field
remove: list[str] = ['<diff>.*</diff>']
Regex patterns to remove from history items
keep_last
pydantic-field
keep_last: int = 0
Keep the last n history items unchanged
type
pydantic-field
type: Literal['remove_regex'] = 'remove_regex'
Do not change. Used for (de)serialization.