MAINT: Rapid response Scenario #1622

Open
rlundeen2 wants to merge 23 commits into microsoft:main from rlundeen2:users/rlundeen/2026_04_16_rapid_response

@rlundeen2 rlundeen2 commented Apr 16, 2026

This PR refactors ContentHarms into RapidResponse (with ContentHarms kept as a deprecated alias), and introduces the foundational infrastructure for a central technique registry.

Using this pattern, we can register scenario techniques centrally for scenarios to use and share.

  1. TagQuery: a composable, frozen boolean predicate (&, |, ~) for filtering tagged registry objects.
  2. AttackTechniqueSpec: a declarative data class describing one registrable technique (class, tags, optional adversarial auto-detection, optional extra_kwargs_builder callback). Scenario strategies can be built from these specs.
  3. SCENARIO_TECHNIQUES catalog: a single Python list of the AttackTechniqueSpecs used by RapidResponse. The list is expected to grow.
  4. register_from_specs(): bulk, idempotent registration into the singleton AttackTechniqueRegistry.
  5. build_factory_from_spec(): auto-detects adversarial support via inspect.signature, so no manual flag is needed per technique.
  6. build_strategy_class_from_specs(): dynamically generates a ScenarioStrategy enum from specs + TagQuery aggregates (rather than hardcoding it).
  7. RapidResponse.get_attack_technique_factories(): triggers registration and returns all factories from the registry.
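
Items 4 and 5 can be illustrated with a small standalone sketch. It does not use PyRIT: `DemoSpec`, `DemoRegistry`, `accepts_adversarial_config`, and the two toy attack classes are hypothetical stand-ins, and the real `register_from_specs()` and `build_factory_from_spec()` may well differ in detail.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable


class SingleTurnDemo:
    """Toy attack that takes no adversarial config."""

    def __init__(self, *, objective_target: str) -> None:
        self.objective_target = objective_target


class MultiTurnDemo:
    """Toy attack whose constructor accepts an adversarial config."""

    def __init__(self, *, objective_target: str, attack_adversarial_config: Any = None) -> None:
        self.objective_target = objective_target
        self.attack_adversarial_config = attack_adversarial_config


@dataclass(frozen=True)
class DemoSpec:
    # Hypothetical, simplified stand-in for a technique spec.
    name: str
    attack_class: type
    tags: frozenset[str] = field(default_factory=frozenset)


def accepts_adversarial_config(attack_class: type) -> bool:
    """Detect adversarial support from the constructor signature."""
    params = inspect.signature(attack_class.__init__).parameters
    return "attack_adversarial_config" in params


class DemoRegistry:
    def __init__(self) -> None:
        self._factories: dict[str, Callable[..., Any]] = {}

    def register_from_specs(self, specs: list[DemoSpec]) -> None:
        for spec in specs:
            # Idempotent: a name that is already registered is left untouched.
            self._factories.setdefault(spec.name, spec.attack_class)

    def names(self) -> list[str]:
        return sorted(self._factories)


specs = [DemoSpec("single", SingleTurnDemo), DemoSpec("multi", MultiTurnDemo)]
registry = DemoRegistry()
registry.register_from_specs(specs)
registry.register_from_specs(specs)  # second call changes nothing
```

Because detection reads the constructor signature, adding a new technique class needs no extra flag in its spec under this approach.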

@rlundeen2 rlundeen2 changed the title from "MAINT Breaking: Rapid response Scenario" to "MAINT: Rapid response Scenario" Apr 16, 2026
*,
objective_target: PromptTarget,
attack_scoring_config: AttackScoringConfig,
attack_adversarial_config: AttackAdversarialConfig | None = None,
Contributor Author

@rlundeen2 rlundeen2 Apr 16, 2026

Naming update unrelated to this PR: the previous name was confusing (and new), because the factory already has these set and here we are overriding them.

Contributor

fdubut commented Apr 20, 2026

I know this is still WIP 😃 but flagging early that my gut feeling is that rapid response will be effectively the union of a bunch of more atomic scenarios instead of being one mega-scenario covering everything.

rlundeen2 and others added 11 commits April 20, 2026 14:45
- content_harms.py: keep thin alias (ours), discard main's full class
- rapid_response.py: update to new _scenario_strategies API from PR microsoft#1627
- test_content_harms.py: removed (replaced by test_rapid_response.py)
- test_rapid_response.py: update _scenario_composites -> _scenario_strategies

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add display_group field to AtomicAttack (defaults to atomic_attack_name)
- Add display_group_map and get_display_groups() to ScenarioResult
- Update console_printer to aggregate by display_group
- Rename _build_atomic_attack_name -> _build_display_group in Scenario base
- RapidResponse: unique compound atomic_attack_name per technique x dataset
- Update scenarios.instructions.md for _scenario_strategies and _build_display_group
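
A rough sketch of the display-group idea described in this commit, assuming results are aggregated by summing counts per group. `DemoAtomicAttack`, `success_rate_by_group`, and the attack and group names are hypothetical; the actual `AtomicAttack` and `ScenarioResult` fields may look different.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DemoAtomicAttack:
    atomic_attack_name: str
    display_group: str = ""
    successes: int = 0
    total: int = 0

    def __post_init__(self) -> None:
        # Default the display group to the attack name, as the commit describes.
        if not self.display_group:
            self.display_group = self.atomic_attack_name


def success_rate_by_group(attacks: list[DemoAtomicAttack]) -> dict[str, float]:
    """Aggregate per-attack counts into one success rate per display group."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for attack in attacks:
        totals[attack.display_group][0] += attack.successes
        totals[attack.display_group][1] += attack.total
    return {group: (s / t if t else 0.0) for group, (s, t) in totals.items()}


attacks = [
    DemoAtomicAttack("prompt_sending_hate", display_group="hate", successes=1, total=4),
    DemoAtomicAttack("role_play_hate", display_group="hate", successes=3, total=4),
    DemoAtomicAttack("prompt_sending_violence", successes=0, total=2),
]
rates = success_rate_by_group(attacks)
```

Here the unique compound names keep each technique x dataset pair distinct, while the group key controls how rows are rolled up for printing.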

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- register_scenario_techniques() always uses default adversarial target
- Custom adversarial targets flow through factory.create() overrides
- Remove _apply_display_groups helper (display_group_map now persisted)
- Persist display_group_map in ScenarioResultEntry for DB round-trips
- Add accepts_scorer_override field to TechniqueSpec (TAP=False)
- Replace 'tap' magic string check with registry.accepts_scorer_override()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…registry specs

- Add 'core' and 'default' tags to SCENARIO_TECHNIQUES entries
- Add build_strategy_class_from_specs() to AttackTechniqueRegistry
  that creates ScenarioStrategy subclasses from TechniqueSpec lists
- Delete static RapidResponseStrategy enum; generate dynamically
  in RapidResponse.get_strategy_class() with lazy caching
- Uses spec list (pure data), not mutable registry — no side effects
- Update airt/__init__.py and content_harms.py with __getattr__ for
  lazy resolution of dynamic strategy class
- Update all test references to use _strategy_class() helper
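
The dynamic enum generation with lazy caching could look roughly like the sketch below. It uses only the standard library; `DEMO_SPECS`, `AGGREGATE_TAGS`, `expand`, and the technique names other than TAP are made up for illustration, and the real `build_strategy_class_from_specs()` builds a `ScenarioStrategy` subclass rather than a plain `Enum`.

```python
from enum import Enum
from functools import lru_cache

# Pure data: the enum is derived from this list, not from a mutable registry.
DEMO_SPECS = [
    {"name": "prompt_sending", "tags": {"core", "single_turn", "default"}},
    {"name": "role_play", "tags": {"core", "single_turn"}},
    {"name": "tap", "tags": {"core", "multi_turn"}},
]
AGGREGATE_TAGS = ("single_turn", "multi_turn", "default")


@lru_cache(maxsize=1)
def get_strategy_class() -> type[Enum]:
    """Build the enum once from the spec list and cache it."""
    members = {"ALL": "all"}
    members.update({tag.upper(): tag for tag in AGGREGATE_TAGS})
    members.update({spec["name"].upper(): spec["name"] for spec in DEMO_SPECS})
    # Functional Enum API: member names map to their string values.
    return Enum("DemoStrategy", members)


def expand(strategy: Enum) -> list[str]:
    """Resolve an aggregate member to the technique names it covers."""
    if strategy.value == "all":
        return [spec["name"] for spec in DEMO_SPECS]
    if strategy.value in AGGREGATE_TAGS:
        return [s["name"] for s in DEMO_SPECS if strategy.value in s["tags"]]
    return [strategy.value]


Strategy = get_strategy_class()
```

Static type checkers generally cannot see members created this way, which is presumably why a later commit adds `type: ignore` comments for the dynamic enum construction.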

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Aligns with AttackTechniqueRegistry, AttackTechniqueFactory, AttackTechnique.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move factory.create() inside the dataset loop so each AtomicAttack gets
an independent attack_technique instance. Previously, a single instance
was shared across all datasets for a technique, which could cause state
leakage between concurrent attack executions.

Benchmark: factory.create() costs ~6.5ms each, so 28 calls (4 techniques
x 7 datasets) add only ~180ms, which is negligible at current scale.
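
The state-leakage concern can be shown with a toy example. `DemoTechnique` and `DemoFactory` are hypothetical; the only assumption is that a technique instance keeps some per-run state.

```python
class DemoTechnique:
    """Toy stateful technique: it accumulates the prompts it has seen."""

    def __init__(self) -> None:
        self.history: list[str] = []

    def run(self, prompt: str) -> int:
        self.history.append(prompt)
        return len(self.history)


class DemoFactory:
    def create(self) -> DemoTechnique:
        return DemoTechnique()


factory = DemoFactory()
datasets = {"hate": ["p1", "p2"], "violence": ["p3"]}

# Before: one instance shared across datasets, so history leaks between them.
shared = factory.create()
shared_attacks = {name: shared for name in datasets}

# After: create() is called inside the dataset loop, one instance per dataset.
isolated_attacks = {name: factory.create() for name in datasets}

for name, prompts in datasets.items():
    for prompt in prompts:
        shared_attacks[name].run(prompt)
        isolated_attacks[name].run(prompt)
```

With the shared instance, the "violence" attack ends up holding prompts from "hate" as well; with per-dataset instances each history stays separate.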

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce TagQuery frozen dataclass with AND/OR/NOT predicates and
&, |, ~ operators for arbitrary boolean composition. This enables
queries like:

  TagQuery(include_all={'core'}) & TagQuery(include_any={'A', 'B'})

which matches items tagged both 'core' AND at least one of 'A'/'B'.

- New file: pyrit/registry/tag_query.py
- Update build_strategy_class_from_specs to use dict[str, TagQuery]
- Update rapid_response.py aggregate_tags to use TagQuery
- 17 unit tests for TagQuery
- Export from pyrit/registry/__init__.py
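
A minimal sketch of how such a composable predicate can be built, assuming a tree of leaf and operator nodes. `DemoTagQuery` is a hypothetical stand-in, not the code in pyrit/registry/tag_query.py; it takes frozensets directly and omits the `__post_init__` validation mentioned in a later commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoTagQuery:
    include_all: frozenset[str] = frozenset()
    include_any: frozenset[str] = frozenset()
    op: str = "leaf"  # "leaf", "and", "or", or "not"
    children: tuple["DemoTagQuery", ...] = ()

    def __and__(self, other: "DemoTagQuery") -> "DemoTagQuery":
        return DemoTagQuery(op="and", children=(self, other))

    def __or__(self, other: "DemoTagQuery") -> "DemoTagQuery":
        return DemoTagQuery(op="or", children=(self, other))

    def __invert__(self) -> "DemoTagQuery":
        return DemoTagQuery(op="not", children=(self,))

    def matches(self, tags: set[str]) -> bool:
        if self.op == "and":
            return all(child.matches(tags) for child in self.children)
        if self.op == "or":
            return any(child.matches(tags) for child in self.children)
        if self.op == "not":
            return not self.children[0].matches(tags)
        # Leaf: every include_all tag is present and, if include_any is
        # non-empty, at least one of its tags is present.
        has_all = self.include_all <= tags
        has_any = not self.include_any or bool(self.include_any & tags)
        return has_all and has_any


query = DemoTagQuery(include_all=frozenset({"core"})) & DemoTagQuery(
    include_any=frozenset({"A", "B"})
)
```

Keeping the dataclass frozen means a composed query is a new immutable value, so shared aggregate definitions cannot be mutated by accident.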

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move Callable, TagQuery, PromptChatTarget, ScenarioStrategy, TrueFalseScorer
  into TYPE_CHECKING blocks (TC001/TC003)
- Add Returns/Raises sections to docstrings (DOC201/DOC501)
- Add docstrings for public methods (D102)
- Make Taggable protocol read-only (fixes frozen dataclass compat)
- Add __post_init__ validation to TagQuery with tests
- Simplify _matches_leaf return (SIM103)
- Fix test lint: rename S to strat (N806), lambda to def (E731),
  lowercase test name (N802), fix import ordering (I001)
- Add type: ignore comments for dynamic enum construction (mypy)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 marked this pull request as ready for review April 21, 2026 18:32
Contributor

@ValbuenaVC ValbuenaVC left a comment

TL;DR Looks good! Some comments and a rough suggestion for scenarios long-term below.

Suggestion: this PR touches on something I've noticed with scenarios for a while now. Not a blocking comment since it's out of scope but I think scenarios need something like a state machine to standardize the relationship between two important components: each timestep of a scenario and how to transition between steps.

Like Frederic said it's likelier that we'll have a ton of rapid response scenarios that are each semantically very different but are expected to work the same way. Sharing the scenario across them makes much more sense if we keep the same transition between steps (the "rapid" part), but change what the steps are (the "response" part). How we do this I think depends a lot on how we see scenarios changing over time but something like ScenarioStep and ScenarioStrategy seem like natural places to start. Maybe something like this (very, very rough) idea:

# ScenarioStep, StateEnum, and BarStep are proposed names that do not exist yet.
class FooStep(ScenarioStep):
    ...

class ContentHarmsStep(ScenarioStep):
    """Each timestep of a scenario owns its valid inputs, outputs, and lifecycle."""
    outputs = ["safety_violation", ...]
    inputs = [AttackTechniqueA, ...]
    ...

    # Granularity: attack-level or attack-step-level?
    async def process_async(self, technique: AttackTechnique) -> str:
        # Hand-wavy and wrong types; this would need much stronger contracts
        result = await technique.run_attack()
        if self._has_safety_violation(result):
            return "safety_violation"
        ...


class RapidResponseStrategy(ScenarioStrategy):
    """
    Strategies now focus on defining a policy and valid states for the scenario
    overall. You can recycle steps and keep their transitions the same to support
    rapid response situations.
    """

    valid_step_types = [ContentHarmsStep, FooStep, BarStep]

    def __init__(self):
        self.state = StateEnum.UNINITIALIZED
        # Each handler runs one phase and returns the next state
        self.policy = {
            StateEnum.UNINITIALIZED: self._start_scenario,
            StateEnum.OPENING_PHASE: self._opening_phase,
            # ... one entry per remaining state
        }

    def event_loop(self):
        while self.state != StateEnum.COMPLETE:
            self.state = self.policy[self.state]()
    ...

class MyCustomScenario(Scenario):
    ...

    # This would be inherited from scenario so the user can focus on tweaking the event loop
    # and inner state that's unique to the scenario rather than managing its lifecycle
    async def run_async(self):
        self.strategy.event_loop()

Comment thread pyrit/registry/tag_query.py
Comment thread pyrit/registry/tag_query.py
Comment thread pyrit/registry/tag_query.py Outdated
Comment thread pyrit/scenario/scenarios/airt/rapid_response.py Outdated
Comment thread pyrit/scenario/scenarios/airt/rapid_response.py Outdated
Comment thread pyrit/scenario/core/scenario_techniques.py Outdated
Comment thread pyrit/scenario/core/__init__.py Outdated
@rlundeen2
Contributor Author

Great comment!

After this PR, RapidResponse is essentially declarative; it only specifies techniques, datasets, and defaults. The execution lifecycle (factory resolution, technique × dataset loop, scorer overrides, resume/retry) is all inherited from the base class. All our existing scenarios could be written like this so it simplifies our existing stuff a bunch.

I think the state machine becomes compelling when we need conditional transitions - e.g., "broad sweep first, then focus multi-turn attacks on categories that showed weakness." But if we want a per-attack escalation, a simpler _on_attack_complete_async hook in the base class could handle "if this succeeded, probe deeper" without the state machine overhead. For full multi-phase orchestration where the entire scenario pivots based on aggregate results, the policy/state pattern you're sketching would be the right abstraction. But I think we'd get a ton of value just with the hook.

Worth revisiting when we hit a concrete use case that needs branching; right now all scenarios are flat execution.
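
To make the hook idea concrete, here is a minimal sketch of a flat loop with a per-attack completion hook. Everything in it is hypothetical: `DemoScenario`, `DemoResult`, the follow-up queue, and the hook's return type are assumptions for illustration, not the Scenario base class as it exists.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class DemoResult:
    technique: str
    dataset: str
    success: bool


class DemoScenario:
    """Flat technique x dataset loop with an optional per-attack hook."""

    def __init__(self, techniques: list[str], datasets: list[str]) -> None:
        self.techniques = techniques
        self.datasets = datasets
        self.results: list[DemoResult] = []

    async def _run_attack_async(self, technique: str, dataset: str) -> DemoResult:
        # Stand-in for a real attack: pretend only "role_play" on "hate" succeeds.
        return DemoResult(technique, dataset, technique == "role_play" and dataset == "hate")

    async def _on_attack_complete_async(self, result: DemoResult) -> list[tuple[str, str]]:
        """Hook: return follow-up (technique, dataset) pairs. Default is none."""
        return []

    async def run_async(self) -> list[DemoResult]:
        queue = [(t, d) for t in self.techniques for d in self.datasets]
        while queue:
            technique, dataset = queue.pop(0)
            result = await self._run_attack_async(technique, dataset)
            self.results.append(result)
            queue.extend(await self._on_attack_complete_async(result))
        return self.results


class EscalatingScenario(DemoScenario):
    async def _on_attack_complete_async(self, result: DemoResult) -> list[tuple[str, str]]:
        # "If this succeeded, probe deeper": follow a hit with a multi-turn attack.
        if result.success and result.technique != "tap":
            return [("tap", result.dataset)]
        return []


results = asyncio.run(EscalatingScenario(["role_play"], ["hate", "violence"]).run_async())
```

The default hook returns nothing, so existing flat scenarios would behave as before; only a subclass that overrides it gets escalation.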

Comment thread pyrit/scenario/scenarios/airt/content_harms.py Outdated
Comment thread pyrit/scenario/scenarios/airt/content_harms.py Outdated
Comment thread pyrit/scenario/core/scenario.py
Comment thread pyrit/registry/object_registries/attack_technique_registry.py
Comment thread doc/scanner/airt.py Outdated
extra_kwargs: Static extra keyword arguments forwarded to the attack
constructor. Must not contain ``attack_adversarial_config`` (use
``adversarial_chat`` instead).
accepts_scorer_override: Whether the technique accepts a scenario-level
Contributor

are extra_kwargs supposed to be attack specific (like tree_width)?

Contributor

in general, i think this AttackTechniqueSpec is a good idea but am confused with the input for it

Contributor Author

Yeah, it's a tough bridge because we're trying to declare something that is live (thanks, adversarial chat, for making it complicated). But I tried to update the docs to be clearer.

Contributor

Do we have a way to validate the kwargs are all present / valid ?

Contributor Author

Not a GREAT way, until it's actually being run.

I keep saying this, but there is a tension between being able to tell things (like the command line) which techniques exist and actually instantiating those techniques at run time.

But maybe we could have a unit test that iterates through them all at least. It isn't perfect but it can at least catch bad configurations.

I'll include that in this PR.
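
One possible shape for that check, as a sketch: compare each spec's extra kwargs against the attack constructor's signature without instantiating anything. `DemoSpec`, `DemoTreeAttack`, `invalid_kwargs`, and `check_specs` are hypothetical names; in a real test the loop would run over the actual spec list.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any


class DemoTreeAttack:
    def __init__(self, *, objective_target: str, tree_width: int = 3) -> None:
        self.objective_target = objective_target
        self.tree_width = tree_width


@dataclass(frozen=True)
class DemoSpec:
    name: str
    attack_class: type
    extra_kwargs: dict[str, Any] = field(default_factory=dict)


def invalid_kwargs(spec: DemoSpec) -> list[str]:
    """Return extra_kwargs keys that the attack constructor would reject."""
    params = inspect.signature(spec.attack_class.__init__).parameters
    # A **kwargs parameter accepts anything, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(key for key in spec.extra_kwargs if key not in params)


def check_specs(specs: list[DemoSpec]) -> None:
    """Fail loudly on the first spec with unknown constructor kwargs."""
    for spec in specs:
        bad = invalid_kwargs(spec)
        assert not bad, f"{spec.name}: unknown constructor kwargs {bad}"


good = DemoSpec("tap", DemoTreeAttack, {"tree_width": 4})
typo = DemoSpec("tap_typo", DemoTreeAttack, {"tree_widht": 4})
check_specs([good])
```

This only catches misspelled or unsupported keyword names. It says nothing about whether the values are valid, which still surfaces only at run time.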

Comment thread pyrit/registry/object_registries/attack_technique_registry.py Outdated
Comment thread pyrit/registry/tag_query.py
Comment thread pyrit/registry/object_registries/attack_technique_registry.py
rlundeen2 and others added 3 commits April 22, 2026 09:01
…ix spellings

- Promote factory-based _get_atomic_attacks_async from RapidResponse to Scenario base class
- Remove redundant RapidResponse._get_attack_technique_factories override
- Update doc examples to follow RapidResponse pattern (no override needed)
- Fix British spellings (behaviour->behavior, recognised->recognized)
- Fix mypy errors with cast(TrueFalseScorer, ...)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…4_16_rapid_response

# Conflicts:
#	pyrit/scenario/scenarios/airt/content_harms.py
@@ -147,18 +148,20 @@ async def print_summary_async(self, result: ScenarioResult) -> None:

# Per-strategy breakdown
self._print_section_header("Per-Strategy Breakdown")
Contributor

Per-Group Breakdown

Contributor

@ValbuenaVC ValbuenaVC left a comment

Minor comments, looks good!

ValueError: If a spec declares ``adversarial_chat_key`` but the key
is not found in ``TargetRegistry``.
"""
from pyrit.registry import TargetRegistry
Contributor

nit: why is this import here

Resolves the default adversarial target, bakes it into the specs that
require it, then registers the resulting factories.
"""
from pyrit.registry.object_registries.attack_technique_registry import AttackTechniqueRegistry
Contributor

nit: same


4 participants