Skip to content

feat(symbolic): persist replayable counterexample artifacts#15217

Merged
mablr merged 22 commits into
masterfrom
mablr/symbolic_artifact_support
Jun 22, 2026
Merged

feat(symbolic): persist replayable counterexample artifacts#15217
mablr merged 22 commits into
masterfrom
mablr/symbolic_artifact_support

Conversation

@mablr

@mablr mablr commented Jun 18, 2026

Copy link
Copy Markdown
Member

Description

Closes OSS-349

Adds symbolic counterexample artifacts that can be replayed independently from the original symbolic run. When forge test --symbolic confirms a counterexample, Forge writes a schema-versioned JSON artifact under the project cache and surfaces its path in JSON output and non-JSON test output.

Artifacts can be replayed with:

forge test --replay-symbolic-artifact <path>

Replay supports both single-call symbolic counterexamples and invariant/stateful sequence counterexamples.

What changed

  • Persist confirmed symbolic counterexamples as foundry:symbolic.counterexample@v1 artifacts.
  • Add --replay-symbolic-artifact <PATH> to reconstruct the target contract/test/path and replay the stored calls.
  • Add normalized JSON references via counterexample_artifacts.
  • Surface artifact paths in human output as warnings.
  • Validate replay artifacts before execution, including schema header, empty calls, target/function matching, and single-call value constraints.
  • Make replay CI-friendly:
    • still-reproducing counterexample fails
    • fixed counterexample passes
    • stale or ambiguous target fails
  • Cover single-call, invariant sequence, replay skip, stale target, fixed bug, and schema behavior in tests.

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

@figtracer figtracer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed the replay paths and reproduced one config-drift edge case locally.

Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/result.rs Outdated
Comment thread crates/forge/src/runner.rs
Comment thread crates/forge/src/runner.rs
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/runner.rs Outdated
@mablr mablr marked this pull request as draft June 18, 2026 20:35
@mablr mablr marked this pull request as ready for review June 19, 2026 10:37
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/runner.rs
@mablr mablr requested a review from figtracer June 19, 2026 12:20
Comment thread crates/forge/src/cmd/test/mod.rs Outdated
@figtracer figtracer self-requested a review June 22, 2026 09:31
figtracer
figtracer previously approved these changes Jun 22, 2026

@grandizzy grandizzy left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check findings below

Comment thread crates/evm/evm/src/executors/invariant/shrink.rs Outdated
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/runner.rs
Comment thread crates/forge/src/runner.rs Outdated
Comment thread crates/forge/src/cmd/test/mod.rs
let filter = self.filter(&config)?;
let replay_symbolic_artifact = self.load_symbolic_artifact_replay()?;
if replay_symbolic_artifact.is_some() {
config.symbolic.enabled = true;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 non-blocking: sequence replay needs no solver, yet this enables symbolic mode globally for all replays. If enabling symbolic has side effects (solver validation, test discovery), prefer an explicit replay execution mode over flipping this flag.

/// Executes all the tests in the project.
///
/// See [`Self::compile_and_run`] for more details.
#[allow(clippy::too_many_arguments)]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 non-blocking: run_tests now needs #[allow(clippy::too_many_arguments)] (also at L1296/L1335). Since replay_symbolic_artifact already lives in TestRunnerConfig, consider a RunTestsOptions struct instead of threading another positional arg.

Comment thread crates/evm/evm/src/executors/invariant/mod.rs
Comment thread crates/forge/src/result.rs
"raw_args"
],
"properties": {
"warp": {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 non-blocking: the schema requires compact 0x hex for warp/roll/value, but the loader doesn't validate against the schema and Alloy's U256 deserializer may accept other forms — a schema-contract mismatch on input. Add a producer test that Some(U256::ZERO) serializes to "0x0" (schema allows 0x0, forbids leading-zero forms). Also note the external HTTPS $refs mean the offline test only checks ref existence, not real validation.

@grandizzy grandizzy dismissed their stale review June 22, 2026 11:30

Re-reviewed after latest commits (2d56330): blockers 1, 2, and 4 are fixed and blocker 3 is mitigated. Dismissing in favor of an approval.

@grandizzy grandizzy left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed after the latest commits. All four earlier blockers are resolved or mitigated:

  • Single-call replay semantics (a3ffefe): now gates on replay.status == Confirmed, rejects warp/roll, uses call_raw + is_raw_call_success to mirror the original symbolic path, decodes the live revert reason (falling back to the artifact reason only when empty), and routes EvmError::Skip to single_skip. ✅
  • check_sequence executed-call counts (96ea2e7): independent calls_executed counter replaces idx + 1/sequence.len(). ✅
  • Artifact filename identity (415d8b5): 16-byte keccak over the full contract\0value\0kind identity, with a collision test. ✅
  • Double replay_persisted_call_sequence: still executes twice, but the second pass now recomputes metrics for the (possibly changed) sequence returned by replay_error, so it's a minor efficiency follow-up rather than a correctness issue. ✅ (mitigated)

Most non-blocking items were also addressed: #[serde(deny_unknown_fields)] on the artifact structs, loader rejecting non-confirmed artifacts, a U256::ZERO compact-hex test, doc-comments scoping execute_tx/execute_tx_and_register_created to invariant replay, and the legacy singular counterexample_artifact field documented in favor of the plural.

Remaining nits are optional follow-ups: config.symbolic.enabled = true for sequence replay, the run_tests argument count, and true offline schema validation for the external $refs.

LGTM, nice work.

@figtracer figtracer requested review from figtracer and grandizzy June 22, 2026 13:42
@mablr mablr merged commit 6896e94 into master Jun 22, 2026
18 of 19 checks passed
@mablr mablr deleted the mablr/symbolic_artifact_support branch June 22, 2026 13:45
@github-project-automation github-project-automation Bot moved this to Done in Foundry Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants