|
| 1 | +<!-- |
| 2 | +SPDX-FileCopyrightText: © 2025 Phala Network <dstack@phala.network> |
| 3 | +SPDX-License-Identifier: Apache-2.0 |
| 4 | +--> |
| 5 | + |
| 6 | +# Formal Verification Plan — `kms/auth-eth` |
| 7 | + |
| 8 | +A layered approach: cheap static analysis → SMTChecker invariants → Halmos |
| 9 | +symbolic tests → (optional) Certora. Each layer is independently mergeable. |
| 10 | + |
| 11 | +The **formal specification** these layers verify against lives in |
| 12 | +[`specification.md`](./specification.md). The spec is normative — code |
| 13 | +and tests track the spec, not the other way around. |
| 14 | + |
| 15 | +Reference: <https://github.com/leonardoalt/ethereum_formal_verification_overview> |
| 16 | + |
| 17 | +## Targets |
| 18 | + |
| 19 | +The two contracts under verification are authorization gates for TEE workloads; |
| 20 | +correctness matters more than gas. |
| 21 | + |
| 22 | +- `contracts/DstackApp.sol` (209 LOC) — per-app boot authorization |
| 23 | +- `contracts/DstackKms.sol` (276 LOC) — KMS whitelist + factory |
| 24 | +- `contracts/IAppAuth.sol`, `contracts/IAppAuthBasicManagement.sol` — interfaces |
| 25 | + |
| 26 | +## Layer 0 — Slither (static analysis) |
| 27 | + |
| 28 | +Not formal verification but the standard first pass. Catches reentrancy, |
| 29 | +shadowing, uninitialized state, unsafe `delegatecall`, etc. |
| 30 | + |
| 31 | +- [ ] Install slither-analyzer (Python via pipx or as a uv tool) |
| 32 | +- [ ] Add `slither.config.json` at `kms/auth-eth/` excluding `lib/`, `out/`, |
| 33 | + `cache/`, `node_modules/`, `test/` (focus on production contracts) |
| 34 | +- [ ] Run baseline `slither contracts/` and triage findings into: |
| 35 | + - real → fix |
| 36 | + - false-positive / wontfix → suppress with `// slither-disable-next-line` |
| 37 | + and a justifying comment |
| 38 | +<!-- CI integration intentionally not added — see "What's intentionally not |
| 39 | +in scope" below. Slither runs locally before risky changes. --> |
| 40 | +- [x] Slither runs locally; no CI gate (intentional) |
| 41 | +- [ ] Update `kms/auth-eth/README.md` with one line on how to run Slither locally |
| 42 | + |
| 43 | +**Effort:** ~1h. **Owner:** _tbd_. |
| 44 | + |
| 45 | +## Layer 1 — SMTChecker (deferred) |
| 46 | + |
| 47 | +**Status: deferred.** Originally planned as the cheap free tier, but the |
| 48 | +practical setup cost is high enough that it's not worth doing before Halmos. |
| 49 | + |
| 50 | +What was tried (and why it fails to deliver value here): |
| 51 | + |
| 52 | +1. **CHC engine + `solvers = ["z3"]`:** solc's binary distributions on |
| 53 | + foundry's `svm` (Solidity version manager) are not compiled with z3 |
| 54 | + linkage. Result: every analysis emits `Warning (7649): CHC analysis was |
| 55 | + not possible since no Horn solver was found and enabled`. |
| 56 | +2. **CHC engine + `solvers = ["smtlib2"]`:** solc emits SMT-LIB2 queries |
| 57 | + but does not auto-spawn an external solver. Without a wrapper, nothing |
| 58 | + actually runs. |
| 59 | +3. **BMC engine:** runs, but on our OZ-upgradeable contracts emits only |
| 60 | + `Warning (5724): SMTChecker: N unsupported language feature(s)` — |
| 61 | + delegatecall, complex modifiers, and proxy patterns aren't in scope. |
| 62 | + Net useful findings: zero. |
| 63 | + |
| 64 | +To enable SMTChecker properly we'd need to either: |
| 65 | +- build solc from source with `USE_Z3=1`, ship the binary, point forge at it |
| 66 | + (via `solc = "/path/to/our/solc"` in `foundry.toml`); |
| 67 | +- or run solc in a container image like `ethereum/solc:0.8.24-z3` and wire |
| 68 | + it into CI. |
| 69 | + |
| 70 | +Either path is significant infra work. Halmos (Layer 2) covers everything |
| 71 | +SMTChecker would prove on our contracts (asserts, overflows, owner-gated |
| 72 | +mutations) while also handling the symbolic-input cases SMTChecker can't |
| 73 | +(TCB string compare, `isAppAllowed` decision table), using a solver that's |
| 74 | +already a Foundry-native install. |
| 75 | + |
| 76 | +Revisit Layer 1 only if we want defense-in-depth alongside Halmos. |
| 77 | + |
| 78 | +## Layer 2 — Halmos symbolic tests |
| 79 | + |
| 80 | +[Halmos](https://github.com/a16z/halmos) runs Foundry-style tests with all |
| 81 | +function arguments as symbolic variables. Sweet spot for our contracts: we |
| 82 | +reuse the existing `.t.sol` scaffolding and OZ-foundry-upgrades plugin. |
| 83 | + |
| 84 | +### Setup |
| 85 | + |
| 86 | +- [x] `pipx install halmos` documented in `README.md` |
| 87 | +- [ ] No CI integration (intentional — see "What's intentionally not in |
| 88 | + scope" below) |
| 89 | + |
| 90 | +### What we verify |
| 91 | + |
| 92 | +All tests are **single-call symbolic** (each `check_*` proves a property |
| 93 | +over symbolic inputs to one call). One test — |
| 94 | +`check_UpgradesDisabled_StepPreservation` — is the inductive *step* of a |
| 95 | +cross-transaction property (INV-1); see its entry and the note in |
| 96 | +"What we deliberately do not verify here" for the precise scope. |
| 97 | + |
| 98 | +We explicitly avoid the "symbolic clothing" failure mode where each |
| 99 | +owner-gated function gets its own `_OnlyOwner` test — those degenerate |
| 100 | +to a fuzz test of `OwnableUpgradeable.onlyOwner` (upstream-tested), |
| 101 | +and Halmos adds no information over bounded fuzzing. Where the spec |
| 102 | +says `pre: msg.sender == owner()`, we trust the OZ modifier. |
| 103 | + |
| 104 | +`test/DstackApp.symbolic.t.sol`: |
| 105 | +- [x] `check_DisableUpgrades_BlocksNextUpgrade` — after `disableUpgrades()`, |
| 106 | + the *next* upgrade attempt reverts for any caller / impl / init data. |
| 107 | +- [x] `check_UpgradesDisabled_StepPreservation` — INV-1 inductive step: |
| 108 | + from the canonical disabled state, no single call to any of the |
| 109 | + enumerated externally-callable mutating functions (symbolic |
| 110 | + selector + symbolic args + symbolic caller), issued against the |
| 111 | + *proxy*, flips `_upgradesDisabled` back to false. Validated by |
| 112 | + mutation testing: a permissionless flag-reset and an owner-callable |
| 113 | + flag-reset are both caught. Scope caveat below. |
| 114 | +- [x] `check_Initialize5Arg_DefaultsTcbToFalse` and `_HonorsTcbFlag` |
| 115 | + (6-arg) — initializer storage-layout coverage. Not symbolically |
| 116 | + stronger than bounded fuzz for the assertion they make; they're |
| 117 | + kept because they're cheap and would catch a slot-shift regression |
| 118 | + faster than a unit test would. |
| 119 | +- [x] `check_Initialize_OnceOnly` — after setUp's successful 5-arg |
| 120 | + init, both `initialize` overloads revert for any inputs. |
| 121 | + Verifies INV-3. |
| 122 | + |
| 123 | +Single-call (`test/DstackKms.symbolic.t.sol`): |
| 124 | +- [x] `check_RegisterApp_AnyCallerCanRegisterNonZeroAddress` + `_RejectsZeroAddress` |
| 125 | + — codifies the permissionless-by-design behavior; see "Findings" |
| 126 | +- [x] `check_IsAppAllowed_RejectsUnregisteredApp` / `_RejectsUnknownOsImage` |
| 127 | + — fully symbolic `AppBootInfo`; the failing gate produces the |
| 128 | + right rejection reason without delegating |
| 129 | +- [x] `check_IsAppAllowed_DelegatesFaithfully` — when both KMS gates |
| 130 | + pass, the outer return equals the registered `IAppAuth`'s return. |
| 131 | + Uses `MockConfigurableApp` whose `isAppAllowed` returns a |
| 132 | + symbolic boolean (both branches explored). Caveat: the mock has |
| 133 | + a single observable behavior shape; it does not universally |
| 134 | + quantify over all possible registered contracts. |
| 135 | +- [x] `check_IsAppAllowed_PropagatesMockRevert` — a reverting |
| 136 | + `MockRevertingApp` makes the outer `kms.isAppAllowed` revert, |
| 137 | + not return `(false, …)`. Spec §5.1. |
| 138 | +- [x] `check_IsKmsAllowed_RejectsUnknownMr` and `_RejectsUnknownDevice` |
| 139 | + — short-circuit gates for the `kmsAllowed*` whitelists |
| 140 | +- [x] `check_DeployAndRegisterApp_PostState` — when the 6-arg factory |
| 141 | + returns, all six post-conditions (registered, owner, |
| 142 | + allowAnyDevice, requireTcbUpToDate, and the conditional |
| 143 | + device/compose-hash branches per spec §3.3) hold simultaneously |
| 144 | +- [x] `check_DeployAndRegisterApp5Arg_DefaultsTcbToFalse` — same shape |
| 145 | + as above, plus `requireTcbUpToDate == false` |
| 146 | +- [x] `check_Owner_NotChangedByKmsFunctions` — INV-2 inductive step: |
| 147 | + no call to any of DstackKms's own mutating functions (symbolic |
| 148 | + selector + args + caller, against the proxy) changes `owner()`; |
| 149 | + the inherited OZ ownership functions are excluded (upstream- |
| 150 | + tested). Mutation-tested: catches owner-seizure in permissionless |
| 151 | + functions (incl. via the indirect `deployAndRegisterApp` → |
| 152 | + `registerApp` path) and owner-writes in owner-only functions. |
| 153 | + |
| 154 | +### What we deliberately do not verify here |
| 155 | + |
| 156 | +- **Owner-gated mutations.** `OwnableUpgradeable.onlyOwner` is upstream- |
| 157 | + tested; the spec records the precondition (§3.7, §3.11) and trusts it. |
| 158 | +- **TCB byte-exactness via symbolic strings.** Halmos models `keccak256` |
| 159 | + as an uninterpreted function, so a check of |
| 160 | + `allowed == (keccak(x) == keccak("UpToDate"))` against code that |
| 161 | + computes `allowed = (keccak(x) == keccak("UpToDate"))` is circular. |
| 162 | + The byte-exact-under-collision-resistance argument lives in the spec |
| 163 | + (§3.9) where it can be honest about being an assumption rather than |
| 164 | + a proof. |
| 165 | +- **Adversarial mock OOG / malformed-returndata paths.** OOG propagates |
| 166 | + to the outer call by EVM semantics. Misshapen returndata would |
| 167 | + trigger Solidity's strict ABI decoder revert; spec §6.5 calls this |
| 168 | + out as a useful next mock variant. |
| 169 | +- **Universal quantification over the registered `IAppAuth` contract.** |
| 170 | + Halmos instantiates one mock at a time; it does not quantify over |
| 171 | + arbitrary contracts. The two mock-driven tests bound the relevant |
| 172 | + shapes (faithful return; revert propagation), and the spec is |
| 173 | + explicit (§3.5) about treating the registered contract's output as |
| 174 | + untrusted downstream. |
| 175 | +- **Full cross-transaction monotonicity (INV-1) over arbitrary |
| 176 | + pre-states.** `check_UpgradesDisabled_StepPreservation` proves the |
| 177 | + inductive *step* only from the canonical disabled state, over the |
| 178 | + enumerated mutating surface. A complete proof would symbolically |
| 179 | + quantify the pre-state (Halmos 0.3.3 has no `--symbolic-storage`) |
| 180 | + and enumerate the surface programmatically rather than by hand. |
| 181 | + The residual risk is closed by source inspection — see spec §4 INV-1 |
| 182 | + (only two writers to the slot; the initializer path is closed by |
| 183 | + `check_Initialize_OnceOnly`). An earlier attempt to mechanize this |
| 184 | + with Halmos invariant-mode (`--invariant-depth`) was discarded: the |
| 185 | + auto-target fuzzer drove the implementation contract while the |
| 186 | + assertion read the proxy, so it passed vacuously (a flag-reset |
| 187 | + mutant was not caught). The step-test formulation reads and writes |
| 188 | + the same proxy storage and does catch such mutants. |
| 189 | +- **Full cross-transaction monotonicity (INV-2) over arbitrary |
| 190 | + pre-states.** Like INV-1, `check_Owner_NotChangedByKmsFunctions` |
| 191 | + proves the inductive step from the canonical post-init state, not |
| 192 | + over arbitrary pre-states. The two-step ownership *behaviour* itself |
| 193 | + (stage / accept / reject / re-target) is covered by unit tests in |
| 194 | + `DstackKms.t.sol` / `DstackApp.t.sol`. |
| 195 | +- **INV-6 (`__gap` zeroes).** Gap; would need the same step-test or |
| 196 | + symbolic-storage treatment. |
| 197 | + |
| 198 | +### Findings from the initial run |
| 199 | + |
| 200 | +- **`DstackKms.registerApp` is intentionally permissionless** (confirmed |
| 201 | + by the dstack team). The Halmos counterexample on an earlier |
| 202 | + `_OnlyOwner` test reflected design, not a bug: any non-zero address |
| 203 | + can be registered by anyone. Authorization is gated downstream by the |
| 204 | + owner-controlled `allowedOsImages` whitelist and the registered app's |
| 205 | + own `isAppAllowed`. The natspec on `registerApp` documents this; |
| 206 | + `check_RegisterApp_AnyCallerCanRegisterNonZeroAddress` codifies it. |
| 207 | + Crucially, `check_IsAppAllowed_DelegatesFaithfully` proves that even |
| 208 | + an adversarial registered contract is consulted only after the |
| 209 | + owner-controlled OS-image gate passes — registration alone confers |
| 210 | + no privilege. |
| 211 | + |
| 212 | +- **Two-step ownership transfer adopted.** Both contracts now inherit |
| 213 | + `Ownable2StepUpgradeable` instead of `OwnableUpgradeable`. The new |
| 214 | + storage uses ERC-7201 namespaced slots, so existing UUPS proxies can |
| 215 | + be safely upgraded to the new impl (no slot collision; the pending- |
| 216 | + owner slot is zero-initialized on first upgrade). `transferOwnership` |
| 217 | + no longer immediately transfers — the proposed owner must call |
| 218 | + `acceptOwnership` to complete the transfer, eliminating the typo-bricks- |
| 219 | + contract risk on the single-step variant. |
| 220 | + |
| 221 | +**Effort:** ~1 focused day. **Owner:** _tbd_. |
| 222 | + |
| 223 | +## Layer 3 — Certora (deferred) |
| 224 | + |
| 225 | +Deferred until a security review budget exists. CVL specs are 3-5× the size of |
| 226 | +Halmos tests, require team licenses, and overlap with Halmos coverage for |
| 227 | +authorization-style properties. |
| 228 | + |
| 229 | +If we revisit, target the same invariants as Layer 2 but add: |
| 230 | + |
| 231 | +- Storage-layout safety across upgrades (especially the |
| 232 | + `@custom:oz-renamed-from tproxyAppId` rename — Certora's storage diff is |
| 233 | + stronger than the OZ Foundry plugin's) |
| 234 | +- Cross-contract invariants spanning `DstackKms` ↔ `DstackApp` |
| 235 | + |
| 236 | +## What's intentionally not in scope |
| 237 | + |
| 238 | +- **CI gating of Slither / Halmos.** Symbolic execution is slow, solver- |
| 239 | + version-sensitive, and produces non-actionable noise as a PR gate. |
| 240 | + Run locally during development; treat these as design checks rather |
| 241 | + than blocking lints. |
| 242 | +- **Echidna** — overlaps with Foundry's built-in fuzzer (already runs 10k |
| 243 | + iterations under `FOUNDRY_PROFILE=ci` in `.github/workflows/foundry-test.yml`). |
| 244 | +- **Manticore / Mythril** — bytecode-level tools, slow, awkward with our |
| 245 | + forge artifacts. |
| 246 | +- **Scribble** — runtime assertions, redundant with our `assert` + Foundry |
| 247 | + test approach. |
| 248 | + |
| 249 | +## Status |
| 250 | + |
| 251 | +| Layer | Status | PR | |
| 252 | +|-------|--------|----| |
| 253 | +| 0. Slither | done (0 findings) | this branch | |
| 254 | +| 1. SMTChecker | deferred (see above) | — | |
| 255 | +| 2. Halmos symbolic (DstackApp) | done (5 properties, incl. INV-1 step) | this branch | |
| 256 | +| 2. Halmos symbolic (DstackKms) | done (11 properties, incl. INV-2 step) | this branch | |
| 257 | +| 3. Certora | deferred | — | |
| 258 | + |
| 259 | +Update this table as PRs land. |
0 commit comments