Skip to content

[trading-native][storage-core] Commit applier + JMT pipeline + AptosDB wiring#19919

Merged
grao1991 merged 1 commit into
mainfrom
native_position_storage_core
Jun 2, 2026
Merged

[trading-native][storage-core] Commit applier + JMT pipeline + AptosDB wiring#19919
grao1991 merged 1 commit into
mainfrom
native_position_storage_core

Conversation

@grao1991

@grao1991 grao1991 commented May 28, 2026

Copy link
Copy Markdown
Contributor

Adds the durable-commit applier, the two-stage JMT commit pipeline, and the AptosDB integration for native positions. Builds on the durable storage tier in PR #19883 and the shared primitives in PRs #19894 + #19908.

The in-memory scanner mirror (the NativeStateStore DashMap and its readers) is layered on top in a later commit; the durable + JMT side here stands on its own.

Commit applier

  • NativeStateCommitter::apply: per-shard SchemaBatch fan-out against position_db for every Position write, returning NativeMerkleLeafUpdates (MerkleLeafUpdate per write) for the JMT pipeline. Stale-index entries for prior versions are written into the same batch so the pruner can drop them later.

Commit pipeline

  • PositionBufferedState — alias of the shared BufferedState<L, S, P, X> generic with position's pipeline parameters and a no-op PositionExtras. Constructed via spawn_commit_pipeline.
  • PositionSnapshotCommitter driver (merklize_position): pulls PositionSnapshotToCommit payloads, runs the snapshot through ShardedJmtMerkleDb::merklize_snapshot, forwards the resulting PositionMerkleCommit downstream.
  • PositionMerkleBatchCommitter: pulls PositionMerkleCommits and persists per-shard JMT batches + top-levels batch to position_merkle_db.

State-shape primitives

Adds storage/storage-interface/src/state_store/sharded_jmt_state.rs with the JMT-sharded pipeline primitives that aren't main-state shared: LeafSlot<V> + impl LeafEntry for LeafSlot<V>, ShardedJmtState<Slot> (16 MapLayer chains + extend / make_delta / is_descendant_of), the inherent impl on StateAndSummary<ShardedJmtState<Slot: LeafEntry>> (new_empty, extend, make_delta, etc.), and pre_shard_jmt_updates.

Also adds two StateSummary constructors that only JMT-sharded pipelines without a hot companion need: new_global_only(version, smt) and new_empty_global_only().

In the aptos-db crate:

  • PositionSlot — type alias of LeafSlot<()> (value not carried in-slot; lives in position_db durably).
  • PositionStateStore — owner / coordinator, type alias of the shared PipelineStateStore<L, BS> (also added in this commit, in crate::common). Wraps the shared current_state mutex (readable by outside consumers) and the PositionBufferedState mutex (commit-path serialization).
  • PositionDb gains the commit / commit_single_shard / write_progress / find_prior_version methods that the committer uses.

AptosDB wiring

  • AptosDB gains optional handles for PositionDb + PositionMerkleDb + PositionStateStore. Accessed via native_position_handles / native_state_committer.
  • init_native_position: opens 16 sharded position DBs + unsharded merkle DB with production CF tuning. Spawns the async commit pipeline seeded with the empty-tree placeholder root. Idempotent (rejects second call). Logs the open.
  • commit_native_position (in aptosdb_writer.rs): rayon-spawned alongside main-state commit; drains output.write_set().native_position_iter() (the WriteSet sibling bucket from [types]) through NativeStateCommitter, feeding the resulting MerkleLeafUpdates into the position JMT pipeline.

Note

High Risk
Introduces a new ledger commit sub-path and dual-DB crash/truncation/replay logic for trading positions; mitigated while ENABLE_TRADING_NATIVE remains false.

Overview
Adds native trading position persistence and commit integration in aptos-db, wired like main state but on a separate position_db / position_merkle_db pair and WriteSet::native_position_iter().

Durable KV path: NativeStateCommitter::apply batches per-shard PositionValue writes and stale-index rows (with in-chunk prior-version chaining), then PositionDb::commit once per ledger chunk. PositionDb gains find_prior_version, progress metadata, and truncation helpers aligned with chain OverallCommitProgress.

In-memory + JMT path: New ShardedJmtState / LeafSlot types and StateSummary::new_global_only support a position-specific BufferedState pipeline (merklize_position → async PositionMerkleBatchCommitter). PipelineStateStore coordinates shared current_state vs commit-path buffered_state.

AptosDB: Optional PositionBundle on open when ENABLE_TRADING_NATIVE is true (currently false). init_native_position opens DBs, truncates ahead-of-chain data on restart, replays gap write sets, and seeds the pipeline. commit_native_position runs in parallel during pre-commit: applies native writes, extends SMT at checkpoints, commits KV, updates buffered state. Checkpoints now include position DBs.

Unit tests cover stale-index semantics, tombstones, truncation, and in-chunk overwrite chaining.

Reviewed by Cursor Bugbot for commit 7f3172c. Bugbot is set up for automated code reviews on this repo. Configure here.

@grao1991 grao1991 requested a review from lightmark as a code owner May 28, 2026 20:20
Comment thread storage/aptosdb/src/position_snapshot_committer.rs Outdated
Comment thread storage/aptosdb/src/native_state_store.rs Outdated
@grao1991 grao1991 force-pushed the native_position_shared_pipeline branch 3 times, most recently from 387c1f9 to d9994e0 Compare May 29, 2026 00:25
@grao1991 grao1991 force-pushed the native_position_storage_core branch from 77c3346 to d38edd8 Compare May 29, 2026 02:39
Comment thread storage/aptosdb/src/schema/db_metadata/mod.rs
@grao1991 grao1991 changed the title [trading-native][storage-core] In-memory store + commit applier + JMT committer [trading-native][storage-core] Commit applier + JMT pipeline May 29, 2026
@grao1991 grao1991 force-pushed the native_position_storage_core branch 2 times, most recently from 35d7ac3 to 1399952 Compare May 29, 2026 03:31
@grao1991 grao1991 changed the title [trading-native][storage-core] Commit applier + JMT pipeline [trading-native][storage-core] Commit applier + JMT pipeline + AptosDB wiring May 29, 2026
Comment thread storage/aptosdb/src/utils/truncation_helper.rs
@grao1991 grao1991 force-pushed the native_position_storage_core branch from 1399952 to f9d9a0e Compare May 29, 2026 04:43
@grao1991 grao1991 force-pushed the native_position_shared_pipeline branch from d9994e0 to 65cd867 Compare May 29, 2026 18:36
@grao1991 grao1991 force-pushed the native_position_storage_core branch 2 times, most recently from 10f7a2f to 7268704 Compare May 29, 2026 19:23
Comment thread storage/aptosdb/src/db/mod.rs Outdated
@grao1991 grao1991 force-pushed the native_position_shared_pipeline branch from 65cd867 to c6e532b Compare May 29, 2026 20:23
@grao1991 grao1991 force-pushed the native_position_storage_core branch from 7268704 to 3f1372c Compare May 29, 2026 20:23
Comment thread storage/aptosdb/src/utils/truncation_helper.rs Outdated
@grao1991 grao1991 force-pushed the native_position_storage_core branch from 3f1372c to 808aecf Compare May 29, 2026 21:27
Comment thread storage/aptosdb/src/position_buffered_state.rs
@grao1991 grao1991 force-pushed the native_position_storage_core branch 3 times, most recently from f50f56b to 9cdfed0 Compare May 29, 2026 22:11
}
chain_state = Some(new_latest);
}
version += 1;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per-version DB commit writes all 16 shards unnecessarily

Medium Severity

NativeStateCommitter::apply is called once per transaction output inside the chunk loop, and each call invokes PositionDb::commit, which spawns 16 rayon tasks to write progress markers to ALL shards—even the 15+ shards that received no data for that version. For a chunk of N transactions, this produces N × 18 DB writes (16 shards + metadata + progress). Main-state batches the entire chunk into a single commit. The per-version commit IS required for find_prior_version correctness, but writing progress to all 16 empty shards each time is wasteful and could cause significant I/O amplification on the commit path.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f50f56b. Configure here.

batch.delete::<StalePositionValueIndexSchema>(&index)?;
}
Ok(())
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Truncation scans entire shard instead of seeking efficiently

Low Severity

delete_position_value_and_index calls iter.seek_to_first() and scans every PositionValueSchema row in the shard, only deleting entries with version >= start_version. Because the key uses bit-inverted version (!version), rows aren't physically sorted by version across key hashes, making a targeted seek impossible with this schema. However, this full-table scan could be very slow for large position DBs on crash-recovery startup. The stale-index iteration below it is efficient, but the value scan is not.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f50f56b. Configure here.

Comment thread storage/aptosdb/src/position_metrics.rs Outdated
@grao1991 grao1991 force-pushed the native_position_storage_core branch from 9cdfed0 to 9b465c2 Compare May 29, 2026 22:31
@grao1991 grao1991 force-pushed the native_position_shared_pipeline branch from c6e532b to 8297de8 Compare May 29, 2026 22:31
Comment thread storage/aptosdb/src/db/aptosdb_writer.rs Outdated
Comment thread storage/aptosdb/src/db/aptosdb_native_position.rs Outdated
Comment thread storage/aptosdb/src/utils/truncation_helper.rs
@grao1991 grao1991 force-pushed the native_position_storage_core branch 6 times, most recently from 12b5c63 to 57290e6 Compare June 1, 2026 16:30
@grao1991 grao1991 requested review from wqfish and zekun000 June 1, 2026 17:06
@grao1991 grao1991 force-pushed the native_position_storage_core branch 2 times, most recently from ae1ef62 to 90a8639 Compare June 1, 2026 17:40
Comment thread storage/aptosdb/src/db/aptosdb_native_position.rs Outdated
Comment thread storage/aptosdb/src/utils/truncation_helper.rs
@grao1991 grao1991 force-pushed the native_position_storage_core branch 4 times, most recently from 255ccb6 to 60f285d Compare June 1, 2026 18:35
…B wiring

Adds the durable-commit applier, the two-stage JMT commit pipeline,
and the AptosDB integration for native positions. Builds on the
durable storage tier in `[storage-core-db]` and the shared primitives
in `[shared] LeafEntry trait + StateSummary Option` / `[shared] Lift
async-commit pipeline`.

The in-memory scanner mirror (the `NativeStateStore` DashMap and
its readers) is layered on top in a later `[in-memory-store]`
commit; the durable + JMT side here stands on its own.

Commit applier:
- `NativeStateCommitter::apply`: per-shard `SchemaBatch` fan-out
  against `position_db` for every Position write, returning
  `NativeMerkleLeafUpdates` (`MerkleLeafUpdate` per write) for
  the JMT pipeline. Stale-index entries for prior versions are
  written into the same batch so the pruner can drop them later.

Commit pipeline:
- `PositionBufferedState` — alias of the shared
  `BufferedState<L, S, P, X>` generic with position's pipeline
  parameters and a no-op `PositionExtras`. Constructed via
  `spawn_commit_pipeline`.
- `PositionSnapshotCommitter` driver (`merklize_position`): pulls
  `PositionSnapshotToCommit` payloads, runs the snapshot through
  `ShardedJmtMerkleDb::merklize_snapshot`, forwards the resulting
  `PositionMerkleCommit` downstream.
- `PositionMerkleBatchCommitter`: pulls `PositionMerkleCommit`s and
  persists per-shard JMT batches + top-levels batch to
  `position_merkle_db`.

State-shape primitives in `storage-interface`:
`storage/storage-interface/src/state_store/sharded_jmt_state.rs`
with the JMT-sharded pipeline primitives that aren't main-state
shared: `LeafSlot<V>` + `impl LeafEntry for LeafSlot<V>`,
`ShardedJmtState<Slot>` (16 `MapLayer` chains + `extend` /
`make_delta` / `is_descendant_of`), the inherent impl on
`StateAndSummary<ShardedJmtState<Slot: LeafEntry>>`
(`new_empty`, `extend`, `make_delta`, etc.), and
`pre_shard_jmt_updates`. Adds two `StateSummary` constructors
that only JMT-sharded pipelines without a hot companion need:
`new_global_only(version, smt)` and `new_empty_global_only()`.

In-aptosdb crate:
- `PositionSlot` — type alias of `LeafSlot<()>` (value not carried
  in-slot; lives in `position_db` durably).
- `PositionStateStore` — owner / coordinator, type alias of the
  shared `PipelineStateStore<L, BS>` (also added in this commit,
  in `crate::common`). Wraps the shared `current_state` mutex
  (readable by outside consumers) and the `PositionBufferedState`
  mutex (commit-path serialization).
- `PositionDb` gains the `commit` / `commit_single_shard` /
  `write_progress` / `find_prior_version` methods that the
  committer uses.

AptosDB wiring:
- `AptosDB` gains optional handles for `PositionDb` +
  `PositionMerkleDb` + `PositionStateStore`. Accessed by the rest
  of the codebase via `native_position_handles` /
  `native_state_committer`.
- `init_native_position`: opens 16 sharded position DBs +
  unsharded merkle DB with production CF tuning. Spawns the
  async commit pipeline seeded with the empty-tree placeholder
  root. Idempotent (rejects second call). Logs the open.
- `commit_native_position` (in `aptosdb_writer.rs`): rayon-spawned
  alongside main-state commit; drains
  `output.write_set().native_position_iter()` (the WriteSet
  sibling bucket from [types]) through `NativeStateCommitter`,
  feeding the resulting `MerkleLeafUpdate`s into the position
  JMT pipeline.

Cargo.toml deps + lib.rs mod exports for the new files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 7f3172c. Configure here.

});
});
}
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write amplification: every shard written on every commit

Medium Severity

PositionDb::commit calls commit_single_shard for all 16 shards on every chunk commit, even when batch_opt is None. Each call writes a PositionShardCommitProgress marker via unwrap_or_default(). This means 16 RocksDB writes per chunk purely for progress markers on shards with no actual position data, causing unnecessary write amplification on every block.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 7f3172c. Configure here.

@grao1991 grao1991 enabled auto-merge (squash) June 2, 2026 02:14
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

✅ Forge suite compat success on 519f9927274a9f0da56887013c25bae8fcfea048 ==> 7f3172caa3e4ae903b964a44a918976cc221a9ba

Compatibility test results for 519f9927274a9f0da56887013c25bae8fcfea048 ==> 7f3172caa3e4ae903b964a44a918976cc221a9ba (PR)
1. Check liveness of validators at old version: 519f9927274a9f0da56887013c25bae8fcfea048
compatibility::simple-validator-upgrade::liveness-check : committed: 15214.35 txn/s, latency: 2275.41 ms, (p50: 2400 ms, p70: 2500, p90: 2800 ms, p99: 3300 ms), latency samples: 500860
2. Upgrading first Validator to new version: 7f3172caa3e4ae903b964a44a918976cc221a9ba
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6268.98 txn/s, latency: 5375.98 ms, (p50: 5900 ms, p70: 6100, p90: 6200 ms, p99: 6400 ms), latency samples: 215480
3. Upgrading rest of first batch to new version: 7f3172caa3e4ae903b964a44a918976cc221a9ba
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6202.10 txn/s, latency: 5451.02 ms, (p50: 6000 ms, p70: 6100, p90: 6100 ms, p99: 6300 ms), latency samples: 216500
4. upgrading second batch to new version: 7f3172caa3e4ae903b964a44a918976cc221a9ba
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 9984.67 txn/s, latency: 3354.98 ms, (p50: 3600 ms, p70: 3700, p90: 3900 ms, p99: 4100 ms), latency samples: 326420
5. check swarm health
Compatibility test for 519f9927274a9f0da56887013c25bae8fcfea048 ==> 7f3172caa3e4ae903b964a44a918976cc221a9ba passed
Test Ok

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

✅ Forge suite realistic_env_max_load success on 7f3172caa3e4ae903b964a44a918976cc221a9ba

two traffics test: inner traffic : committed: 15903.69 txn/s, latency: 1075.69 ms, (p50: 1000 ms, p70: 1100, p90: 1200 ms, p99: 1600 ms), latency samples: 5939200
two traffics test : committed: 99.99 txn/s, latency: 803.58 ms, (p50: 800 ms, p70: 900, p90: 900 ms, p99: 1200 ms), latency samples: 1680
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 0.281, avg: 0.260", "ConsensusProposalToOrdered: max: 0.113, avg: 0.107", "ConsensusOrderedToCommit: max: 0.185, avg: 0.161", "ConsensusProposalToCommit: max: 0.284, avg: 0.268"]
Max non-epoch-change gap was: 2 rounds at version 30317 (avg 0.00) [limit 4], 1.12s no progress at version 30317 (avg 0.06s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.24s no progress at version 2733353 (avg 0.24s) [limit 16].
Test Ok

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

✅ Forge suite framework_upgrade success on 519f9927274a9f0da56887013c25bae8fcfea048 ==> 7f3172caa3e4ae903b964a44a918976cc221a9ba

Compatibility test results for 519f9927274a9f0da56887013c25bae8fcfea048 ==> 7f3172caa3e4ae903b964a44a918976cc221a9ba (PR)
Upgrade the nodes to version: 7f3172caa3e4ae903b964a44a918976cc221a9ba
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2446.73 txn/s, submitted: 2451.86 txn/s, failed submission: 5.13 txn/s, expired: 5.13 txn/s, latency: 1193.67 ms, (p50: 1200 ms, p70: 1200, p90: 1600 ms, p99: 2400 ms), latency samples: 219201
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2302.79 txn/s, submitted: 2310.24 txn/s, failed submission: 7.45 txn/s, expired: 7.45 txn/s, latency: 1214.15 ms, (p50: 1100 ms, p70: 1200, p90: 2000 ms, p99: 2500 ms), latency samples: 210300
5. check swarm health
Compatibility test for 519f9927274a9f0da56887013c25bae8fcfea048 ==> 7f3172caa3e4ae903b964a44a918976cc221a9ba passed
Upgrade the remaining nodes to version: 7f3172caa3e4ae903b964a44a918976cc221a9ba
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2360.30 txn/s, submitted: 2370.28 txn/s, failed submission: 9.98 txn/s, expired: 9.98 txn/s, latency: 1192.16 ms, (p50: 1200 ms, p70: 1200, p90: 1800 ms, p99: 2400 ms), latency samples: 212901
Test Ok

@grao1991 grao1991 merged commit e93ea2a into main Jun 2, 2026
128 checks passed
Comment on lines +88 to +90
// The native-position DB isn't open here; it attaches later
// via `AptosDB::init_native_position`, which re-binds the
// pruner via `LedgerPrunerManager::attach_native_pruners`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't it open here? The comment should say the reason instead of the fact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants