Add SHA-1 sync compare mode #92

Open
goanpeca wants to merge 40 commits into main from fix/29-sha1-compare-mode

@goanpeca goanpeca commented Jun 19, 2026

Contributor

Summary

  • Add sha1 to CompareMode and SyncPath.contentSha1 for content-hash comparison.
  • Prepare SHA-1 comparisons with bounded/chunked concurrency, size short-circuits, runtime compare-mode validation, sync concurrency normalization, and clearer aggregate sync errors.
  • Hash local files through a dedicated SHA-1 reader with abort support, a default idle/no-progress timeout, non-regular/symlink rejection, scanned-size bounds, sanitized per-file errors, and string | null reader contracts.
  • Treat B2 SHA-1 metadata conservatively: preserve explicit null, classify invalid strings as untrusted, treat multipart fileInfo.large_file_sha1 as untrusted, verify untrusted B2 bytes by selected fileId, and avoid treating forged or unverified metadata as successful equality.
  • Keep verified single-part B2 contentSha1 as equality material without full remote downloads; untrusted B2 verification uses a bounded idle/read timeout, surfaces per-file errors on stalls, and does not run during dry-runs.
  • Harden local-to-B2 uploads against scan-to-read races by verifying the scanned regular-file identity before upload, and run B2-to-local downloads by the selected file ID so the bytes written match the bytes compared.
  • Surface local/B2 scan failures through sync error events, honor aborts during scans between iterations/pages, pass B2 list abort signals through raw and bucket APIs, reject unsafe B2-to-local paths, avoid symlinked local destination parents, and prevent local scan errors from turning remote destination files into delete-mode orphans.
  • Bound queued transfer actions and pending action promises by sync concurrency instead of materializing the whole action set, expose SHA-1 local hashing as compare.bytesHashed, and expose B2 verification downloads as compare.bytesVerified while keeping compare.size stable at 0.
  • Centralize SHA-1 normalization across download/sync callers and document SHA-1 action preconditions, dry-run and hashing costs, public contentSha1 states, scan diagnostics, and upload retry risks.
  • Harden fresh-upload-URL retry behavior so lost 2xx upload response bodies are not retried by default unless callers opt in with retryResponseBodyFailures: true.
  • Multipart resume now paginates unfinished large-file discovery, reuses unfinished large-file IDs without trusting server part SHA-1 values by default, and retries opted-in lost 2xx upload responses when JSON parsing fails; callers must set trustServerPartSha1s: true for trusted-writer buckets to skip matching parts.
  • Validate SYNC_MODE and SYNC_CONCURRENCY in the example CLI and avoid real-example cleanup races by skipping fresh sdk-examples-* buckets from overlapping CI jobs.
  • Add regression coverage for SHA-1 preparation, B2 metadata forgery, selected-version verification, stalled B2 bodies, scanner failures/aborts, bounded transfers, local hashing failures, FIFO/symlink handling, upload retry defaults, and the restored coverage gate.
  • Resolve audited dev dependency advisories by overriding ws to 8.21.0 and markdown-it to 14.2.0 with exact lockfile pins.
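
The metadata-trust rules in this summary can be pictured with a small classifier. This is only a sketch under stated assumptions: `Sha1State` and `classifyB2Sha1` are hypothetical names rather than the SDK's API, and the `none` and `unverified:` values reflect how B2 reports `contentSha1` for large files and for uploads whose checksum the server did not verify.

```typescript
// Hypothetical sketch: these names are illustrative and not part of the SDK.
type Sha1State =
  | { kind: "verified"; hex: string } // usable as equality material
  | { kind: "untrusted"; raw: string } // needs byte-level verification first
  | { kind: "absent" }; // explicit null or no digest at all

const SHA1_HEX = /^[0-9a-f]{40}$/;

function classifyB2Sha1(
  contentSha1: string | null,
  largeFileSha1?: string,
): Sha1State {
  if (contentSha1 === null || contentSha1 === "none") {
    // Multipart uploads only carry a client-supplied fileInfo.large_file_sha1,
    // which the server does not check, so it is treated as untrusted.
    return largeFileSha1 === undefined
      ? { kind: "absent" }
      : { kind: "untrusted", raw: largeFileSha1 };
  }
  const normalized = contentSha1.trim().toLowerCase();
  // Anything that is not exactly 40 hex characters (for example an
  // "unverified:" prefixed value) is classified as untrusted.
  return SHA1_HEX.test(normalized)
    ? { kind: "verified", hex: normalized }
    : { kind: "untrusted", raw: contentSha1 };
}
```

Only the `verified` state would count as proof of equality in such a design; the other two states force either a verification download or a surfaced skip.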

Linked issue

Closes #29

Tests run

  • pnpm exec vitest run src/sync/sync.test.ts src/sync/synchronizer.test.ts src/sync/scanners/scanners.node.test.ts src/sync/compare.node.test.ts src/sync/local-sha1.node.test.ts
  • pnpm test
  • pnpm test:coverage
  • pnpm lint
  • pnpm lint:docs
  • pnpm lint:spelling
  • pnpm typecheck
  • pnpm typecheck:examples
  • pnpm build
  • pnpm run docs
  • pnpm run verify:metadata
  • pnpm run verify:exports
  • pnpm audit --prod=false
  • gh pr checks 92 --repo backblaze-labs/b2-sdk-typescript

Follow-up notes

  • The multipart resume trust change remains in this PR because this review cycle addressed trust in B2-reported SHA-1 metadata across sync and upload-resume paths; keeping both changes together avoids shipping one SHA-1 trust-boundary fix without the other.
  • SHA-1 mode is documented as accidental drift detection, not a cryptographic tamper guarantee.
  • B2 SHA-1 metadata that could be forged or unverified no longer proves equality by itself.
  • SHA-1 dry-runs still hash matching-size local files, and changed uploads can read local bytes once for hashing and again for transfer.
  • Same-size files with no comparable remote SHA-1 are skipped in sha1 mode with a surfaced event; use modtime/size mode or ensure remote objects carry verifiable digests when that tradeoff is not acceptable.
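
The size short-circuit and the skip-with-event behavior described in these notes could look roughly like the following. It is a hedged sketch: `decideSha1`, `Sha1Decision`, and the decision strings are hypothetical, and whether the real code hashes the local file before or after checking the remote digest is an assumption made here for illustration.

```typescript
// Hypothetical sketch: names and decision strings are illustrative only.
type Sha1Decision = "transfer" | "skip-equal" | "skip-no-digest";

interface RemoteCandidate {
  size: number;
  /** Normalized 40-hex digest, or null when no trusted digest is available. */
  contentSha1: string | null;
}

function decideSha1(
  local: { size: number; hash: () => string },
  remote: RemoteCandidate,
): Sha1Decision {
  // Different sizes already prove different content, so no hashing is needed.
  if (local.size !== remote.size) return "transfer";
  // Same size but nothing trustworthy to compare against: skip and report.
  if (remote.contentSha1 === null) return "skip-no-digest";
  // Only at this point is the cost of reading the local file paid.
  return local.hash() === remote.contentSha1 ? "skip-equal" : "transfer";
}
```

A caller would map `skip-no-digest` to the surfaced event mentioned above, so the skip is visible rather than silent.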

Copilot AI review requested due to automatic review settings June 19, 2026 00:07
@goanpeca goanpeca added this to the v0.2.0 milestone Jun 19, 2026
@goanpeca goanpeca added the enhancement, area: sync, and priority: high labels Jun 19, 2026
@goanpeca goanpeca self-assigned this Jun 19, 2026

Copilot AI left a comment

Pull request overview

Adds a new sha1 compare mode to the sync engine so it can detect same-size/same-mtime content drift by comparing (normalized) SHA-1 values, hashing local files when needed. This extends the existing sync policy surface (CompareMode) and updates scanners, tests, and example documentation accordingly.
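
As a rough illustration of hashing local content with cancellation support, the sketch below hashes any async byte source with Node's `node:crypto`. The helper name `sha1OfChunks` is hypothetical and is not the PR's reader, which according to the PR description also adds an idle timeout, symlink and non-regular file rejection, and size bounds that are omitted here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper, not the SDK's reader: hashes any async byte source.
async function sha1OfChunks(
  chunks: AsyncIterable<Uint8Array>,
  signal?: AbortSignal,
): Promise<string> {
  const hash = createHash("sha1");
  for await (const chunk of chunks) {
    // Check before consuming each chunk so a cancelled sync stops hashing.
    if (signal?.aborted) throw new Error("SHA-1 hashing aborted");
    hash.update(chunk);
  }
  // Lowercase hex, the same shape as a normalized contentSha1 value.
  return hash.digest("hex");
}
```

In Node, a stream returned by `fs.createReadStream(path)` is itself an async iterable of buffers, so it can be passed as `chunks` without buffering the whole file in memory.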

Changes:

  • Extend CompareMode with 'sha1' and propagate SHA-1 metadata through SyncPath/B2 scanning.
  • Implement SHA-1-based comparison in filesAreDifferent, and hash local files during paired comparisons in synchronize.
  • Add regression tests for SHA-1 mode and update sync example docs/env var descriptions.
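
Extending a string-union mode usually goes hand in hand with runtime validation of user input such as environment variables. The following is a minimal sketch under assumptions: only `'sha1'` is confirmed by this PR, while the other mode names, the defaults, and the helper names `parseCompareMode` and `normalizeConcurrency` are illustrative.

```typescript
// Assumed mode set: only "sha1" is confirmed here; the others are illustrative.
const COMPARE_MODES = ["modtime", "size", "sha1"] as const;
type CompareMode = (typeof COMPARE_MODES)[number];

function parseCompareMode(
  raw: string | undefined,
  fallback: CompareMode = "modtime",
): CompareMode {
  if (raw === undefined || raw === "") return fallback;
  // Reject unknown strings at runtime instead of trusting a type cast.
  if (COMPARE_MODES.some((mode) => mode === raw)) return raw as CompareMode;
  throw new Error(
    `Unsupported compare mode "${raw}"; expected one of ${COMPARE_MODES.join(", ")}`,
  );
}

function normalizeConcurrency(raw: string | undefined, fallback = 4): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  // Only positive integers make sense as a worker count.
  if (!Number.isInteger(value) || value < 1) {
    throw new Error(`Concurrency must be a positive integer, got "${raw}"`);
  }
  return value;
}
```

An example CLI could call `parseCompareMode(process.env.SYNC_MODE)` and `normalizeConcurrency(process.env.SYNC_CONCURRENCY)` before starting a sync, so a typo fails fast instead of silently selecting a default.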

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/sync/types.ts | Adds sha1 to CompareMode and introduces optional contentSha1 on SyncPath. |
| src/sync/synchronizer.ts | Hashes local files for paired comparisons when compareMode === 'sha1'. |
| src/sync/synchronizer.test.ts | Adds integration coverage for sha1-mode upload/skip behavior. |
| src/sync/sync.test.ts | Adds unit coverage for SHA-1 compare semantics (match/diff/unavailable). |
| src/sync/scanners/b2.ts | Includes contentSha1 in scanned B2SyncPath entries. |
| src/sync/policies/compare.ts | Implements SHA-1 comparison with normalization/validation. |
| examples/README.md | Documents sha1 as a supported sync compare mode. |
| examples/node-sync-cli.ts | Updates usage comment to include sha1 mode. |

Comment thread src/sync/policies/compare.ts Outdated
Comment thread src/sync/synchronizer.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Comment thread src/sync/types.ts
Comment thread src/sync/policies/compare.ts Outdated
Comment thread src/download/checksum.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread src/sync/policies/compare.ts

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread src/sync/synchronizer.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread src/sync/synchronizer.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.

@goanpeca goanpeca marked this pull request as ready for review June 19, 2026 01:31

Copilot AI left a comment

Pull request overview

Copilot reviewed 31 out of 32 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread src/sync/scanners/local.ts

Copilot AI left a comment

Pull request overview

Copilot reviewed 31 out of 32 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread src/upload/large.ts Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated no new comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Copilot AI review requested due to automatic review settings June 20, 2026 15:05

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread src/sync/types.ts
Copilot AI review requested due to automatic review settings June 20, 2026 15:17

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread src/sync/scanners/b2.ts
Copilot AI review requested due to automatic review settings June 20, 2026 15:23

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread CHANGELOG.md

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • pnpm-lock.yaml: Generated file

Comment thread src/sync/policies/compare.ts

Development

Successfully merging this pull request may close these issues.

sync: no sha1 / content compare mode

2 participants