fix: multi-workspace correctness (desktop/API scoping, purge, fail-closed auth)#67
fix: multi-workspace correctness (desktop/API scoping, purge, fail-closed auth)#67zm2231 wants to merge 7 commits into
Conversation
Desktop sync and watch silently scoped to the config default workspace_id, so a multi-workspace user only ingested one signed-in workspace. Desktop now ignores the config default unless --workspace is explicit; API and read paths unchanged. Adds regression coverage in app_test.go.
sync --source all routed the config/default workspace into desktop ingest, and API sync collapsed multi-workspace fan-out when workspace_id was set or derived. Now only an explicit --workspace scopes desktop ingest or narrows API sync; otherwise API fans out across all configured workspaces and desktop ingests all.
…spaces for global Without --workspace, purge deleted across all workspaces and wrote a global retention floor. Purge now defaults to the configured workspace_id; global purge is explicit via --all-workspaces, which cannot be combined with --workspace.
…udes A name-based exclude caused channels whose names weren't resolved from Redux to be dropped. Unknown-name channels are now kept unless the channel ID itself matches an exclude selector.
API sync overwrote the authenticated team id with the requested workspace without verifying they match, so a global-fallback token could sync one workspace and label its rows as another during multi-workspace fan-out. Sync now errors when the requested workspace and authenticated team differ (bot and tail paths); fan-out reports the offending workspace and never mislabels others. Adds a hermetic fan-out integration test with an injected Slack API.
Workspace-scoped purge selected and deleted by (channel_id, ts) only, relying on that pair being globally unique across workspaces. Selection and deletion now carry workspace_id end to end; global purge and retention floors are unchanged. Documents the message-identity invariant in docs/retention.md.
|
Codex review: needs maintainer review before merge. Reviewed June 24, 2026, 4:22 PM ET / 20:22 UTC. Summary Reproducibility: yes. at source level: current main coalesces cfg.WorkspaceID into sync options and lets opts.WorkspaceID overwrite auth.TeamID. The contributor also posted redacted live before/after output, though this read-only review did not run live Slack Desktop/API reproduction. Review metrics: 3 noteworthy metrics.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Merge the correctness fixes after maintainers explicitly accept or document the desktop scoping and fail-closed auth upgrade behavior while keeping purge archive-wide by default. Do we have a high-confidence way to reproduce the issue? Yes, at source level: current main coalesces cfg.WorkspaceID into sync options and lets opts.WorkspaceID overwrite auth.TeamID. The contributor also posted redacted live before/after output, though this read-only review did not run live Slack Desktop/API reproduction. Is this the best way to solve the issue? Mostly yes: the patch directly fixes the data-integrity paths and reverted the destructive purge default change. The remaining question is maintainer acceptance of the compatibility-sensitive default/fallback changes. AGENTS.md: not found in the target repository. Codex review notes: model internal, reasoning high; reviewed against 905ac75f8337. Label changesLabel justifications:
Evidence reviewedWhat I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
Reverts the purge default-scope change. Without --workspace, purge stays archive-wide as before; --workspace scopes to one workspace. The default-scope policy change is left for a separate maintainer-sponsored discussion. The store-level workspace_id key qualification is unchanged.
|
Thanks for the review. Addressed both blockers. Purge default scope (compatibility). You are right, that was a silent behavior change to a destructive command, and the upgrade policy is a maintainer call, not mine to bundle in here. I reverted it. Without Fail-closed auth. This one is intentional and I would like to keep it. The prior code overwrote the authenticated team ID with the requested workspace without checking they matched, so a global fallback token could write one workspace's messages under a different workspace's ID. A hard error is the right failure mode for an identity mismatch. During fan-out it fails only the offending target and names it, so the other workspaces still sync. Operator action when it fires: set the per-workspace token env ( Real behavior proof (redacted)Run against disposable DB copies, workspace IDs redacted as 1. Desktop ingest, before vs after. Same config, same desktop cache. 2. Fail-closed auth. Bot token authenticates to 3. Purge scope. Copy of the three-workspace DB. The hermetic tests stay in as regression cover, including
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. |
What this fixes
A set of multi-workspace bugs in slacrawl. The common cause: the CLI's default or derived
workspace_idquietly acts as a filter when it shouldn't. I hit this mirroring three signed-in Slack workspaces from the desktop source.Concrete repro. Config has
workspace_idset, three workspaces signed into Slack Desktop.sync --source desktoppulled in only the configured workspace and dropped the other two.storage/root-state.jsonlisted all three underworkspaces. The DB had one. Nothing printed.Root cause
runSyncpassescoalesce(*workspaceID, cfg.WorkspaceID)into the sync options, andcfg.WorkspaceIDgets auto-filled fromDefaultWorkspaceID()(internal/config/config.go:295-298,:350-361). That default then flows into the desktop ingest filter and the API target resolver, so a workspace you never asked to scope to ends up narrowing the sync. The same path let a global fallback token write one workspace's messages under a different workspace's ID during fan-out.The six fixes
Desktop scoping.
sync --source desktop,sync --source all, andwatchno longer inherit the config default. Only an explicit--workspacescopes desktop ingest. NewdesktopOptionsForSourceAllininternal/syncer/syncer.go, withWorkspaceSetthreaded from the CLI.Repro: config
workspace_id=A, B and C also signed in. Before, only A ingests. After, A, B, C all ingest.API fan-out. Without
--workspace, API sync now runs every configured[[workspaces]]entry, which is whatdocs/configuration.mdalready promised. A set or derivedworkspace_idused to collapse that down to one.Fail-closed auth. This is the dangerous one.
internal/slackapi/api.gooverwrote the authenticatedauth.TeamIDwith the requested workspace and never checked the two matched. Pair that with the global token fallback inresolveWorkspaceToken, and a fan-out over a workspace missing its token env authenticates as the global team and stores those messages under the requested workspace's ID. Sync now returns an error when the requested workspace and the authenticated team differ, on both the bot and tail paths. Fan-out names the workspace that failed and leaves the rest alone.Repro:
[[workspaces]]A and B, unset B's token env. Before, A's data lands labeled as B. After,sync workspace B: ...auth mismatch error and zero mislabeled rows.Purge keys. Scoped (
--workspace) selection and deletion now carryworkspace_idend to end (internal/store/purge.go) instead of leaning on(channel_id, ts)alone. The purge default is unchanged: without--workspace, purge stays archive-wide. (An earlier version of this PR also changed that default to the configured workspace; that change was reverted after review since the upgrade policy for a destructive command is a maintainer decision.)Desktop name-excludes. A channel whose name didn't resolve from Redux used to get dropped whenever any name-based exclude was set. Those stay in now. Explicit channel-ID matches still exclude.
Tests
One regression test per fix, all hermetic (no network, no writes to
~/.slacrawl):TestDesktopSyncDoesNotInheritDefaultWorkspaceFilter,TestDesktopOptionsForSourceAllClearsInheritedWorkspaceTestRunSyncTargetsUsesWorkspaceTokensAndRejectsMismatchedAuthinjects a fake Slack API and proves distinct tokens map to distinct workspaces and a mismatched token gets rejected with nothing mislabeledTestPurgeMessagesWorkspaceScopeKeepsOtherWorkspaceRowsTestIngestDesktopStateAllowsUnknownChannelsForNameExcludes,TestIngestReduxStatesAllowsUnknownChannelsForNameExcludesWithWorkspaceFilterOne thing I left alone
messagesis keyed by(channel_id, ts)andchannels/usersbyid, which assumes Slack IDs are globally unique across workspaces. That holds everywhere except Slack Connect channels shared between two of your own workspaces. I documented it indocs/retention.mdand kept the schema as is, since a key migration is a bigger call than this PR should make.