Skip to content

fix: multi-workspace correctness (desktop/API scoping, purge, fail-closed auth)#67

Open
zm2231 wants to merge 7 commits into
openclaw:mainfrom
zm2231:fix/desktop-workspace-filter
Open

fix: multi-workspace correctness (desktop/API scoping, purge, fail-closed auth)#67
zm2231 wants to merge 7 commits into
openclaw:mainfrom
zm2231:fix/desktop-workspace-filter

Conversation

@zm2231

@zm2231 zm2231 commented Jun 23, 2026

Copy link
Copy Markdown

What this fixes

A set of multi-workspace bugs in slacrawl. The common cause: the CLI's default or derived workspace_id quietly acts as a filter when it shouldn't. I hit this mirroring three signed-in Slack workspaces from the desktop source.

Concrete repro. Config has workspace_id set, three workspaces signed into Slack Desktop. sync --source desktop pulled in only the configured workspace and dropped the other two. storage/root-state.json listed all three under workspaces. The DB had one. Nothing printed.

Root cause

runSync passes coalesce(*workspaceID, cfg.WorkspaceID) into the sync options, and cfg.WorkspaceID gets auto-filled from DefaultWorkspaceID() (internal/config/config.go:295-298, :350-361). That default then flows into the desktop ingest filter and the API target resolver, so a workspace you never asked to scope to ends up narrowing the sync. The same path let a global fallback token write one workspace's messages under a different workspace's ID during fan-out.

The six fixes

  1. Desktop scoping. sync --source desktop, sync --source all, and watch no longer inherit the config default. Only an explicit --workspace scopes desktop ingest. New desktopOptionsForSourceAll in internal/syncer/syncer.go, with WorkspaceSet threaded from the CLI.
    Repro: config workspace_id=A, B and C also signed in. Before, only A ingests. After, A, B, C all ingest.

  2. API fan-out. Without --workspace, API sync now runs every configured [[workspaces]] entry, which is what docs/configuration.md already promised. A set or derived workspace_id used to collapse that down to one.

  3. Fail-closed auth. This is the dangerous one. internal/slackapi/api.go overwrote the authenticated auth.TeamID with the requested workspace and never checked the two matched. Pair that with the global token fallback in resolveWorkspaceToken, and a fan-out over a workspace missing its token env authenticates as the global team and stores those messages under the requested workspace's ID. Sync now returns an error when the requested workspace and the authenticated team differ, on both the bot and tail paths. Fan-out names the workspace that failed and leaves the rest alone.
    Repro: [[workspaces]] A and B, unset B's token env. Before, A's data lands labeled as B. After, sync workspace B: ... auth mismatch error and zero mislabeled rows.

  4. Purge keys. Scoped (--workspace) selection and deletion now carry workspace_id end to end (internal/store/purge.go) instead of leaning on (channel_id, ts) alone. The purge default is unchanged: without --workspace, purge stays archive-wide. (An earlier version of this PR also changed that default to the configured workspace; that change was reverted after review since the upgrade policy for a destructive command is a maintainer decision.)

  5. Desktop name-excludes. A channel whose name didn't resolve from Redux used to get dropped whenever any name-based exclude was set. Those stay in now. Explicit channel-ID matches still exclude.

Tests

One regression test per fix, all hermetic (no network, no writes to ~/.slacrawl):

  • TestDesktopSyncDoesNotInheritDefaultWorkspaceFilter, TestDesktopOptionsForSourceAllClearsInheritedWorkspace
  • TestRunSyncTargetsUsesWorkspaceTokensAndRejectsMismatchedAuth injects a fake Slack API and proves distinct tokens map to distinct workspaces and a mismatched token gets rejected with nothing mislabeled
  • TestPurgeMessagesWorkspaceScopeKeepsOtherWorkspaceRows
  • TestIngestDesktopStateAllowsUnknownChannelsForNameExcludes, TestIngestReduxStatesAllowsUnknownChannelsForNameExcludesWithWorkspaceFilter
go test ./...        # Go 1.26.4: 13 packages ok, 0 failures
gofmt -l internal/   # empty
go vet ./...         # clean

One thing I left alone

messages is keyed by (channel_id, ts) and channels/users by id, which assumes Slack IDs are globally unique across workspaces. That holds everywhere except Slack Connect channels shared between two of your own workspaces. I documented it in docs/retention.md and kept the schema as is, since a key migration is a bigger call than this PR should make.

zm2231 added 6 commits June 22, 2026 21:57
Desktop sync and watch silently scoped to the config default workspace_id, so a
multi-workspace user only ingested one signed-in workspace. Desktop now ignores
the config default unless --workspace is explicit; API and read paths unchanged.
Adds regression coverage in app_test.go.
sync --source all routed the config/default workspace into desktop ingest, and
API sync collapsed multi-workspace fan-out when workspace_id was set or derived.
Now only an explicit --workspace scopes desktop ingest or narrows API sync;
otherwise API fans out across all configured workspaces and desktop ingests all.
…spaces for global

Without --workspace, purge deleted across all workspaces and wrote a global
retention floor. Purge now defaults to the configured workspace_id; global purge
is explicit via --all-workspaces, which cannot be combined with --workspace.
…udes

A name-based exclude caused channels whose names weren't resolved from Redux to
be dropped. Unknown-name channels are now kept unless the channel ID itself
matches an exclude selector.
API sync overwrote the authenticated team id with the requested workspace without
verifying they match, so a global-fallback token could sync one workspace and
label its rows as another during multi-workspace fan-out. Sync now errors when the
requested workspace and authenticated team differ (bot and tail paths); fan-out
reports the offending workspace and never mislabels others. Adds a hermetic
fan-out integration test with an injected Slack API.
Workspace-scoped purge selected and deleted by (channel_id, ts) only, relying on
that pair being globally unique across workspaces. Selection and deletion now
carry workspace_id end to end; global purge and retention floors are unchanged.
Documents the message-identity invariant in docs/retention.md.
@clawsweeper

clawsweeper Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codex review: needs maintainer review before merge. Reviewed June 24, 2026, 4:22 PM ET / 20:22 UTC.

Summary
The branch changes CLI/syncer workspace scoping, Slack API workspace identity validation, purge key qualification, desktop name-exclude handling, retention docs, and regression tests.

Reproducibility: yes. at source level: current main coalesces cfg.WorkspaceID into sync options and lets opts.WorkspaceID overwrite auth.TeamID. The contributor also posted redacted live before/after output, though this read-only review did not run live Slack Desktop/API reproduction.

Review metrics: 3 noteworthy metrics.

  • Changed surface: 11 files, +412/-58. The PR spans CLI behavior, Slack API auth, desktop ingest, purge storage/docs, and tests, so maintainer review should cover multiple runtime paths.
  • Real proof paths: 3 live-output scenarios. The contributor supplied after-fix evidence for desktop ingest, auth mismatch rejection, and purge behavior.
  • Compatibility-sensitive behaviors: 2 defaults/fallbacks changed. Desktop scoping and Slack token/team mismatch handling can affect existing operator workflows after upgrade.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • none.

Mantis proof suggestion
An independent Slack Desktop smoke would materially help verify the multi-workspace desktop ingest path in a real profile. A maintainer can ask Mantis to capture proof by posting this exact PR comment:

@openclaw-mantis slack desktop smoke: verify multi-workspace desktop sync ingests all signed-in workspaces without --workspace and honors explicit --workspace filtering.

Risk before merge

  • [P1] Desktop sync/watch no longer inherit cfg.WorkspaceID, so users relying on the config default to limit desktop ingestion may import additional signed-in workspaces unless they pass --workspace.
  • [P1] Slack API sync/tail now fail when the requested workspace and authenticated token team differ, so setups relying on a global fallback token need per-workspace token configuration or maintainer acceptance of the fail-closed behavior.
  • [P1] The PR intentionally leaves the existing channel_id/ts message identity schema unchanged, so Slack Connect collisions across a user's own workspaces remain outside this fix.

Maintainer options:

  1. Accept and document the upgrade behavior (recommended)
    Maintainers can land the PR as-is after explicitly accepting that desktop sync now requires --workspace for configured-workspace-only ingest and auth mismatches now fail closed.
  2. Preserve compatibility with an opt-in strict mode
    If fail-closed auth is too disruptive as a default, ask for a compatibility-preserving default plus an explicit strict mode or documented migration path before merge.
  3. Split policy-sensitive runtime changes
    If the desktop default or token fallback policy needs broader discussion, split the purge key qualification and lower-risk tests/docs from the runtime behavior changes.

Next step before merge

  • [P1] Maintainers need to explicitly accept the compatibility-sensitive desktop scoping and fail-closed auth upgrade behavior before merge; there is no narrow automated repair blocker left.

Security
Cleared: The diff tightens Slack TeamID validation and does not change dependencies, workflows, secret permissions, or external code execution.

Review details

Best possible solution:

Merge the correctness fixes after maintainers explicitly accept or document the desktop scoping and fail-closed auth upgrade behavior while keeping purge archive-wide by default.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main coalesces cfg.WorkspaceID into sync options and lets opts.WorkspaceID overwrite auth.TeamID. The contributor also posted redacted live before/after output, though this read-only review did not run live Slack Desktop/API reproduction.

Is this the best way to solve the issue?

Mostly yes: the patch directly fixes the data-integrity paths and reverted the destructive purge default change. The remaining question is maintainer acceptance of the compatibility-sensitive default/fallback changes.

AGENTS.md: not found in the target repository.

Codex review notes: model internal, reasoning high; reviewed against 905ac75f8337.

Label changes

Label justifications:

  • P1: The PR targets silent multi-workspace omission and possible workspace mislabeling in core Slack archive sync paths.
  • merge-risk: 🚨 compatibility: Merging changes desktop default scoping and may require users to pass --workspace to preserve configured-workspace-only desktop ingest.
  • merge-risk: 🚨 auth-provider: Merging changes Slack token fallback behavior by rejecting requested/authenticated workspace mismatches instead of accepting a global fallback token.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The contributor supplied redacted copied live output for the changed desktop, auth mismatch, and purge behavior after the fix.
  • proof: sufficient: Contributor real behavior proof is sufficient. The contributor supplied redacted copied live output for the changed desktop, auth mismatch, and purge behavior after the fix.
Evidence reviewed

What I checked:

  • Current main sync scoping: Current main builds sync options with coalesce(*workspaceID, cfg.WorkspaceID), so a configured or derived workspace can scope sync even when --workspace was not explicitly passed. (internal/cli/app.go:543, 905ac75f8337)
  • Documented API fan-out contract: The configuration docs say sync --source bot without --workspace runs against every configured [[workspaces]] entry, supporting the API fan-out correction. (docs/configuration.md:103, 905ac75f8337)
  • Current main auth overwrite path: Current main lets opts.WorkspaceID replace auth.TeamID before storing workspace data, matching the reported workspace mislabeling path. (internal/slackapi/api.go:174, 905ac75f8337)
  • PR workspace-scoping implementation: The PR tracks whether --workspace was explicitly set, avoids inherited desktop scoping, and adds source=all fan-out handling before desktop ingest. (internal/cli/app.go:539, a319d4aa0555)
  • PR auth implementation: The PR adds authenticatedWorkspaceID checks for Sync and Tail so requested workspace IDs must match the authenticated Slack team. (internal/slackapi/api.go:171, a319d4aa0555)
  • Purge default preserved: The final PR head documents that purge remains archive-wide without --workspace after the contributor reverted the earlier default-scope change. (docs/retention.md:31, a319d4aa0555)

Likely related people:

  • vincentkoc: Commit 360ad4c added ergonomic multi-workspace sync/tail support and documentation across the CLI, config, syncer, and docs surfaces involved here. (role: multi-workspace feature introducer; confidence: high; commits: 360ad4c23b39, 905ac75f8337; files: internal/cli/app.go, internal/config/config.go, internal/syncer/syncer.go)
  • steipete: GitHub commit history shows recent retention purge, desktop filter, and Slack API work in the central files, and local blame attributes the current touched lines to the v0.7.3 release snapshot. (role: purge, desktop, and Slack API area contributor; confidence: high; commits: f5f549466731, 3bea752593c9, 37337884a9f6; files: internal/store/purge.go, internal/cli/purge.go, internal/slackdesktop/desktop.go)
  • clawSean: Commit 68c8b94 added sync.exclude_channels and --exclude-channels handling adjacent to the desktop name-exclude and sync option behavior touched by this PR. (role: adjacent sync option contributor; confidence: medium; commits: 68c8b9408077; files: internal/cli/app.go, internal/config/config.go, internal/slackapi/api.go)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 Urgent regression or broken agent/channel workflow affecting real users now. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. merge-risk: 🚨 auth-provider 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials. labels Jun 23, 2026
Reverts the purge default-scope change. Without --workspace, purge stays
archive-wide as before; --workspace scopes to one workspace. The default-scope
policy change is left for a separate maintainer-sponsored discussion. The
store-level workspace_id key qualification is unchanged.
@zm2231

zm2231 commented Jun 23, 2026

Copy link
Copy Markdown
Author

Thanks for the review. Addressed both blockers.

Purge default scope (compatibility). You are right, that was a silent behavior change to a destructive command, and the upgrade policy is a maintainer call, not mine to bundle in here. I reverted it. Without --workspace, purge stays archive-wide exactly as before. --workspace scopes to one workspace. The --all-workspaces flag and the config-default behavior are gone (commit a319d4a). The only purge change left in this PR is internal: scoped selection and deletion now carry workspace_id through the temp key set, which is a pure correctness fix with no surface change. If the default-scope idea is worth pursuing, I will open it as a separate PR where the upgrade path can be decided.

Fail-closed auth. This one is intentional and I would like to keep it. The prior code overwrote the authenticated team ID with the requested workspace without checking they matched, so a global fallback token could write one workspace's messages under a different workspace's ID. A hard error is the right failure mode for an identity mismatch. During fan-out it fails only the offending target and names it, so the other workspaces still sync. Operator action when it fires: set the per-workspace token env (SLACK_<TEAM>_BOT_TOKEN, or bot_token_env under the matching [[workspaces]]). A shared global token can still be used, but only for the workspace it actually authenticates as.

Real behavior proof (redacted)

Run against disposable DB copies, workspace IDs redacted as T_A/T_B/T_C. Config has workspace_id = "T_A" (this is what triggered the bug).

1. Desktop ingest, before vs after. Same config, same desktop cache.

# pre-fix (upstream 905ac75)
$ slacrawl --config <cfg> sync --source desktop
  workspaces=1 | channels=85 | messages=1164      # only T_A ingested
  distinct workspaces in DB: 1

# this branch
$ slacrawl --config <cfg> sync --source desktop
  workspaces=3 | channels=126 | messages=1965     # T_A, T_B, T_C
  distinct workspaces in DB: 3

2. Fail-closed auth. Bot token authenticates to T_A; request T_FAKE.

$ slacrawl --config <cfg> sync --source bot --workspace T_FAKE
error: sync workspace T_FAKE: authenticated workspace T_A does not match requested workspace T_FAKE

rows labeled T_FAKE after the rejected run: 0

3. Purge scope. Copy of the three-workspace DB.

before:  T_A 1164 | T_B 526 | T_C 275

$ slacrawl purge --workspace T_B --older-than 1h --force     # scoped
after:   T_A 1164 | T_C 275                                  # only T_B removed

$ slacrawl purge --older-than 1h --force                     # no --workspace, archive-wide (preserved default)
after:   0

The hermetic tests stay in as regression cover, including TestRunSyncTargetsUsesWorkspaceTokensAndRejectsMismatchedAuth, which injects a fake Slack API and proves the mismatch rejection and that no rows get mislabeled.

go test ./... passes on Go 1.26.4.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 23, 2026

Copy link
Copy Markdown

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix merge-risk: 🚨 auth-provider 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. other P1 Urgent regression or broken agent/channel workflow affecting real users now. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants