fix(sandbox): support extra mounts in provisioner mode #2487
officialasishkumar wants to merge 4 commits into bytedance:main
Conversation
Pull request overview
This PR extends provisioner-mode sandbox creation to support user-configured “extra mounts” end-to-end (config → RemoteSandboxBackend → provisioner API → K8s Pod volumes/volumeMounts), aligning Kubernetes provisioner behavior with local container mode.
Changes:
- Forward configured sandbox mounts through `RemoteSandboxBackend` to the provisioner `POST /api/sandboxes` API (a sketch of such a request follows this list).
- Add `extra_mounts` request handling in the provisioner and wire them into the generated Pod `volumes` and `volumeMounts`.
- Add/update regression tests and documentation for provisioner-mode custom mounts.
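For illustration, a minimal sketch of a create request carrying extra mounts. The base URL and the rest of the payload are placeholders; only the endpoint and the per-mount field names (`host_path`, `container_path`, `read_only`) come from the code discussed in this review:

```python
import requests

# Placeholder base URL for the provisioner; adjust to your deployment.
PROVISIONER_URL = "http://localhost:8080"

payload = {
    # Each extra mount names a host directory, its target inside the sandbox,
    # and whether it should be mounted read-only.
    "extra_mounts": [
        {
            "host_path": "/data/shared-datasets",
            "container_path": "/mnt/datasets",
            "read_only": True,
        },
    ],
}

response = requests.post(f"{PROVISIONER_URL}/api/sandboxes", json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```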
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| docker/provisioner/README.md | Documents provisioner API usage including extra_mounts. |
| docker/provisioner/app.py | Adds extra_mounts request model, validation/normalization, and Pod volume/volumeMount wiring. |
| config.example.yaml | Documents sandbox mounts configuration for provisioner mode. |
| backend/tests/test_remote_sandbox_backend.py | Verifies remote backend forwards configured mounts and filters provisioner-built-in runtime mounts. |
| backend/tests/test_provisioner_pvc_volumes.py | Adds tests ensuring extra mounts become hostPath volumes/mounts and validates rejection cases. |
| backend/packages/harness/deerflow/community/aio_sandbox/remote_backend.py | Serializes and forwards extra mounts to provisioner create API. |
| backend/packages/harness/deerflow/community/aio_sandbox/aio_sandbox_provider.py | Passes configured mounts to RemoteSandboxBackend when provisioner_url is set. |
```python
if not host_path:
    raise ValueError("extra_mounts.host_path must not be empty")
if not _is_absolute_host_path(host_path):
    raise ValueError(f"extra_mounts.host_path must be absolute: {host_path}")
if not container_path.startswith("/"):
```
extra_mounts.host_path currently allows any absolute path on the Kubernetes node. Since the provisioner API is unauthenticated and is reverse-proxied under /api/sandboxes by the repo nginx config, this enables mounting sensitive host paths (e.g. /etc, /var/run, kubeconfig) into sandbox Pods. Consider requiring an auth token for the provisioner API and/or restricting host_path to an allow-listed base directory (or disabling extra_mounts unless explicitly enabled via env).
Added a provisioner-side host-path allowlist in ec0f656, with tests for disabled and outside-allowlist mounts.
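The allowlist itself is not shown in this thread; a minimal sketch of the kind of check it describes, assuming a colon-separated `EXTRA_MOUNT_ALLOWED_HOST_PATHS` environment variable (hypothetical name), could look like this:

```python
import os
from pathlib import PurePosixPath

# Hypothetical setting name; the real provisioner may spell this differently.
_ALLOWED_BASES = [
    PurePosixPath(entry)
    for entry in os.environ.get("EXTRA_MOUNT_ALLOWED_HOST_PATHS", "").split(":")
    if entry
]


def host_path_allowed(host_path: str) -> bool:
    """Accept host_path only if it sits inside an allow-listed base directory."""
    if not _ALLOWED_BASES:
        # With no allowlist configured, extra mounts stay disabled.
        return False
    candidate = PurePosixPath(host_path)
    return any(base == candidate or base in candidate.parents for base in _ALLOWED_BASES)
```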
```python
seen_container_paths = {"/mnt/skills", "/mnt/user-data"}

for mount in extra_mounts or []:
    host_path = mount.host_path
    container_path = mount.container_path.rstrip("/") or "/"
```
_normalize_extra_mounts() only reserves the exact /mnt/user-data path. That still permits mounts targeting subpaths like /mnt/user-data/workspace, which can shadow the built-in user-data mount and also diverges from RemoteSandboxBackend._is_provisioner_builtin_mount() (which treats /mnt/user-data/* as provisioner-owned). Consider rejecting any container_path that is /mnt/user-data or starts with /mnt/user-data/ (and similarly for /mnt/skills if you want to prevent shadowing).
Updated both provisioner validation and backend serialization to treat /mnt/user-data, /mnt/user-data/*, /mnt/skills, and /mnt/skills/* as provisioner-owned targets. Commit: 6a8ca39
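Paraphrased outside the project code, the agreed-upon check amounts to a prefix test over the two built-in targets:

```python
_PROVISIONER_OWNED = ("/mnt/skills", "/mnt/user-data")


def is_provisioner_owned(container_path: str) -> bool:
    """Reject targets that are a built-in mount or nested underneath one."""
    normalized = container_path.rstrip("/") or "/"
    return any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in _PROVISIONER_OWNED
    )


assert is_provisioner_owned("/mnt/user-data/workspace")
assert not is_provisioner_owned("/mnt/acp-workspace")
```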
```python
name=f"extra-mount-{index}",
host_path=k8s_client.V1HostPathVolumeSource(
    path=mount.host_path,
    type="Directory",
```
Extra hostPath volumes are created with hostPath type Directory, which will fail Pod scheduling if the host directory does not already exist. If the intent is to make these mounts easy to configure (similar to the built-in user-data hostPath), consider using DirectoryOrCreate or explicitly documenting that host_path must pre-exist on every node that might run the Pod.
| type="Directory", | |
| type="DirectoryOrCreate", |
Changed extra hostPath volumes to use DirectoryOrCreate and updated the regression expectation. Commit: 6a8ca39
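Putting the accepted suggestion together, a self-contained sketch of the volume/volumeMount pair built per extra mount (the helper name and signature are illustrative, not the actual `app.py` code):

```python
from kubernetes import client as k8s_client


def build_extra_mount(index: int, host_path: str, container_path: str, read_only: bool):
    """Build one hostPath volume plus the matching container volumeMount."""
    name = f"extra-mount-{index}"
    volume = k8s_client.V1Volume(
        name=name,
        host_path=k8s_client.V1HostPathVolumeSource(
            path=host_path,
            # DirectoryOrCreate lets the kubelet create a missing directory on
            # the node instead of failing Pod scheduling.
            type="DirectoryOrCreate",
        ),
    )
    volume_mount = k8s_client.V1VolumeMount(
        name=name,
        mount_path=container_path,
        read_only=read_only,
    )
    return volume, volume_mount
```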
```python
for mount in self._config_mounts:
    mounts.append(
        {
            "host_path": mount.host_path,
            "container_path": mount.container_path,
            "read_only": mount.read_only,
        }
    )

seen_container_paths = {str(mount["container_path"]).rstrip("/") or "/" for mount in mounts}
```
Config-provided mounts (self._config_mounts) are serialized without any normalization or filtering. If a user config includes provisioner-built-in targets (e.g. /mnt/skills or /mnt/user-data) or duplicates, the provisioner will reject the request (422) and sandbox creation will fail. Consider applying the same normalization/dedup logic used for runtime extra_mounts (strip trailing /, skip provisioner-built-in mount paths, and skip duplicates) to config mounts as well.
```diff
-for mount in self._config_mounts:
-    mounts.append(
-        {
-            "host_path": mount.host_path,
-            "container_path": mount.container_path,
-            "read_only": mount.read_only,
-        }
-    )
-seen_container_paths = {str(mount["container_path"]).rstrip("/") or "/" for mount in mounts}
+seen_container_paths: set[str] = set()
+for mount in self._config_mounts:
+    normalized_container_path = mount.container_path.rstrip("/") or "/"
+    if self._is_provisioner_builtin_mount(normalized_container_path):
+        logger.warning("Skipping provisioner built-in config mount target: %s", mount.container_path)
+        continue
+    if normalized_container_path in seen_container_paths:
+        logger.warning("Skipping duplicate provisioner config mount target: %s", mount.container_path)
+        continue
+    seen_container_paths.add(normalized_container_path)
+    mounts.append(
+        {
+            "host_path": mount.host_path,
+            "container_path": normalized_container_path,
+            "read_only": mount.read_only,
+        }
+    )
```
Applied the same normalization/dedup filtering to configured mounts before sending them to the provisioner, with a regression test for built-in and duplicate config targets. Commit: 6a8ca39
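As a standalone illustration of that filtering behavior (plain dicts rather than the project's classes), dropping built-in and duplicate targets looks like this:

```python
def filter_config_mounts(config_mounts, is_builtin):
    """Drop provisioner-owned targets and duplicates, mirroring the accepted suggestion."""
    seen: set[str] = set()
    filtered = []
    for mount in config_mounts:
        target = mount["container_path"].rstrip("/") or "/"
        if is_builtin(target) or target in seen:
            continue
        seen.add(target)
        filtered.append({**mount, "container_path": target})
    return filtered


mounts = [
    {"host_path": "/data/a", "container_path": "/mnt/skills", "read_only": True},
    {"host_path": "/data/b", "container_path": "/mnt/datasets/", "read_only": True},
    {"host_path": "/data/c", "container_path": "/mnt/datasets", "read_only": False},
]
# Only the first "/mnt/datasets" entry survives; the built-in target is skipped.
print(filter_config_mounts(mounts, is_builtin=lambda p: p.startswith("/mnt/skills")))
```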
@officialasishkumar, thanks for your contribution. Please take a look at the review comments.
Force-pushed from 407aebd to 6a8ca39.
@officialasishkumar Please fix the lint error in the code.
Pass configured and runtime sandbox mounts through the remote provisioner API, build Kubernetes hostPath volumes and mounts for them, and document the provisioner configuration. Also remove stale `task_tool` monkeypatches that referenced a removed skill-prompt helper so backend tests reflect the current subagent skill-loading path.
Force-pushed from 6a8ca39 to 70c1a31.
@WillemJiang Fixed the backend lint formatting and pushed.
Summary
- Forward configured sandbox mounts from `RemoteSandboxBackend` to the provisioner create API.
- Add `extra_mounts` request support and wire those mounts into Kubernetes hostPath volumes and container volumeMounts.
- Keep built-in `/mnt/skills` and `/mnt/user-data/*` mounts from being duplicated while preserving runtime mounts such as `/mnt/acp-workspace`.

Root Cause
`AioSandboxProvider` handed configured mounts only to `LocalContainerBackend`; provisioner mode created pods from a fixed manifest and `RemoteSandboxBackend` did not send mount metadata. As a result, `sandbox.mounts` worked in local container mode but was ignored when `provisioner_url` selected the Kubernetes provisioner backend.

Testing
- `cd backend && make lint`
- `cd backend && make test` → 2047 passed, 15 skipped
- `cd backend && uv run pytest tests/test_remote_sandbox_backend.py tests/test_provisioner_pvc_volumes.py tests/test_task_tool_core_logic.py::test_task_tool_propagates_tool_groups_to_subagent tests/test_task_tool_core_logic.py::test_task_tool_no_tool_groups_passes_none tests/test_task_tool_core_logic.py::test_task_tool_runtime_none_passes_groups_none -q` → 28 passed

Closes #2480