41526 Storage [Backend] Create a file service #53
…ion API - Add FileAttachment.access_type field with account/restricted options - Add FileAttachment.file_id field for unique file identification - Create FileAttachmentPermission model for user-specific file access - Implement AttachmentService.check_user_permission method - Add AttachmentsViewSet with check-permission endpoint - Add Redis caching for public template authentication - Add response_forbidden method to BaseResponseMixin - Include comprehensive test coverage for permission checking This enables fine-grained file access control with two access types: - account: accessible by all users in the same account - restricted: accessible only by users with explicit permissions
…integration - Add FastAPI-based file upload and download endpoints with streaming support - Implement Clean Architecture with domain entities, use cases, and repositories - Add authentication middleware with JWT token validation and Redis caching - Integrate Google Cloud Storage S3-compatible API for file storage - Add comprehensive error handling with custom exceptions and HTTP status codes - Implement file access permissions validation through external HTTP service - Add database models and Alembic migrations for file metadata storage - Include Docker containerization with docker-compose for local development - Add comprehensive test suite with unit, integration, and e2e tests - Configure pre-commit hooks with ruff, mypy, and pytest for code quality
…e_access_rights' into 41526__сreate_file_service # Conflicts: # backend/src/processes/enums.py # backend/src/processes/models/__init__.py # backend/src/processes/models/workflows/attachment.py # backend/src/processes/services/attachments.py # backend/src/processes/tests/test_services/test_attachments.py
… and implement caching for template data
….toml to dedicated config files - Move mypy configuration from pyproject.toml to mypy.ini for better separation of concerns - Simplify ruff.toml configuration by removing extensive rule selections and using "ALL" selector - Update ruff target version from py37 to py311 to match project Python version - Remove redundant ruff configuration from pyproject.toml to avoid duplication - Apply code formatting fixes across entire codebase - Standardize import statements and code style according to new linting rules - Update test files to comply with new formatting standards
…ore rule in ruff configuration - Update docstrings across various modules to ensure consistency and clarity. - Remove unused "D" rule from ruff.toml configuration. - Enhance readability and maintainability of the codebase.
…ling for consistency - Adjust import paths in test files to ensure they reference the correct locations. - Replace instances of FileNotFoundError with DomainFileNotFoundError for better clarity in exception handling. - Streamline fixture definitions and improve code readability across various test modules.
… configuration - Update docstrings across various test files for consistency and clarity. - Add new linting rules in ruff.toml for improved code quality. - Enhance readability and maintainability of the codebase by refining fixture definitions and mock implementations.
…line permission handling - Refactor the AuthenticationMiddleware to enhance error handling and response formatting. - Update permission classes to use direct Request type hints instead of string annotations. - Consolidate permission checks into FastAPI dependency wrappers for better clarity and usability. - Remove unused exception classes and error messages to clean up the codebase. - Adjust test cases to reflect changes in authentication and permission handling.
…exception tests - Remove unused infrastructure error codes from error_codes.py to streamline the codebase. - Update the AuthenticationMiddleware constructor to use direct FastAPI type hints for clarity. - Add new tests for validation exceptions, including file size and storage errors, to improve coverage and ensure accurate error handling.
…achment model and create FileAttachmentPermission model
…h SeaweedFS integration in Docker configuration
…02:8002 to 8002:8000 in Docker configurations
…ame and clean up GCS S3 environment variable assignments
…l-storage' in Docker Compose files
… to Docker Compose files
…s hierarchy Migrated from single Settings class with 7 scattered CONFIG string checks to BaseAppSettings -> TestingSettings / DevelopmentSettings / ProductionSettings class hierarchy. Changes: - config.py: class-based inheritance with environment-specific flags (HSTS_ENABLED, RATE_LIMIT_ENABLED, RELOAD, WORKERS) - main.py: removed all CONFIG string comparisons, docs always enabled, added root_path derived from FASTAPI_BASE_URL - security_headers.py: CSP updated to allow Swagger/ReDoc CDN resources - DI container + API: Settings -> BaseAppSettings type hints - Tests: CONFIG=Testing in conftest, updated imports
- Add SVG and WEBP to IMAGE_FILE_EXTENSIONS for correct <img> rendering - Implement filename-based fallback in getLinkEntityType for UUID URLs that lack file extensions (customMarkdownPlugins.ts) - Create getAttachmentEntityTypeByFilename to classify attachments by extension (Image/Video/File/Link) as single source of truth - Replace duplicated IMAGE_EXTENSION_RE regex in parseMarkdownFiles.ts with canonical getAttachmentTypeByFilename function - Add comprehensive unit/integration tests for new detection logic
…igration - Add SVG and WEBP to IMAGE_FILE_EXTENSIONS for correct <img> rendering - Implement filename-based fallback in getLinkEntityType for UUID URLs that lack file extensions (customMarkdownPlugins.ts) - Create getAttachmentEntityTypeByFilename to classify attachments by extension (Image/Video/File/Link) as single source of truth - Replace duplicated IMAGE_EXTENSION_RE regex in parseMarkdownFiles.ts with canonical getAttachmentTypeByFilename function - Add comprehensive unit/integration tests for new detection logic
…service # Conflicts: # frontend/src/public/api/commonRequest.ts
…ror on 100MB uploads
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
There are 109 total unresolved issues (including 108 from previous reviews).
Reviewed by Cursor Bugbot for commit 7d3d6d5.
    try:
        token = PublicAuthService.get_token(raw_token)
        if token:
            auth_data = await PublicAuthService.authenticate_public_token(
Public cookie lacks Token prefix
Medium Severity
Public-token authentication from the public-token cookie passes the raw cookie value into PublicAuthService.get_token, which only accepts a two-part Token <value> header string. Browser clients store the bare token in that cookie, so public-form file requests using cookies fail auth unless the prefixed header is also sent.
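One way to address this finding is to normalize the credential before it reaches the header parser. The sketch below is illustrative only: parse_token_header is a stand-in for the behavior the comment attributes to PublicAuthService.get_token (accepting only a two-part Token <value> string), and normalize_public_token is a hypothetical helper, not code from this PR.

```python
from typing import Optional

TOKEN_PREFIX = "Token"  # header scheme named in the review comment


def parse_token_header(raw: str) -> Optional[str]:
    """Stand-in for PublicAuthService.get_token: accepts only 'Token <value>'."""
    parts = raw.split()
    if len(parts) != 2 or parts[0] != TOKEN_PREFIX:
        return None
    return parts[1]


def normalize_public_token(raw: Optional[str]) -> Optional[str]:
    """Hypothetical helper: accept 'Token <value>' or a bare cookie value."""
    if not raw:
        return None
    raw = raw.strip()
    if not raw:
        return None
    if " " not in raw:
        # Bare token, as a browser client would store it in the cookie.
        raw = f"{TOKEN_PREFIX} {raw}"
    return parse_token_header(raw)
```

With this shape, the header path and the cookie path share one parser, so a bare cookie value and a prefixed header value resolve to the same token, while other schemes are still rejected.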


1. Description (Problem)
Pneumatic's file storage system previously relied on Google Cloud Storage (GCS) through a legacy Django storage module. All file operations (upload/download) passed through the monolithic Django backend, creating performance bottlenecks, vendor lock-in, and no flexibility for self-hosted deployments.

Issues addressed:
- Attachment model (guardian-based permissions) without support for public/guest token access

2. Context
- storage/) with its own DB, authentication via shared Redis (DRF-compatible tokens), and S3-compatible storage (SeaweedFS for local, GCS S3 API for cloud).
- FileAttachment model, Attachment model, workflow/task serializers, frontend upload components, nginx configuration, docker-compose.

3. Solution
- storage/ microservice (FastAPI + SQLAlchemy + Alembic): file upload/download via S3 API
- STORAGE_TYPE
- /attachments/check-permission
- parseMarkdownFiles utility for Workflow Log attachments
- FileAttachment → new Attachment model with scoped uniqueness

4. Implementation Details
4.1 New storage/ Microservice (108 files, ~15K lines)
- api/files.py: POST /upload, GET /{file_id} with Range support
- use_cases/: UploadFileUseCase, DownloadFileUseCase
- entities/file_record.py: FileRecord entity
- adapters/storage_service.py
- auth/, middleware/, permissions.py, config.py

Key decisions:
- BaseAppSettings → Testing/Development/Production)
- secure_filename(), sanitize_content_type()

4.2 Django Backend (172 files, ~19K lines changed)
- FileAttachment: file_id, access_type (migration 0254)
- TaskFieldService: migrated to scoped uniqueness (composite: account_id + file_id)
- AttachmentService: updated refresh/clone attachment logic
- /attachments/check-permission: for file-service
- fill_file_attachment_file_id, migrate_file_attachment_to_attachment, replace_storage_links_with_file_service, sync_files_to_file_service
- FileSyncViewSet, comment attachment logic, STORAGE feature flag

4.3 Frontend (79 files, ~3.2K lines)
- uploadFiles.ts: switched to direct file-service upload
- fileServiceUpload.ts: new API client for file-service
- parseMarkdownFiles.ts: utility for extracting files from Markdown (Workflow Log)
- getErrorMessage.ts: improved API error handling
- getConfig.ts: updated URL mapping configuration

4.4 Infrastructure (12 files)
- pneumatic_file_service, seaweedfs-* services
- proxy_to_file_service.conf, updated location blocks
- start.sh: Alembic migration integration on startup

5. What to Test
5.1 Preconditions
- docker-compose up

5.2 Positive Scenarios
Upload file to task:
- /files/ path
Download file:
Kickoff with file field:
Public form:
Workflow Log — Attachments tab:
Image preview:
5.3 Negative Scenarios and Edge Cases
File too large (>100MB):
Unauthenticated access:
Access to another user's file:
File with non-ASCII name:
5.4 Verification Points
- POST /files/upload returns { public_url, file_id }; GET /files/{file_id} streams file
- /files/ endpoint, not old Django endpoint
- pneumatic_file_service container is running and responsive

5.5 API Checks
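For the Range check in this section (Range: bytes=0-1023 expecting 206 Partial Content), the expected byte window can be worked out with a small helper. This is a sketch of single-range handling under standard HTTP semantics, not the file-service's actual implementation; parse_byte_range and content_range are hypothetical names.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) for one byte range, or None if it is
    malformed or cannot be satisfied for a resource of `size` bytes."""
    match = _RANGE_RE.match(header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the resource.
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    if start >= size:
        return None
    # An open-ended or oversized end is clamped to the last byte.
    end = min(int(last), size - 1) if last else size - 1
    if end < start:
        return None
    return start, end


def content_range(start: int, end: int, size: int) -> str:
    """Format the Content-Range value that accompanies a 206 response."""
    return f"bytes {start}-{end}/{size}"
```

For a 5000-byte file, bytes=0-1023 selects 1024 bytes, so a tester can expect Content-Length: 1024 and Content-Range: bytes 0-1023/5000 on the 206 response.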
- POST /files/upload: multipart/form-data, response { "public_url": "...", "file_id": "..." }
- GET /files/{file_id}: headers Content-Disposition, Accept-Ranges, X-Content-Type-Options: nosniff
- GET /files/{file_id} with Range: bytes=0-1023 → 206 Partial Content
- POST /attachments/check-permission on Django backend

5.6 What Was NOT Tested
6. Testing Affected Areas (Dependencies)
7. Refactoring
- backend/src/storage/): replaced with new file-service
- STORAGE feature flag and related env variables
- TaskFieldService: migrated from flat uniqueness to scoped (composite index)
- CONFIG string checks to class-based settings hierarchy

Additional testing: all areas from section 6; previous behavior must be preserved.
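The move from CONFIG string checks to a class-based settings hierarchy can be sketched as follows. The class names and flag names (HSTS_ENABLED, RATE_LIMIT_ENABLED, RELOAD, WORKERS) come from the commit message; the flag values, the "Development" and "Production" CONFIG values, and the get_settings helper are assumptions made for illustration, and plain classes are used here rather than whatever settings library the service relies on.

```python
import os
from typing import Optional


class BaseAppSettings:
    """Defaults shared by every environment (values are illustrative)."""

    HSTS_ENABLED = False
    RATE_LIMIT_ENABLED = True
    RELOAD = False
    WORKERS = 1


class TestingSettings(BaseAppSettings):
    RATE_LIMIT_ENABLED = False  # assumed: keep tests free of throttling


class DevelopmentSettings(BaseAppSettings):
    RELOAD = True  # assumed: auto-reload for local work


class ProductionSettings(BaseAppSettings):
    HSTS_ENABLED = True  # assumed production hardening
    WORKERS = 4  # assumed worker count


# Hypothetical mapping from the CONFIG value to a settings class.
_SETTINGS_BY_CONFIG = {
    "Testing": TestingSettings,
    "Development": DevelopmentSettings,
    "Production": ProductionSettings,
}


def get_settings(config_name: Optional[str] = None) -> BaseAppSettings:
    """Pick the settings class once, instead of comparing CONFIG at call sites."""
    name = config_name or os.environ.get("CONFIG", "Development")
    try:
        return _SETTINGS_BY_CONFIG[name]()
    except KeyError:
        raise ValueError(f"Unknown CONFIG value: {name!r}") from None
```

Call sites then read a flag such as settings.HSTS_ENABLED rather than comparing CONFIG to a string, which is what lets the scattered checks be removed.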
8. Release Notes
Added a new File Service microservice based on FastAPI, replacing the direct GCS integration. Supports local storage (SeaweedFS) and cloud storage (GCS S3 API). Includes authentication, rate limiting, security headers, access control, and public form support.
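The access control mentioned here follows the access types described in this PR: account (all users in the same account) and restricted (only users with explicit permissions), with the automated summary also naming PUBLIC. A minimal sketch of that decision, assuming hypothetical FileMeta and User shapes and assuming PUBLIC means readable without a user:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class AccessType(str, Enum):
    PUBLIC = "public"
    ACCOUNT = "account"
    RESTRICTED = "restricted"


@dataclass
class FileMeta:
    """Hypothetical stand-in for the stored file metadata."""

    file_id: str
    account_id: int
    access_type: AccessType
    allowed_user_ids: Set[int] = field(default_factory=set)


@dataclass
class User:
    """Hypothetical stand-in for the authenticated user."""

    id: int
    account_id: int


def can_access(file: FileMeta, user: Optional[User]) -> bool:
    """Decide access from the file's access type."""
    if file.access_type is AccessType.PUBLIC:
        return True  # assumption: public files need no user
    if user is None:
        return False
    if file.access_type is AccessType.ACCOUNT:
        # Any user in the owning account may read the file.
        return user.account_id == file.account_id
    # RESTRICTED: only users with an explicit grant.
    return user.id in file.allowed_user_ids
```

In the PR the restricted case is backed by permission records rather than an in-memory set; the set here only keeps the example self-contained.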
Note
Add a dedicated file storage microservice to replace Google Cloud Storage
- storage microservice (storage/src/main.py) with upload/download endpoints, SeaweedFS/GCS S3 backends, JWT-based auth middleware, rate limiting, and security headers.
- FileAttachment-based attachment handling with a new Django Attachment model (backend/src/storage/models.py) using django-guardian object permissions for fine-grained access control (PUBLIC / ACCOUNT / RESTRICTED).
- [name](url)) rather than integer attachment IDs; TaskFieldService, CommentService, and related serializers are updated accordingly.
- run_file_migration, fill_file_attachment_file_id, migrate_file_attachment_to_attachment, sync_files_to_file_service, replace_storage_links_with_file_service) to migrate existing GCS-hosted FileAttachment records to the new service.
- file-service, file-postgres, and SeaweedFS stack services; the frontend uses a new fileServiceUrl config value and uploads via uploadFileToFileService.
- attachments fields are removed from WorkflowEventSerializer, TaskFieldSerializer, and comment/workflow API payloads; clients that depend on these fields will break. The /workflows/attachments and /workflows/public/attachments API routes are also removed.

Changes since #53 opened
- default-src 'none' to a multi-directive policy allowing self, inline scripts and styles, and external resources from specific CDNs [53c7500]
- client_max_body_size directive from 100m to 105m in nginx configuration files [7d3d6d5]
- file_service_auth cookie from /files/ to / in FileServiceAuthMiddleware.process_response middleware hook [ce42e6f]

Macroscope summarized 30c69a9.
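The summary above describes file fields that now hold markdown links ([name](url)) rather than integer attachment IDs. A minimal sketch of extracting such links, assuming a simple regular expression and example.com URLs purely for illustration (this is not the project's parseMarkdownFiles or serializer code, and it does not handle URLs that contain parentheses or whitespace):

```python
import re
from typing import List, NamedTuple


class MarkdownLink(NamedTuple):
    name: str
    url: str


# [name](url): name has no closing bracket, url has no spaces or parentheses.
_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^()\s]+)\)")


def extract_markdown_links(text: str) -> List[MarkdownLink]:
    """Return every [name](url) pair found in the text, in order."""
    return [MarkdownLink(name, url) for name, url in _LINK_RE.findall(text)]


if __name__ == "__main__":
    value = (
        "[report.pdf](https://example.com/files/abc) "
        "[photo.png](https://example.com/files/def)"
    )
    for link in extract_markdown_links(value):
        print(link.name, link.url)
```

A validator for a file field could build on this by requiring that the whole value consists of such links and that each URL points at the file service.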
Note
High Risk
Large cross-cutting change touching authentication (public token cache), file access control, and removed API fields (attachments on events/fields/comments). Production rollout depends on migration commands and coordinated file-service deployment.

Overview
Replaces in-backend Google Cloud Storage with a dedicated file-service (wired via FILE_SERVICE_URL / FileServiceClient) and local SeaweedFS stack in docker-compose, plus a separate file-postgres database.

Upload paths for user/contact/integration photos, Microsoft Graph avatars, and account logos now go through the file service; sync_account_file_fields keeps Attachment records aligned when those URLs change. django-guardian and a custom GroupObjectPermission model back object-level file access; admin no longer triggers legacy bucket public/private Celery tasks.

API and model shifts: workflow event/field serializers drop attachments; comments require text only (no attachment IDs); file fields validate markdown link lists instead of integer attachment IDs. Analytics for uploads now key off storage.Attachment / file_id. Public/embed auth caches template account_id in Redis with invalidation when templates are deactivated or access is revoked.

Migration tooling adds orchestrated commands (run_file_migration, fill file_id, migrate FileAttachment → Attachment, sync rows into the file DB, rewrite GCS URLs). Legacy FileAttachment remains temporarily with file_id / access_type fields.

Repo hygiene: root .pre-commit-config.yaml covers backend and storage/; google-cloud-storage removed from backend dependencies (lockfile still pulls GCS libs transitively).

Reviewed by Cursor Bugbot for commit 7d3d6d5.
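The overview mentions that public/embed auth caches the template account_id in Redis and invalidates it when a template is deactivated or access is revoked. The cache-aside pattern behind that can be sketched as below. Everything here is an assumption for illustration: an in-memory class stands in for Redis so the example runs on its own, and the key format, TTL, and class names are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Stand-in for Redis with get / set-with-TTL / delete."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class TemplateAccountCache:
    """Cache-aside lookup of a public template's account_id by token."""

    def __init__(
        self,
        cache: InMemoryCache,
        loader: Callable[[str], Optional[int]],
        ttl_seconds: float = 300.0,  # hypothetical TTL
    ) -> None:
        self._cache = cache
        self._loader = loader  # e.g. a database query in the real service
        self._ttl = ttl_seconds

    @staticmethod
    def _key(token: str) -> str:
        return f"public_template:{token}"  # hypothetical key format

    def get_account_id(self, token: str) -> Optional[int]:
        cached = self._cache.get(self._key(token))
        if cached is not None:
            return int(cached)
        account_id = self._loader(token)
        if account_id is not None:
            self._cache.set(self._key(token), str(account_id), self._ttl)
        return account_id

    def invalidate(self, token: str) -> None:
        """Call when the template is deactivated or access is revoked."""
        self._cache.delete(self._key(token))
```

The invalidate call is what keeps a revoked public token from continuing to authenticate until the TTL expires, which is the failure mode explicit invalidation is meant to close.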