The process command supports archive files (ZIP, TAR.GZ, TGZ) containing multiple DICOM files for batch processing.
- ZIP (`.zip`)
- TAR GZIP (`.tar.gz`, `.tgz`)
- Upload: The archive file is uploaded to GCS under a timestamped, hashed name
- Extraction: CloudRun service receives notification and extracts the archive
- Processing: Each DICOM file is processed individually (metadata extraction, embeddings, etc.)
- Results: Each file creates a separate BigQuery entry under the archive's base path
Single files:
- Query by exact GCS path
- Return immediately when found
- Expects 1 result
Archives:
- Query by timestamp range (±1 minute around upload time)
- Wait for the first result, then keep collecting for 5 more seconds to capture all extracted files
- Expects multiple results
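The two polling rules above can be sketched as a pair of pure functions: one that derives the ±1 minute query window from the upload time, and one that decides whether the polling loop should stop. This is a minimal illustration, not the command's actual implementation; `archiveQueryWindow`, `pollStatus`, and their parameter names are hypothetical.

```javascript
// Hypothetical helpers; the names and option shapes are assumptions.
const WINDOW_MS = 60000; // ±1 minute around the upload time
const SETTLE_MS = 5000;  // extra collection time after the first archive result

// Time range for the archive query, as ISO 8601 strings.
function archiveQueryWindow(uploadTimeMs) {
  return {
    start: new Date(uploadTimeMs - WINDOW_MS).toISOString(),
    end: new Date(uploadTimeMs + WINDOW_MS).toISOString(),
  };
}

// Decides what the polling loop should do next: 'done', 'timeout', or 'continue'.
function pollStatus({ isArchive, resultCount, firstResultAtMs, startedAtMs, nowMs, timeoutMs }) {
  const hasResults = resultCount > 0;
  // Single files: stop as soon as the one expected row is found.
  if (hasResults && !isArchive) return 'done';
  // Archives: stop once the settle period after the first result has elapsed.
  if (hasResults && nowMs - firstResultAtMs >= SETTLE_MS) return 'done';
  // Overall timeout: keep whatever was collected, otherwise report a timeout.
  if (nowMs - startedAtMs >= timeoutMs) return hasResults ? 'done' : 'timeout';
  return 'continue';
}
```

Keeping the decision separate from the timer and the BigQuery call makes the stop conditions easy to test without waiting on real time.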
Archives need extra time for extraction. The timeout is calculated as:
total_timeout = base_timeout + (file_size_mb × timeout_per_mb) + 30000 ms (archives only)
Defaults:
- Base timeout: 60 seconds
- Per-MB timeout: 10 seconds
- Archive extraction bonus: 30 seconds
Examples:
- Single 1 MB file: ~70 seconds (60 + 10)
- 10 MB archive: ~190 seconds (60 + 100 + 30)
- 100 MB archive: ~1090 seconds (60 + 1000 + 30)
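The timeout formula can be written as a small function. This is a sketch under the defaults listed above; the function name `computePollTimeoutMs` and its option names are hypothetical, and the real command may round or clamp the file size differently.

```javascript
// Hypothetical helper mirroring the documented formula; all values are in milliseconds.
function computePollTimeoutMs(
  fileSizeMb,
  { isArchive = false, baseTimeoutMs = 60000, timeoutPerMbMs = 10000, archiveBonusMs = 30000 } = {}
) {
  // The extraction bonus only applies to archives.
  const bonus = isArchive ? archiveBonusMs : 0;
  return baseTimeoutMs + fileSizeMb * timeoutPerMbMs + bonus;
}

// A 1 MB single file with the defaults: 60000 + 10000 = 70000 ms.
console.log(computePollTimeoutMs(1));
```

The `baseTimeoutMs` and `timeoutPerMbMs` options are meant to correspond to the `--poll-timeout` and `--poll-timeout-per-mb` flags, which is an assumption based on the flag names.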
# Single DICOM file
node src/index.js process patient-study.dcm --config deployment-config.json
# ZIP archive
node src/index.js process studies-batch.zip --config deployment-config.json
# TAR.GZ archive
node src/index.js process studies-batch.tar.gz --config deployment-config.json
# Custom timeout for large archives
node src/index.js process large-batch.zip --config deployment-config.json \
  --poll-timeout 180000 --poll-timeout-per-mb 15000
Processing DICOM file: patient-study.dcm
=== Processing Result Overview ===
Path: gs://bucket/uploads/1705089600000_a1b2c3d4_patient-study.dcm
Patient Name: Doe^John
Modality: CT
Study Date: 2024-01-12
Series Description: CT Chest
===================================
Processing archive: studies-batch.zip
Note: Archive files are expanded and processed as separate DICOM files
=== Archive Processing Results ===
Total files processed: 3
--- Result 1 ---
Path: gs://bucket/uploads/1705089600000_a1b2c3d4_studies-batch.zip/study1.dcm
Patient Name: Smith^Jane
Modality: MR
Study Date: 2024-01-10
--- Result 2 ---
Path: gs://bucket/uploads/1705089600000_a1b2c3d4_studies-batch.zip/study2.dcm
Patient Name: Johnson^Bob
Modality: CT
Study Date: 2024-01-11
--- Result 3 ---
Path: gs://bucket/uploads/1705089600000_a1b2c3d4_studies-batch.zip/study3.dcm
Patient Name: Williams^Alice
Modality: US
Study Date: 2024-01-12
--- Summary ---
Total size: 450 MB
Modalities: CT, MR, US
===================================
The system automatically detects archive files by extension:
- Checks for `.zip`, `.tgz`, or `.tar.gz` extensions
- Applies archive-specific timeout and polling logic
- Single files: Returns array with 1 result
- Archives: Returns array with N results (one per extracted DICOM)
- Both cases use the same display logic for consistent output
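Extension-based detection of this kind could look like the sketch below. The helper name `isArchiveFile` is hypothetical, and the case-insensitive match is an assumption; the command may compare extensions exactly.

```javascript
// Extensions the document lists as archive formats.
const ARCHIVE_EXTENSIONS = ['.zip', '.tgz', '.tar.gz'];

// Hypothetical helper: true when the file name ends with a known archive extension.
function isArchiveFile(filename) {
  const lower = filename.toLowerCase(); // assumption: matching ignores case
  return ARCHIVE_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

console.log(isArchiveFile('studies-batch.tar.gz')); // true
console.log(isArchiveFile('patient-study.dcm'));    // false
```

Checking the full suffix with `endsWith` handles the two-part `.tar.gz` extension, which `path.extname` alone would report as `.gz`.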
The command uses BigQuery to retrieve results. You can manually run these queries if needed:
Single file query:
SELECT * FROM `PROJECT.DATASET.instances`
WHERE path = 'gs://bucket/uploads/1705089600000_a1b2c3d4_file.dcm'
LIMIT 1
Archive query (retrieve all files from an archive):
SELECT * FROM `PROJECT.DATASET.instances`
WHERE timestamp >= TIMESTAMP('2024-01-12T15:40:00Z')
AND timestamp <= TIMESTAMP('2024-01-12T15:50:00Z')
AND metadata IS NOT NULL
ORDER BY timestamp DESC
- Check CloudRun service logs for processing errors
- Verify the configuration file is correct and has valid GCP credentials
- For large archives, consider increasing `--poll-timeout`
- Verify DICOM files are valid and readable
- Ensure you're passing `--config deployment-config.json`
- Run the deployment helper script to generate a valid config:
./helpers/deploy.sh my-project
- Regenerate the configuration file using the deployment helper
- Check the config file for required fields: `projectId`, `bucketName`, `datasetId`, `instancesTableId`
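A quick way to check a config object for the required fields is sketched below. `missingConfigFields` is a hypothetical helper, and treating every field as a non-empty string is an assumption about the config format.

```javascript
// Field names taken from the troubleshooting note above.
const REQUIRED_FIELDS = ['projectId', 'bucketName', 'datasetId', 'instancesTableId'];

// Hypothetical helper: returns the required fields that are absent or empty.
function missingConfigFields(config) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = config == null ? undefined : config[field];
    // Assumption: each required field is a non-empty string.
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Example with a partial config; the values are placeholders.
console.log(missingConfigFields({ projectId: 'my-project', bucketName: 'my-bucket' }));
// [ 'datasetId', 'instancesTableId' ]
```

To run it against a real file, the object could be loaded with `JSON.parse(fs.readFileSync('deployment-config.json', 'utf8'))` before being passed in.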
- Check CloudRun logs for processing errors during metadata extraction
- Verify embeddings are not causing timeouts (check Gemini/Vertex AI quotas)
- Ensure BigQuery table has sufficient quota for insertions
- Process Command Guide - General usage of the process command
- Quick Reference - Quick examples and checklists
- Test Coverage - Unit and integration test details