
Integrated Lifecycle Management for Hybrid Server #440

@sadhikariSteep

Requested feature

Currently, high-accuracy table extraction in the opendataloader-pdf library depends on a hybrid server that has to be started by hand (opendataloader-pdf-hybrid --port 5002) and kept running in a separate terminal. This manual step is awkward for production scripts, CI/CD pipelines, and local development, where nothing is available to start or stop the server around each run.

Goal: Allow the Python wrapper to automatically detect, start, and shut down the hybrid server process as part of the convert() execution.

The "Ideal" Future Code

import opendataloader_pdf

opendataloader_pdf.convert(
    input_path="my_doc.pdf",
    output_dir="output",
    hybrid="docling-fast",
    auto_start_server=True,  # 👈 The new feature (proposed, not yet implemented)
)
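
For discussion, here is a minimal sketch of the detect / start / shut down lifecycle described in the goal, written as a standalone context manager. It is not the library's implementation: the names port_is_open and managed_server are hypothetical, and the sketch assumes the hybrid server listens on 127.0.0.1 and can be treated as ready once its TCP port accepts connections. The actual server may need a different readiness check.

```python
import contextlib
import socket
import subprocess
import time


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@contextlib.contextmanager
def managed_server(command, host="127.0.0.1", port=5002, startup_timeout=60.0):
    """Hypothetical helper: start `command` unless host:port is already served.

    Yields the Popen handle if this context started the process, else None.
    """
    if port_is_open(host, port):
        # Detect: a server is already running, so reuse it and leave it alone on exit.
        yield None
        return

    # Start: launch the server as a child process.
    proc = subprocess.Popen(command)
    try:
        deadline = time.monotonic() + startup_timeout
        # Assumption: an open port means the server is ready for requests.
        while not port_is_open(host, port):
            if proc.poll() is not None:
                raise RuntimeError(f"server exited early with code {proc.returncode}")
            if time.monotonic() > deadline:
                raise TimeoutError(f"server did not open port {port} in time")
            time.sleep(0.2)
        yield proc
    finally:
        # Shut down: stop only the process started here, escalating to kill if needed.
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


# Possible usage as a workaround today (command and arguments taken from this issue):
#
#   with managed_server(["opendataloader-pdf-hybrid", "--port", "5002"], port=5002):
#       opendataloader_pdf.convert(
#           input_path="my_doc.pdf", output_dir="output", hybrid="docling-fast"
#       )
```

With auto_start_server=True, convert() could wrap its own work in something similar, so that a server that was already running is reused and never stopped, while a server started by the call is always cleaned up, even when the conversion raises.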

Labels: enhancement (New feature or request)
