CI flake: Production Docker (flex) sometimes fails with "application not healthy after 10m0s" #661

@kojiromike

What

The flex production docker job (build / Production Docker (flex)) intermittently fails with:

application not healthy after 10m0s
##[error]Process completed with exit code 1.

Container logs show mysql healthy and openemr started, but the openemr container's healthcheck never passes within the 600-second wait window.

Where it comes from

.github/actions/test-actions-core/action.yml runs:

- name: Run the containers
  run: docker compose up --detach --wait --wait-timeout 600 mysql "${OPENEMR_SERVICE_NAME}"

docker compose up --wait polls the service healthcheck and exits non-zero with that message when the timeout elapses.
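For reproducing the flake locally, the poll-until-timeout behavior can be approximated in plain shell. This is a minimal sketch, not Compose's actual implementation: `wait_until` is a hypothetical helper, and the timeout, interval, and check command are all assumptions supplied by the caller.

```shell
# Hypothetical helper (not part of the repo): poll a check command until it
# succeeds or a time budget is spent.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while true; do
    # Success as soon as the check passes once.
    if "$@"; then
      return 0
    fi
    # Budget exhausted: report and fail, like a wait timeout would.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "not healthy after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

One possible use, assuming the container ID is looked up with `docker compose ps -q`: `wait_until 600 5 sh -c '[ "$(docker inspect --format "{{.State.Health.Status}}" "$(docker compose ps -q "$OPENEMR_SERVICE_NAME")")" = healthy ]'`. Logging the elapsed time on each poll would show how close a passing run comes to the 600-second budget.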

Recent occurrence

Run https://github.com/openemr/openemr-devops/actions/runs/24732488650 on master (commit 959f246, the merge of #660). Re-running only the failed job passed without any changes, which points to a flake rather than a regression.

Why it likely flakes

The flex image runs composer install at container start. Under CI runner load (cold cache, slow mirror, contention), that can push total boot past 10 min. The wait-timeout is a budget, not a correctness check.

Suggested mitigations (pick one)

  • Raise the --wait-timeout. 20m would absorb most composer-install variance while still catching real hangs.
  • Warm the image. Move composer install into the image build (cached layer) rather than container startup, so healthcheck measures runtime boot only.
  • Retry the failed step once before failing the job. Cheapest fix, doesn't address root cause but silences the flake.
  • Split the healthcheck into cheap (apache up) + slow (installer done) signals, so compose's --wait succeeds quickly while a separate step verifies post-install state.
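The retry option can be sketched as a small shell wrapper around the existing step. This is an illustration only: `retry_once` is a hypothetical helper that does not exist in the repo, and wrapping the `docker compose up` command with it is an assumption about how the step might be changed.

```shell
# Hypothetical helper (not part of the repo): run a command, and if it fails,
# run it exactly one more time. The exit status is that of the last attempt.
retry_once() {
  if "$@"; then
    return 0
  fi
  echo "first attempt failed; retrying once: $*" >&2
  "$@"
}
```

In the action step this might read `retry_once docker compose up --detach --wait --wait-timeout 600 mysql "${OPENEMR_SERVICE_NAME}"`. Because the containers from the first attempt are probably still running, the second attempt likely behaves as an extended wait rather than a fresh boot. For the first option, raising the budget to 20 minutes would be `--wait-timeout 1200`, since the flag takes seconds.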

Priority

Low. The flake is rare and is resolved by a single re-run. Opened to document the pattern and the mitigation options so the next person who hits it doesn't have to re-derive the diagnosis.
