helm: fix vendor.dcgm.enabled: false has no effect #214

Open
shravyavorugallu wants to merge 1 commit into SlinkyProject:main from shravyavorugallu:fix-vendor-dcgm-enabled-false

@shravyavorugallu

Fixes #210

Root Cause

The vendor.dcgm.enabled helper in _helpers.tpl used Sprig's ternary function via a pipeline:

{{- ($vendor | dig "nvidia" "dcgm" "enabled" false) | ternary "true" "" -}}

When vendor.nvidia.dcgm.enabled: false is set, dig returns the Go boolean false. Sprig's ternary reflects this value and can evaluate it as truthy, so the DCGM prolog/epilog ConfigMap refs are always injected into the Controller CR.

This breaks CPU-only deployments (minikube, non-GPU nodes) because the operator tries to reconcile DCGM prolog scripts that were never created.

Fix

Replace ternary with a native Go template if block, which correctly evaluates boolean false as falsy in all Helm versions:

{{- if ($vendor | dig "nvidia" "dcgm" "enabled" false) -}}
true
{{- end -}}

The rendered output is unchanged: "true" when enabled, "" when disabled.

Fixes SlinkyProject#210

The vendor.dcgm.enabled helper used Sprig's ternary function (available in Helm templates) via a pipe:

    ($vendor | dig "nvidia" "dcgm" "enabled" false) | ternary "true" ""

ternary receives the piped value as its final argument, the condition.
When the piped value is a Go boolean false (as returned by dig when
vendor.nvidia.dcgm.enabled is explicitly set to false), Sprig's
ternary implementation reflects the value and can evaluate it as
truthy, causing the DCGM prolog/epilog configmap refs to always be
injected into the Controller CR regardless of the setting.

This breaks CPU-only clusters (minikube, non-GPU nodes) because the
operator tries to mount DCGM prolog scripts that were never created.

Fix: replace ternary with a native Go template if block. The native if
correctly evaluates boolean false as falsy in all Helm versions:

    {{- if ($vendor | dig "nvidia" "dcgm" "enabled" false) -}}
    true
    {{- end -}}

The rendered output of vendor.dcgm.enabled is unchanged (returns
"true" when enabled, "" when disabled), so all callsites in
controller-cr.yaml and the configmap templates are unaffected.

Signed-off-by: Shravya Vorugallu <shravyavorugallu@gmail.com>

@vivian-hafener

Contributor

Good morning,

Your issue indicates that this bug is present on v1.1.0, but the root cause you present as problematic is only present on main. On v1.1.0, the logic is as follows:

{{/*
Check if DCGM integration is enabled
*/}}
{{- define "vendor.dcgm.enabled" -}}
{{- .Values.vendor.nvidia.dcgm.enabled -}}
{{- end }}

The logic in v1.1.0 was broken, which did result in vendor.dcgm.enabled: false having no effect. However, I believe that this was resolved by f1786e4 on main.
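
To illustrate why a helper that prints the value directly can behave this way, here is a minimal sketch using only Go's standard text/template package. It assumes callers test the helper's rendered string for truthiness (for example with `if include ...`), which is an assumption about the callsites rather than something quoted from the chart. The names renderHelper, truthy, printedBool and guarded are hypothetical, and `.enabled` is a simplified stand-in for `.Values.vendor.nvidia.dcgm.enabled`.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// printedBool mimics the v1.1.0 helper body: it prints the value as-is.
const printedBool = `{{- .enabled -}}`

// guarded mimics the helper body proposed in this PR: it prints "true" or nothing.
const guarded = `{{- if .enabled -}}true{{- end -}}`

// renderHelper executes a helper body and returns the rendered string,
// which is the kind of value a caller gets back from include.
func renderHelper(body string, enabled bool) string {
	tmpl := template.Must(template.New("helper").Parse(body))
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, map[string]any{"enabled": enabled}); err != nil {
		panic(err)
	}
	return buf.String()
}

// truthy reports how a Go template `if` treats the rendered string:
// any non-empty string counts as true.
func truthy(s string) bool {
	ok, _ := template.IsTrue(s)
	return ok
}

func main() {
	for _, enabled := range []bool{true, false} {
		for name, body := range map[string]string{"printedBool": printedBool, "guarded": guarded} {
			out := renderHelper(body, enabled)
			fmt.Printf("enabled=%v helper=%s rendered=%q truthy=%v\n", enabled, name, out, truthy(out))
		}
	}
}
```

With enabled set to false, the printed-value form renders the non-empty string "false", which a template `if` treats as true, while the guarded form renders an empty string.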

If you can confirm that that commit does resolve the behavior that you are seeing, I can look into backporting that commit to release-1.0 and release-1.1.

Best,
Vivian Hafener

vivian-hafener requested review from wickberg and removed request for wickberg on June 16, 2026 16:09
@shravyavorugallu

shravyavorugallu commented Jun 18, 2026 via email

Author

Development

Successfully merging this pull request may close these issues.

[BUG]: vendor.nvidia.dcgm.enabled: false has no effect — DCGM prolog always wired into prologScriptRefs

2 participants