[opt][qir] Lower !cc.measure_handle through ConvertToQIRAPI#4404

Merged
khalatepradnya merged 7 commits into NVIDIA:main from khalatepradnya:pkhalate/measure-handle-pr2-qir-conversion
Apr 30, 2026

@khalatepradnya

Collaborator

Summary

  • Lower !cc.measure_handle to its i64 payload through --convert-to-qir-api's existing TypeConverter, completing the IR-side of the cudaq::measure_handle feature.
  • Builds on [cc] Measure handle type #4403.

Motivation

#4403 introduced !cc.measure_handle as IR vocabulary; nothing yet routes it to QIR. This PR adds the converter rule plus boundary bridging on quake.mz (which still calls a QIR function returning Result*) and quake.discriminate (which still consumes Result*), so handle-form kernels reach the QIR pipeline as i64 payloads through the same TypeConverter machinery the rest of QIR conversion already uses.

What Changed

  • QIRAPITypeConverter gains three addConversion rules: !cc.measure_handle -> i64, plus recursive descent through !cc.array<...> and !cc.stdvec<...> so container-shaped function signatures, allocations, and pointers see consistent post-conversion element types. The !cc.ptr<...> recursion was already in place.
  • MeasurementOpPattern detects when the original quake.mz produced a handle (its measOut is !cc.measure_handle) and emits a cc.cast Result* -> i64 so downstream uses see the converted payload. The cast is materialized in the mz call's block, ahead of the optional terminator-relative insertion point used for record-output, so it dominates downstream quake.discriminate uses.
  • DiscriminateOpToCallRewrite mirrors this on the read side: when the post-conversion operand is integer-typed it emits cc.cast i64 -> Result* before delegating to the existing read-result lowering. In the full-QIR (!discriminateToClassical) branch the bridge cast and the inner double-cast fold against each other, leaving a single cc.cast i64 -> !cc.ptr<i1> + cc.load.
  • ExpandMeasurements accepts !cc.measure_handle alongside !quake.measure in usesIndividualQubit, so single-qubit handle measurements aren't rewritten as registers.
  • Predicate rename: the misnamed hasQuakeType is now needsTypeConversion. Its leaf check now includes !cc.measure_handle, and its recursion now descends through !cc.array and !cc.stdvec. The old name was incorrect: the predicate has always reported "this op carries a type the converter rewrites," not "this op carries a Quake type."
  • Test: test/Transforms/qir_api_measure_handle.qke covering scalar handle measurement + discriminate, function signature with handle parameter and return, cc.alloca of a scalar handle, static- and dynamic-size arrays of handles, cc.stdvec<!cc.measure_handle> in a function signature, cc.indirect_callable<() -> !cc.measure_handle>, and a no-handle negative.
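
The converter rules listed above can be pictured with a small toy model. This is a sketch in Python, not the C++ `QIRAPITypeConverter`: the string/tuple encoding of types and the `convert_type` helper are hypothetical names chosen for illustration, and the real converter also rewrites Quake types that this sketch ignores.

```python
# Toy model of the handle rewrite. Types are strings (leaves) or tuples
# (containers). This encoding is an assumption for illustration and has
# nothing to do with MLIR's real type classes.

def convert_type(ty):
    """Rewrite !cc.measure_handle to i64, descending through containers."""
    if ty == "!cc.measure_handle":
        return "i64"                       # leaf rule: handle -> i64 payload
    if isinstance(ty, tuple):
        kind = ty[0]
        if kind in ("ptr", "stdvec"):      # ("ptr", elem) / ("stdvec", elem)
            return (kind, convert_type(ty[1]))
        if kind == "array":                # ("array", elem, size)
            return (kind, convert_type(ty[1]), ty[2])
    return ty                              # anything else is left untouched


scalar = convert_type("!cc.measure_handle")
vec = convert_type(("stdvec", "!cc.measure_handle"))
nested = convert_type(("ptr", ("array", "!cc.measure_handle", 4)))
# A struct wrapper is deliberately not handled, so the handle survives.
untouched = convert_type(("struct", "!cc.measure_handle"))
```

In this model a handle nested inside a `struct` tuple passes through unchanged, which is the same shape of gap the Risks section describes for container types outside `cc.array` and `cc.stdvec`.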

Risks

  • cc.loop iter-args carrying !cc.measure_handle are not exercised by the conversion's region-aware patterns. Low immediate risk because no current frontend or test produces such IR; flagged as a follow-up.
  • Container types beyond cc.array/cc.stdvec (e.g., a cc.struct with a handle field) are not in the converter's recursion. None of the current frontends produce these; not a regression vs. the prototype.

Downstream Impact

  • CUDA-QX: none.
  • Public API: none.
  • Stack: the next PR adds C++/Python frontend bindings that produce handle-form IR, which this PR now correctly routes.

Adds the IR alias of `cudaq::measure_handle`: an opaque, word-sized
classical type whose only meaningful consumer is `quake.discriminate`.
Registers the type in the CC dialect, lowers it to `i64` in the
CC->LLVM type converter, and extends `cc.cast`'s verifier to permit
`i64 <-> !cc.measure_handle` (no-op casts modeling the i64 payload).

The type carries no payload yet -- ODS widening of the Quake
measurement / DiscriminateOp signatures lands in the next commit, and
the QIR conversion plus C++/Python frontend bindings land in follow-up
PRs. Round-trip coverage is appended to
`test/Transforms/roundtrip-ops.qke`.

Made-with: Cursor
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
ODS: the result of `quake.mz`/`mx`/`my` and the operand of
`quake.discriminate` may now be `!cc.measure_handle` (single qubit) or
`!cc.stdvec<!cc.measure_handle>` (register), in addition to the
existing `!quake.measure` / `!cc.stdvec<!quake.measure>` forms. The
two forms are interchangeable from the compiler's point of view; the
handle form is emitted by the bridge for `cudaq::measure_handle`
callers and is lowered to `i64` by `--convert-to-qir-api`'s
`TypeConverter` in a follow-up PR.

Verifier: `verifyMeasurements` and `DiscriminateOp::verify` accept
either classical type. The arity diagnostic now references both
spellings so users see why a scalar-typed result is rejected when
measuring a register.
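
The arity rule described above can be sketched as a toy check. This is Python for illustration only, not the C++ `verifyMeasurements`; the function name, its arguments, and the diagnostic wording are all hypothetical, and only the "scalar result on a register" rejection is taken from the text.

```python
SCALAR_TYPES = {"!quake.measure", "!cc.measure_handle"}
VECTOR_TYPES = {"!cc.stdvec<!quake.measure>", "!cc.stdvec<!cc.measure_handle>"}


def verify_measurement(measures_register, result_type):
    """Return None when the result type is accepted, else a diagnostic string."""
    if result_type not in SCALAR_TYPES | VECTOR_TYPES:
        return f"unsupported measurement result type: {result_type}"
    if measures_register and result_type in SCALAR_TYPES:
        # Hypothetical wording; the point is that both spellings are named.
        return ("measuring a register must produce "
                "!cc.stdvec<!quake.measure> or !cc.stdvec<!cc.measure_handle>")
    return None


scalar_ok = verify_measurement(False, "!cc.measure_handle")
register_ok = verify_measurement(True, "!cc.stdvec<!cc.measure_handle>")
register_bad = verify_measurement(True, "!cc.measure_handle")
```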

No existing measurement / discriminate FileCheck tests reference
the previous diagnostic strings.

Made-with: Cursor
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
@khalatepradnya khalatepradnya force-pushed the pkhalate/measure-handle-pr2-qir-conversion branch from 84927ce to 9c2374a Compare April 28, 2026 19:16
Lower `!cc.measure_handle` -- the IR alias of `cudaq::measure_handle` --
to its `i64` payload as part of the existing QIR conversion, rather than
through a separate imperative pre-pass. The handle is semantically an
opaque integer-shaped measurement token, so it is registered like any
other type rewrite on `QIRAPITypeConverter` and the existing
`MeasurementOpPattern` / `DiscriminateOpToCallRewrite` patterns handle
the `Result*` <-> `i64` bridging at the op boundary.

Concretely:

  * `QIRAPITypeConverter` gains three `addConversion` rules:
    `!cc.measure_handle -> i64`, plus recursive descent through
    `!cc.array<...>` and `!cc.stdvec<...>` so container-shaped function
    signatures, allocations, and pointers see consistent post-conversion
    element types. The recursion through `!cc.ptr` was already in place.

  * `MeasurementOpPattern` (in both `mzReturnsResultType` branches)
    detects when the original `quake.mz` produced a handle and emits a
    `cc.cast Result* -> i64` so downstream uses see the converted
    payload. The cast is materialized in the mz call's block, ahead of
    the optional terminator-relative insertion point used for record
    output, so it dominates downstream discriminate uses.

  * `DiscriminateOpToCallRewrite` mirrors this on the read side: when
    the (post-conversion) operand is integer-typed it emits a
    `cc.cast i64 -> Result*` before delegating to the existing
    read-result lowering. In the full-QIR (`!discriminateToClassical`)
    branch the bridge cast and the inner double-cast fold against each
    other, which is the desired behavior.

  * `ExpandMeasurements` accepts `!cc.measure_handle` alongside
    `!quake.measure` in `usesIndividualQubit` so single-qubit handle
    measurements are not rewritten as registers.

  * `hasQuakeType` is renamed `needsTypeConversion` and extended to
    treat `!cc.measure_handle` as a leaf that requires conversion, and
    to descend through `!cc.array` and `!cc.stdvec` (matching the new
    converter rules). The old name was a misnomer: the predicate has
    always reported "this op carries a type the converter rewrites,"
    not "this op carries a Quake type."
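
The bridge casts in the two patterns above compose in a simple way, which the following toy model tries to show. It is a Python sketch under stated assumptions: casts are modeled as plain `(source, destination)` type pairs, `fold_cast_chain` is a hypothetical helper, and the real folding is done by the compiler's own `cc.cast` canonicalization, which may differ in detail.

```python
def fold_cast_chain(casts):
    """Collapse casts [(src, dst), ...] applied in order into at most one cast."""
    if not casts:
        return []
    # Adjacent casts must agree on the intermediate type.
    for (_, dst), (src, _) in zip(casts, casts[1:]):
        if dst != src:
            raise ValueError("cast chain is not type-consistent")
    first_src, last_dst = casts[0][0], casts[-1][1]
    # A chain that ends where it started is a no-op and disappears.
    return [] if first_src == last_dst else [(first_src, last_dst)]


# Assumed shape: producer-side bridge followed by consumer-side bridge.
round_trip = fold_cast_chain([("Result*", "i64"), ("i64", "Result*")])
# Full-QIR read path from the PR description: bridge cast, then bool pointer.
read_path = fold_cast_chain([("i64", "Result*"), ("Result*", "!cc.ptr<i1>")])
```

Only the second shape, a single `i64 -> !cc.ptr<i1>` cast remaining, is stated in the PR description; the round-trip case is an assumption about how such a pair would compose.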

A focused FileCheck test exercises scalar handle measurement and
discriminate, function signatures (including parameter, return,
indirect-callable, and stdvec forms), and allocations of static and
dynamic handle-typed arrays.

This replaces the prototype's `lower-cc-measure-handle` pass: it is
fewer lines, removes the matching ODS widening on `quake.mz` /
`quake.discriminate`, and routes through the canonical MLIR
`TypeConverter` pipeline that the rest of QIR conversion already uses.

Made-with: Cursor
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
@khalatepradnya khalatepradnya force-pushed the pkhalate/measure-handle-pr2-qir-conversion branch from 9c2374a to e64d810 Compare April 28, 2026 19:16
@khalatepradnya khalatepradnya changed the title [opt][qir] Lower !cc.measure_handle through ConvertToQIRAPI (builds on #4403) [opt][qir] Lower !cc.measure_handle through ConvertToQIRAPI (builds on #4403) Apr 28, 2026
@schweitzpgi

Collaborator

Heads-up: we may want to do the LLVM-22 rebase first. Not sure.

khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request Apr 29, 2026
Mechanical migration of AST-Quake lit CHECKs from the legacy
`!quake.measure` / `!cc.stdvec<!quake.measure>` shape to the
spec-mandated `!cc.measure_handle` / `!cc.stdvec<!cc.measure_handle>`
shape produced by the bridge after PR 3b Commit 1 (mz / mx / my emit
handles directly, deferring discrimination to bool-conversion sites).

Affects 37 tests across measurement, control-flow, vector, tuple,
ctor, indirect_callable, separate_compilation, qalloc_state, and
cudaq_run scenarios. CHECKs that previously matched the bool-fold
synthesized for `auto x = mz(qview)` (i1 byte buffer + discriminate
+ cast + store) are dropped where the consumer is absent: the new
expand-measurements behavior synthesizes the bool buffer only when a
discriminate consumer is present (PR 3a), so unused vector
measurements now produce just the per-element handle ops.

Adjacent fixups:
- test/AST-Quake/cudaq_run.cpp: drop the stale
  `--lower-cc-measure-handle` cudaq-opt pass from the RUN line. Under
  Option C (PR NVIDIA#4404) the !cc.measure_handle conversion is folded
  into ConvertToQIRAPI's TypeConverter; the standalone pass was
  prototype-only and no longer exists.
- test/AST-Quake/base_profile-1.cpp -> test/AST-Quake/qir_profiles.cpp:
  rename for aptness (file already exercises BASE, ADAPT, and FULL
  QIR profiles, not just the base profile). While here, remove six
  stale `read_result__body` ADAPT CHECKs: the legacy bridge always
  discriminated `auto x = mz(q)`, but the spec API does not, so unused
  handles produce no `read_result__body`. The mz / record_output pairs
  remain.

Source code (the C++ kernels themselves) is untouched; this is a
pure CHECK-line migration. AST-Quake suite returns to 109/109 pass
(plus 2 pre-existing XFAIL).

Spec: cudaq-spec/proposals/measure_handle.bs.
Made-with: Cursor
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request Apr 29, 2026
Two test additions guarding the spec-aligned single-API surface
(cudaq-spec/proposals/measure_handle.bs).

test/AST-Quake/measure_handle.cpp -- replace the single
CrossFunctionCaller smoke test with nine focused bridge cases:

  1. ScalarHandle:    mz(qubit) -> !cc.measure_handle, no inlined
                      discriminate when the result is unused.
  2. ScalarBool:      return-as-bool inserts cc.alloca/store/load +
                      single quake.discriminate.
  3. DirectIf:        if (mz(q)) ... discriminates at the cc.if cond.
  4. RegisterHandle:  mz(qvec) -> !cc.stdvec<!cc.measure_handle>,
                      no inlined discriminate.
  5. RegisterBools:   to_bools(mz(qvec)) lowers to the vectorized
                      quake.discriminate consuming a stdvec of handles
                      and producing !cc.stdvec<i1>.
  6. MxMyHandles:     mx / my parity with mz on the scalar form.
  7. CrossFunctionCaller (kept): pure-device passage of a
                      const measure_handle& through cc.alloca/cc.store
                      and a !cc.ptr<!cc.measure_handle> parameter.
  8. HandleEquality:  h1 == h2 discriminates each operand independently
                      and compares the resulting i1 values -- the semantics
                      are outcome equality, not handle identity.
  9. HandleAnd:       short-circuit && places the second mz +
                      discriminate inside the cc.if else branch of the
                      short-circuit lowering.

Together these lock in the bool-conversion context coverage required
by the spec (Operational Semantics: discriminate at every implicit
bool coercion: assignment, return, if/while/for, !, ==, &&, ||).
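
The "discriminate at every bool coercion" rule and the outcome-equality semantics of case 8 can be mimicked with a toy class. This is a Python sketch, not the CUDA-Q API: `ToyHandle` and its fields are hypothetical, and the stored outcome stands in for a value that would really come from the device.

```python
import itertools


class ToyHandle:
    """Opaque measurement token; `outcome` stands in for the device result."""

    _ids = itertools.count()

    def __init__(self, outcome):
        self.ident = next(ToyHandle._ids)  # which measurement this token names
        self._outcome = bool(outcome)
        self.discriminations = 0           # how often the token was read

    def __bool__(self):
        # Models quake.discriminate at a bool-coercion site.
        self.discriminations += 1
        return self._outcome

    def __eq__(self, other):
        # Each operand is discriminated on its own; the compare runs on bools.
        return bool(self) == bool(other)


unused = ToyHandle(1)                  # never coerced, so never discriminated
h1, h2 = ToyHandle(1), ToyHandle(1)
same_outcome = h1 == h2                # True although h1 and h2 are distinct
```

The `unused` handle mirrors case 1 (no discriminate when the result is unused), while the comparison mirrors case 8: two different tokens compare equal because their outcomes match.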

test/AST-Quake/measure_handle_qir.cpp -- new end-to-end regression
running cudaq-quake | cudaq-opt --expand-measurements --canonicalize
--convert-to-qir-api --symbol-dce. Two kernels exercise the two
host-device boundary shapes the spec allows:

  * ScalarReturn:  bool return from mz(qubit). After ConvertToQIRAPI
                   (Option C, PR NVIDIA#4404) the !cc.measure_handle folds to
                   i64 inside the type converter; the bridge's
                   alloca/store/load + discriminate splice survives the
                   conversion as alloca i64 / store i64 / load i64 /
                   cast i64 -> ptr<i1> / load i1.
  * VectorReturn:  std::vector<bool> from to_bools(mz(qvec)). The
                   vectorized quake.discriminate is unrolled by
                   expand-measurements into per-qubit mz + discriminate;
                   the QIR API emits the standard
                   __nvqpp_vectorCopyCtor / cc.stdvec_init heap-copy
                   prologue.

Notes
- The RUN line for measure_handle_qir.cpp drops the prototype's
  --lower-cc-measure-handle pass (does not exist under Option C).
- A loop-body CHECK uses {{.*}}!llvm.struct<"Result" instead of an
  exact -> !cc.ptr<!llvm.struct<"Result", opaque>> tail because the
  full type signature inside the loop has nested <...> brackets that
  trip FileCheck's greedy regex backtracking.

Spec: cudaq-spec/proposals/measure_handle.bs (Operational Semantics,
Kernel Signature Rule).

Made-with: Cursor
Signed-off-by: Pradnya Khalate <148914294+khalatepradnya@users.noreply.github.com>
@khalatepradnya khalatepradnya changed the title [opt][qir] Lower !cc.measure_handle through ConvertToQIRAPI (builds on #4403) [opt][qir] Lower !cc.measure_handle through ConvertToQIRAPI Apr 29, 2026
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
@khalatepradnya khalatepradnya force-pushed the pkhalate/measure-handle-pr2-qir-conversion branch from 33ed912 to e0b1876 Compare April 29, 2026 23:18
@khalatepradnya khalatepradnya marked this pull request as ready for review April 29, 2026 23:29
@khalatepradnya khalatepradnya requested review from 1tnguyen, atgeller, sacpis and schweitzpgi and removed request for schweitzpgi April 29, 2026 23:55
Comment thread lib/Optimizer/CodeGen/ConvertToQIRAPI.cpp Outdated

@1tnguyen 1tnguyen left a comment

Collaborator

LGTM 👍

@sacpis

sacpis commented Apr 30, 2026

Collaborator

Reviewing it now...

@khalatepradnya khalatepradnya added this pull request to the merge queue Apr 30, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 30, 2026
@sacpis sacpis added this pull request to the merge queue Apr 30, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 30, 2026
@khalatepradnya khalatepradnya added this pull request to the merge queue Apr 30, 2026
Merged via the queue into NVIDIA:main with commit ee3ea51 Apr 30, 2026
197 checks passed
@khalatepradnya khalatepradnya deleted the pkhalate/measure-handle-pr2-qir-conversion branch April 30, 2026 22:13
github-actions Bot pushed a commit that referenced this pull request Apr 30, 2026
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 1, 2026
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 1, 2026
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 1, 2026
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 1, 2026
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 5, 2026
…nery, and QIR conversion

Consolidates the bridge-side, type-system, and QIR-conversion work for
the measure_handle PR stack. The runtime API arrived in the previous
commit; this commit makes the AST bridge produce !cc.measure_handle
SSA values, teaches the verifier to reject handles at the host-device
boundary, fills the byte-size and marshaling gaps, and patches the QIR
conversion so handle pointer/stdvec ops survive --convert-to-qir-api.

Type-system support
- include/cudaq/Optimizer/Dialect/CC/CCTypes.h, CCTypes.cpp:
  containsMeasureHandle (value-shape check) and
  containsMeasureHandleAtBoundary (recursive into callable signatures
  and bare function types). The boundary variant is required so
  `std::function<void(measure_handle)>` and `cudaq::qkernel<...>`
  parameters are caught at entry-point classification.
- lib/Optimizer/Dialect/CC/CCOps.cpp:
  MeasureHandleType case in getByteSizeOfType returning a constant
  8 bytes, the IR-mode width of a class with a single std::int64_t
  field. Without this, a pure-device kernel returning
  std::vector<measure_handle> aborts in ConvertStmt.cpp with
  "unhandled vector element type" because __nvqpp_vectorCopyCtor
  cannot get a constant element size for the heap-copy prologue.
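
The 8-byte figure can be sanity-checked outside the compiler. The sketch below uses Python's `ctypes` to model a class with a single 64-bit integer field; the class name, the field name `payload`, and the `heap_copy_bytes` helper are hypothetical and only illustrate why a constant element size is enough for a vector heap copy.

```python
import ctypes


class ToyMeasureHandle(ctypes.Structure):
    # Assumed layout: one signed 64-bit integer field, as described above.
    _fields_ = [("payload", ctypes.c_int64)]


element_size = ctypes.sizeof(ToyMeasureHandle)


def heap_copy_bytes(num_elements, elem_size=element_size):
    """Bytes a vector copy would need for `num_elements` handles."""
    return num_elements * elem_size


three_handles = heap_copy_bytes(3)
```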

AST bridge
- lib/Frontend/nvqpp/ConvertType.cpp, ConvertDecl.cpp:
  cudaq::measure_handle maps to !cc.measure_handle;
  std::vector<measure_handle> is recognised in measurement
  register-name handling.
- lib/Frontend/nvqpp/ConvertExpr.cpp: the central rewire.
  * mz / mx / my emit !cc.measure_handle (scalar) or
    !cc.stdvec<!cc.measure_handle> (range/variadic) directly.
  * CK_UserDefinedConversion at measure_handle::operator bool inserts
    quake.discriminate at every spec-mandated bool-coercion site.
  * The discriminate-insertion path runs an isBoundHandle check that
    walks through cc.compute_ptr / cc.cast to the base alloca and,
    on the scalar-handle alloca shape, requires that a binding store
    dominate the load (mlir::DominanceInfo, computed lazily once per
    coercion site). Conditional-store shapes that previously emitted
    a discriminate over an uninitialized i64 payload now diagnose.
  * cudaq::to_bools is intercepted by name and lowered to a vectorized
    quake.discriminate on the entire handle stdvec; it is the bulk
    counterpart to operator bool.
  * cudaq::to_integer rejects vector<measure_handle> with a
    spec-named diagnostic (per measure_handle.bs §C++ API): the
    silent auto-insert that hid the bulk-discrimination API is gone.
  * measure_handle copy/move construction and operator= are
    intercepted as value-typed aliasing of the sub-i64 stack value;
    chained `h3 = h2 = h;` works because the dispatch drops the
    callee value the visitor pushed.
  * default-construct produces only the storage slot (cc.alloca);
    VisitVarDecl binds it directly so any read at a discriminate
    site is statically diagnosed by the unbound-handle path.
- lib/Frontend/nvqpp/ASTBridge.cpp: __qpu__ entry-point classification
  rejects functor operator() shapes whose signature transitively
  mentions measure_handle, the only disambiguable spec violation at
  AST time.
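
The "binding store must dominate the load" condition used by the bound-handle check can be illustrated on a tiny control-flow graph. This is a Python sketch with a textbook iterative dominator computation, not the `mlir::DominanceInfo` the bridge uses; the block names and the `is_bound` helper are hypothetical, and ordering of a store and load inside one block is not modeled.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {block: [successors]}."""
    blocks = set(cfg)
    preds = {b: [p for p in blocks if b in cfg[p]] for b in blocks}
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = set.intersection(*incoming) if incoming else set()
            new = new | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def is_bound(cfg, entry, store_block, load_block):
    """True when the block holding the store dominates the block of the load."""
    return store_block in dominators(cfg, entry)[load_block]


# Shape of `if (cond) { h = mz(q); }` followed by a read of h after the merge.
cfg = {"entry": ["then", "merge"], "then": ["merge"], "merge": []}
conditional_store = is_bound(cfg, "entry", "then", "merge")
unconditional_store = is_bound(cfg, "entry", "entry", "merge")
```

In this model the conditional store does not dominate the read, which corresponds to the shapes the commit says are now diagnosed instead of discriminating an uninitialized payload.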

Marshaling and QIR conversion
- lib/Optimizer/Builder/Marshal.cpp:
  hasLegalType extends the entry-point predicate to reject
  measure_handle alongside qubit-typed parameters/results.
  lookupHostEntryPointFunc early-returns for device-only kernels
  whose signature cannot cross the host boundary, so the host-side
  rewriter skips them entirely.
- lib/Optimizer/CodeGen/ConvertToQIRAPI.cpp:
  The TypeConverter rewrites !cc.measure_handle to i64, but
  cc.compute_ptr / cc.stdvec_data / cc.stdvec_init / cc.stdvec_size
  carrying handle pointer or stdvec types had no patterns and no
  dynamic-legality predicates, so the framework left them
  legal-by-default and inserted unrealized_conversion_casts that
  applyPartialConversion could not resolve. Add OpInterfacePattern
  instantiations and extend the legality predicate so all four ops
  participate in the same operand/result-type rewrite the existing
  CC pointer ops already use.
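
The legality idea in the last item can be sketched as follows. This is a Python toy, not the conversion framework: ops are plain dictionaries, types use a made-up string/tuple encoding, and `needs_type_conversion` and `is_legal` are hypothetical stand-ins that only know about measure handles.

```python
def needs_type_conversion(ty):
    """True when the type is a handle or a known container holding one."""
    if ty == "!cc.measure_handle":
        return True
    if isinstance(ty, tuple) and ty[0] in ("ptr", "array", "stdvec"):
        return needs_type_conversion(ty[1])
    return False


def is_legal(op):
    """An op is legal once none of its operand or result types need rewriting."""
    return not any(needs_type_conversion(t)
                   for t in op["operands"] + op["results"])


before = {"name": "cc.stdvec_size",
          "operands": [("stdvec", "!cc.measure_handle")],
          "results": ["i64"]}
after = {"name": "cc.stdvec_size",
         "operands": [("stdvec", "i64")],
         "results": ["i64"]}
legal_before = is_legal(before)
legal_after = is_legal(after)
```

Without such a predicate the op in `before` would count as legal by default and keep its unconverted operand type, which is the situation the commit message describes.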

LLVM 22 idiom
- All 15 op-creation sites added by this commit in ConvertExpr.cpp
  use the LLVM 22 form Op::create(builder, loc, ...). Two of those
  sites are arith::ConstantIntOp poison-result fallbacks (the
  unbound-handle and to_integer-rejection paths) and additionally
  use the LLVM 22 (builder, loc, type, value) signature.

Tests, runtime helpers, and docs follow as the next commit in this
PR. The dialect type itself (PR NVIDIA#4403) and the QIR conversion's
TypeConverter entry for !cc.measure_handle (PR NVIDIA#4404) are already on
main.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
cketcham2333 pushed a commit to cketcham2333/cuda-quantum that referenced this pull request May 7, 2026
…VIDIA#4405)

## Summary
* Extend `--expand-measurements` to scalarize `quake.mz`/`mx`/`my` `%veq
-> !cc.stdvec<!cc.measure_handle>`.
* Builds on NVIDIA#4404
* No source-language or runtime API change.

## Motivation
The pass previously hardcoded `!quake.measure` for per-element output
and only handled `quake.discriminate` consumers of the vector result.
Handle-typed vector measurements require per-element
`!cc.measure_handle` output and can flow to non-discriminate consumers
(returns, stores, calls), neither of which the legacy
`vector<bool>`-only rewrite supported.

## What Changed
- `ExpandRewritePattern` tracks the input stdvec's element type and
emits per-element measurements of the matching type (`!quake.measure` or
`!cc.measure_handle`).
- Consumers are classified as discriminate or non-discriminate. For handle
inputs, the buffer serving each consumer class is allocated only when that
class is present; legacy `!cc.stdvec<!quake.measure>` inputs always allocate
the i1 buffer so existing AST-Quake CHECK lines stay stable.
- Original op is replaced via `replaceOp` (atomic) instead of `eraseOp`,
so partial conversion does not try to re-legalize downstream
`func.return` consumers.
- New lit test `test/Transforms/expand_measurements_handle.qke` covers
handle stdvec with each consumer class (return-only, discriminate-only,
mixed, `cc.store`), mixed `ref + veq` operands, and `mx`/`my` parity.
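
The buffer-allocation rule in the bullets above can be sketched as a small decision function. This is a standalone model, not the pass itself: `ElemKind`, `Consumer`, `BufferPlan`, and `planBuffers` are hypothetical names, and the rule that legacy inputs never get a handle buffer is an assumption inferred from the Motivation text, which says the legacy rewrite only served `quake.discriminate` consumers.

```cpp
#include <vector>

// Hypothetical stand-ins for the two stdvec element types named above.
enum class ElemKind { QuakeMeasure, MeasureHandle };

// Hypothetical classification of one user of the vector measurement result.
enum class Consumer { Discriminate, Return, Store, Call };

struct BufferPlan {
  bool allocBoolBuffer = false;   // i1 buffer, read by discriminate consumers
  bool allocHandleBuffer = false; // handle buffer, read by all other consumers
};

// Decide which per-element buffers a scalarized measurement needs.
inline BufferPlan planBuffers(ElemKind elem,
                              const std::vector<Consumer> &users) {
  bool sawDiscriminate = false;
  bool sawOther = false;
  for (Consumer c : users) {
    if (c == Consumer::Discriminate)
      sawDiscriminate = true;
    else
      sawOther = true;
  }

  BufferPlan plan;
  if (elem == ElemKind::QuakeMeasure) {
    // Legacy input: always allocate the i1 buffer so existing output is
    // unchanged. Assumption: no handle buffer exists on this path.
    plan.allocBoolBuffer = true;
    return plan;
  }
  // Handle input: allocate a buffer only when its consumer class is present.
  plan.allocBoolBuffer = sawDiscriminate;
  plan.allocHandleBuffer = sawOther;
  return plan;
}
```

Under these assumptions, a handle vector that is only returned gets a handle buffer and no i1 buffer, while a legacy vector gets the i1 buffer even when it has no consumers.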

---------

Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 7, 2026
…VIDIA#4405)

khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 7, 2026
…nery, and QIR conversion

Consolidates the bridge-side, type-system, and QIR-conversion work for
the measure_handle PR stack. The runtime API arrived in the previous
commit; this commit makes the AST bridge produce !cc.measure_handle
SSA values, teaches the verifier to reject handles at the host-device
boundary, fills the byte-size and marshaling gaps, and patches the QIR
conversion so handle pointer/stdvec ops survive --convert-to-qir-api.

Type-system support
- include/cudaq/Optimizer/Dialect/CC/CCTypes.h, CCTypes.cpp:
  containsMeasureHandle (value-shape check) and
  containsMeasureHandleAtBoundary (recursive into callable signatures
  and bare function types). The boundary variant is required so
  `std::function<void(measure_handle)>` and `cudaq::qkernel<...>`
  parameters are caught at entry-point classification.
- lib/Optimizer/Dialect/CC/CCOps.cpp:
  MeasureHandleType case in getByteSizeOfType returning a constant
  8 bytes, the IR-mode width of a class with a single std::int64_t
  field. Without this, a pure-device kernel returning
  std::vector<measure_handle> aborts in ConvertStmt.cpp with
  "unhandled vector element type" because __nvqpp_vectorCopyCtor
  cannot get a constant element size for the heap-copy prologue.
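
The difference between the two predicates, and the constant byte size, can be illustrated on a toy type tree. This is a sketch under stated assumptions, not the CCTypes.cpp code: `Type`, `make`, `valueShapeHasHandle`, `boundaryHasHandle`, and `byteSize` are hypothetical names, and the assumption that the value-shape check stops at callables is inferred from the text saying the boundary variant is needed to catch them.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

struct Type;
using TypePtr = std::shared_ptr<const Type>;

// Toy type tree. For Callable, children holds the signature's types.
struct Type {
  enum Kind { I64, MeasureHandle, Ptr, StdVec, Callable } kind;
  std::vector<TypePtr> children;
};

inline TypePtr make(Type::Kind k, std::vector<TypePtr> kids = {}) {
  return std::make_shared<const Type>(Type{k, std::move(kids)});
}

inline bool hasHandle(const TypePtr &t, bool lookIntoCallables) {
  if (t->kind == Type::MeasureHandle)
    return true;
  // Assumption: the value-shape check does not descend into signatures.
  if (t->kind == Type::Callable && !lookIntoCallables)
    return false;
  for (const TypePtr &c : t->children)
    if (hasHandle(c, lookIntoCallables))
      return true;
  return false;
}

// Value-shape check: handles reachable through data-carrying types only.
inline bool valueShapeHasHandle(const TypePtr &t) {
  return hasHandle(t, /*lookIntoCallables=*/false);
}

// Boundary check: also recurses into callable signatures.
inline bool boundaryHasHandle(const TypePtr &t) {
  return hasHandle(t, /*lookIntoCallables=*/true);
}

// Constant byte size where one is known. The 8-byte handle width comes from
// the text above; sizes of other kinds are left out of this sketch.
inline std::optional<std::uint64_t> byteSize(const TypePtr &t) {
  if (t->kind == Type::I64 || t->kind == Type::MeasureHandle)
    return 8;
  return std::nullopt;
}
```

In this model a callable whose signature mentions a handle passes the value-shape check but is flagged by the boundary check, which is the case the `std::function<void(measure_handle)>` remark describes.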

AST bridge
- lib/Frontend/nvqpp/ConvertType.cpp, ConvertDecl.cpp:
  cudaq::measure_handle maps to !cc.measure_handle;
  std::vector<measure_handle> is recognised in measurement
  register-name handling.
- lib/Frontend/nvqpp/ConvertExpr.cpp: the central rewire.
  * mz / mx / my emit !cc.measure_handle (scalar) or
    !cc.stdvec<!cc.measure_handle> (range/variadic) directly.
  * CK_UserDefinedConversion at measure_handle::operator bool inserts
    quake.discriminate at every spec-mandated bool-coercion site.
  * The discriminate-insertion path runs an isBoundHandle check that
    walks through cc.compute_ptr / cc.cast to the base alloca and,
    on the scalar-handle alloca shape, requires that a binding store
    dominate the load (mlir::DominanceInfo, computed lazily once per
    coercion site). Conditional-store shapes that previously emitted
    a discriminate over an uninitialized i64 payload now diagnose.
  * cudaq::to_bools is intercepted by name and lowered to a vectorized
    quake.discriminate on the entire handle stdvec; it is the bulk
    counterpart to operator bool.
  * cudaq::to_integer rejects vector<measure_handle> with a
    spec-named diagnostic (per measure_handle.bs §C++ API): the
    silent auto-insert that hid the bulk-discrimination API is gone.
  * measure_handle copy/move construction and operator= are
    intercepted as value-typed aliasing of the sub-i64 stack value;
    chained `h3 = h2 = h;` works because the dispatch drops the
    callee value the visitor pushed.
  * default-construct produces only the storage slot (cc.alloca);
    VisitVarDecl binds it directly so any read at a discriminate
    site is statically diagnosed by the unbound-handle path.
- lib/Frontend/nvqpp/ASTBridge.cpp: __qpu__ entry-point classification
  rejects functor operator() shapes whose signature transitively
  mentions measure_handle, the only disambiguable spec violation at
  AST time.
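
The dominance requirement in the isBoundHandle bullet can be illustrated with a toy control-flow graph. This is a minimal sketch, assuming a hand-rolled iterative dominator computation in place of mlir::DominanceInfo; `Preds`, `Position`, `computeDominators`, and `storeDominatesLoad` are hypothetical names and do not appear in the bridge.

```cpp
#include <cstddef>
#include <vector>

// Toy CFG: block 0 is the entry; preds[b] lists the predecessors of block b.
using Preds = std::vector<std::vector<std::size_t>>;

// Location of an operation: its block and its index inside that block.
struct Position {
  std::size_t block;
  std::size_t index;
};

// Returns dom where dom[b][d] == true means block d dominates block b.
inline std::vector<std::vector<bool>> computeDominators(const Preds &preds) {
  const std::size_t n = preds.size();
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true; // the entry block is dominated only by itself
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors.
      std::vector<bool> next(n, !preds[b].empty());
      for (std::size_t p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true; // every block dominates itself
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// True when the binding store is guaranteed to execute before the load.
inline bool storeDominatesLoad(const Preds &preds, Position store,
                               Position load) {
  if (store.block == load.block)
    return store.index < load.index;
  return computeDominators(preds)[load.block][store.block];
}
```

With blocks 0 (entry), 1 (a conditional branch body), and 2 (the merge block), a store placed in block 1 does not dominate a load in block 2, which corresponds to the conditional-store shape the bullet says is now diagnosed. A store in block 0 dominates the same load.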

Marshaling and QIR conversion
- lib/Optimizer/Builder/Marshal.cpp:
  hasLegalType extends the entry-point predicate to reject
  measure_handle alongside qubit-typed parameters/results.
  lookupHostEntryPointFunc early-returns for device-only kernels
  whose signature cannot cross the host boundary, so the host-side
  rewriter skips them entirely.
- lib/Optimizer/CodeGen/ConvertToQIRAPI.cpp:
  The TypeConverter rewrites !cc.measure_handle to i64, but
  cc.compute_ptr / cc.stdvec_data / cc.stdvec_init / cc.stdvec_size
  carrying handle pointer or stdvec types had no patterns and no
  dynamic-legality predicates, so the framework left them
  legal-by-default and inserted unrealized_conversion_casts that
  applyPartialConversion could not resolve. Add OpInterfacePattern
  instantiations and extend the legality predicate so all four ops
  participate in the same operand/result-type rewrite the existing
  CC pointer ops already use.
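
The interaction between the type rewrite and a dynamic-legality predicate can be sketched on a toy model. This is an illustration under stated assumptions, not the ConvertToQIRAPI.cpp code: `Type`, `Op`, `convertType`, and `isLegal` are hypothetical names, and the rule "an op is legal once all of its operand and result types are unchanged by conversion" is a common dialect-conversion convention assumed here rather than quoted from the patch.

```cpp
#include <memory>
#include <utility>
#include <vector>

struct Type;
using TypePtr = std::shared_ptr<const Type>;

// Toy type: scalar kinds have a null element; Ptr and StdVec wrap one type.
struct Type {
  enum Kind { I64, MeasureHandle, Ptr, StdVec } kind;
  TypePtr element;
};

inline TypePtr scalar(Type::Kind k) {
  return std::make_shared<const Type>(Type{k, nullptr});
}
inline TypePtr wrap(Type::Kind k, TypePtr e) {
  return std::make_shared<const Type>(Type{k, std::move(e)});
}

inline bool sameType(const TypePtr &a, const TypePtr &b) {
  if (!a || !b)
    return a == b; // equal only when both are null
  return a->kind == b->kind && sameType(a->element, b->element);
}

// Rewrite the handle to i64 and recurse through container types.
inline TypePtr convertType(const TypePtr &t) {
  if (t->kind == Type::MeasureHandle)
    return scalar(Type::I64);
  if (t->element)
    return wrap(t->kind, convertType(t->element));
  return t;
}

// Toy op: only the types it carries matter for legality.
struct Op {
  std::vector<TypePtr> operandTypes;
  std::vector<TypePtr> resultTypes;
};

// Assumed legality rule: every carried type is a fixed point of convertType.
inline bool isLegal(const Op &op) {
  auto fixed = [](const TypePtr &t) { return sameType(t, convertType(t)); };
  for (const TypePtr &t : op.operandTypes)
    if (!fixed(t))
      return false;
  for (const TypePtr &t : op.resultTypes)
    if (!fixed(t))
      return false;
  return true;
}
```

In this model an op that takes a stdvec of handles is illegal until its operand type is rewritten to a stdvec of i64. Without such a predicate the op would count as legal while still carrying the old type, which matches the situation the text describes for the four ops.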

LLVM 22 idiom
- All 15 op-creation sites added by this commit in ConvertExpr.cpp
  use the LLVM 22 form Op::create(builder, loc, ...). Two of those
  sites are arith::ConstantIntOp poison-result fallbacks (the
  unbound-handle and to_integer-rejection paths) and additionally
  use the LLVM 22 (builder, loc, type, value) signature.

Tests, runtime helpers, and docs follow as the next commit in this
PR. The dialect type itself (PR NVIDIA#4403) and the QIR conversion's
TypeConverter entry for !cc.measure_handle (PR NVIDIA#4404) are already on
main.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
khalatepradnya added a commit to khalatepradnya/cuda-quantum that referenced this pull request May 11, 2026
…nery, and QIR conversion
