Audit model-specific fp32 protected tensors during dtype casting #2570

## Problem

The shared dtype-casting path and several concrete fp32-protection bugs are already largely addressed by fixes on `main` and by NVIDIA-NeMo/Automodel#2484.

PR #2484 covers the Qwen GatedDeltaNet path by making `A_log` / `dt_bias` explicit fp32 tensors. It also covers the currently known opt/router correction-bias case for Nemotron 3 Super.

The remaining work is an audit task: some other model families may still have numerically sensitive parameters or buffers that should be explicitly marked to stay fp32 during model dtype casting.
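As a starting point for the audit, a small helper can flag tensors whose names look numerically sensitive but whose dtype is no longer fp32 after a cast. This is only a sketch: `find_downcast_sensitive` and `SENSITIVE_PATTERNS` are hypothetical names, `A_log` and `dt_bias` come from the description above, and the `correction_bias` pattern is a guess at how a router correction bias might be named in a given model.

```python
import re
from typing import Iterable, List, Tuple

# Illustrative name patterns only. A_log and dt_bias are named in this issue;
# "correction_bias" is an assumed naming convention, not a confirmed one.
SENSITIVE_PATTERNS = [r"(^|\.)A_log$", r"(^|\.)dt_bias$", r"correction_bias"]


def find_downcast_sensitive(
    named_dtypes: Iterable[Tuple[str, object]],
    patterns: Iterable[str] = SENSITIVE_PATTERNS,
) -> List[Tuple[str, str]]:
    """Return (name, dtype) for sensitive-looking tensors that are not fp32."""
    compiled = [re.compile(p) for p in patterns]
    flagged = []
    for name, dtype in named_dtypes:
        dtype_str = str(dtype)  # str(torch.float32) is "torch.float32"
        looks_sensitive = any(p.search(name) for p in compiled)
        is_fp32 = dtype_str in ("torch.float32", "float32")
        if looks_sensitive and not is_fp32:
            flagged.append((name, dtype_str))
    return flagged
```

With a PyTorch model, the input could be built from `model.named_parameters()` and `model.named_buffers()` as `(name, tensor.dtype)` pairs after the cast. Name matching only surfaces candidates; each hit still needs a manual check of whether the tensor is actually precision-sensitive.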

## Scope

- Audit model-specific parameters and buffers that should remain fp32 under bf16/fp16 model casts.
- For each real case found, add the appropriate fp32 tracking marker, such as `_keep_in_fp32_modules` or `_keep_in_fp32_modules_strict`.
- Add focused unit coverage showing the sensitive tensor remains fp32 after `cast_model_to_dtype(...)`.
- Check both normal construction and relevant sharded/FSDP2 behavior when applicable.
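The unit coverage described above could take roughly the following shape. This is a sketch under stated assumptions: `ToyGatedBlock` and `toy_cast` are hypothetical stand-ins, and `toy_cast` is not the project's `cast_model_to_dtype(...)`, whose signature is not given here. A real test would call the actual helper on the actual model class.

```python
import torch
from torch import nn


class ToyGatedBlock(nn.Module):
    # Hypothetical marker list, mirroring the `_keep_in_fp32_modules` idea.
    _keep_in_fp32_modules = ["A_log", "dt_bias"]

    def __init__(self, dim: int = 4):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.A_log = nn.Parameter(torch.zeros(dim))
        self.dt_bias = nn.Parameter(torch.ones(dim))


def toy_cast(model: nn.Module, dtype: torch.dtype) -> nn.Module:
    """Stand-in for the real cast helper: cast floats, skip protected names."""
    keep = getattr(model, "_keep_in_fp32_modules", None) or []
    tensors = list(model.named_parameters()) + list(model.named_buffers())
    for name, tensor in tensors:
        if not tensor.is_floating_point():
            continue  # leave integer and bool tensors untouched
        protected = any(k in name.split(".") for k in keep)
        tensor.data = tensor.data.to(torch.float32 if protected else dtype)
    return model


def test_protected_tensors_stay_fp32():
    model = toy_cast(ToyGatedBlock(), torch.bfloat16)
    assert model.A_log.dtype == torch.float32
    assert model.dt_bias.dtype == torch.float32
    assert model.proj.weight.dtype == torch.bfloat16
```

The final assertion on `proj.weight` guards against a test that passes only because the cast did nothing. The sharded/FSDP2 check is not covered by this sketch and would need its own test.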

## Notes

This issue tracks only the model-by-model audit work that remains after the main dtype fixes and NVIDIA-NeMo/Automodel#2484.
