Pull requests: NVIDIA-NeMo/Automodel

Pull requests list

cp: fix(model) (#2657) to r0.5.0 [cherry-pick] [Run CICD: Trigger Testing CICD]
#2671 opened Jun 20, 2026 by svcnvidia-nemo-ci Contributor
fix(vlm): guard validation forward against cuDNN fused-MHA SDPA backend [r0.5.0: Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.]
#2659 opened Jun 20, 2026 by akoumpa Contributor Draft
fix(deepseek-v4): avoid bf16 -inf overflow in additive attention mask [r0.5.0]
#2658 opened Jun 20, 2026 by akoumpa Contributor
fix(qwen3_5): route dense MTP through SDPA + block-causal mask for pack [r0.5.0]
#2656 opened Jun 20, 2026 by akoumpa Contributor
fix(deepseek-v4): restore batch axis for packed-sequence (THD) forward [r0.5.0]
#2651 opened Jun 20, 2026 by akoumpa Contributor
feat(glm_moe_dsa): GLM-5.2 IndexShare DSA support
#2633 opened Jun 18, 2026 by HuiyingLi Contributor
ci: Update transformers to latest version 5.12.1
#2632 opened Jun 18, 2026 by svcnvidia-nemo-ci Contributor
fix(checkpoint): super-49B consolidated reload and vllm_deploy
#2626 opened Jun 17, 2026 by adil-a Collaborator
fix(models): use bool sparse masks for sdpa [r0.5.0]
#2624 opened Jun 17, 2026 by yuhezhang-ai Contributor
feat(magi): honor AttnMaskSpec on the HF attention backend
#2622 opened Jun 17, 2026 by HuiyingLi Contributor
fix(loss): support THD/packed layout in FusedLinearCrossEntropy [r0.5.0]
#2615 opened Jun 17, 2026 by akoumpa Contributor
fix(models): audit fp32 protected tensors [r0.5.0]
#2598 opened Jun 16, 2026 by yuhezhang-ai Contributor
feat(retrieval): vl retrieval resolved dataset
#2596 opened Jun 16, 2026 by yuhezhang-ai Contributor Draft
feat(dflash): add dpace loss [community-request] [waiting-on-maintainers: Waiting on maintainers to respond]
#2572 opened Jun 15, 2026 by kashif Contributor
feat(engine): Engine training API
#2556 opened Jun 14, 2026 by HuiyingLi Contributor Draft
ci: Update transformers to latest version 5.12.0
#2555 opened Jun 14, 2026 by svcnvidia-nemo-ci Contributor
feat: CP support for MiniMax M3
#2551 opened Jun 13, 2026 by athitten Contributor
2 of 3 tasks
feat(moe): mxfp4-resident MoE experts for DeepSeek-V4-Flash LoRA [community-request] [waiting-on-customer: Waiting on the original author to respond]
#2548 opened Jun 12, 2026 by excepshenal
3 tasks done
fix(wandb): log different val datasets separately in wandb [community-request] [waiting-on-maintainers]
#2526 opened Jun 11, 2026 by grgkovac Contributor
3 tasks done