Pull requests: NVIDIA-NeMo/Automodel
Pull requests list
cp: fix(model) (#2657) to r0.5.0
cherry-pick
Run CICD
Trigger Testing CICD
#2671
opened Jun 20, 2026 by
svcnvidia-nemo-ci
Contributor
feat(nemotron_v3): support dense Nemotron-H (Nano 4B)
community-request
#2670
opened Jun 20, 2026 by
stanley1208
Contributor
Draft
3 tasks done
cp: fix(moe): preserve fp32 A_log in Qwen3.5-{MoE,Next GatedDeltaNet} (2484) into r0.5.0
Trigger Testing CICD
cherry-pick
Run CICD
#2664
opened Jun 20, 2026 by
akoumpa
Contributor
fix(vlm): guard validation forward against cuDNN fused-MHA SDPA backend
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
fix(deepseek-v4): avoid bf16 -inf overflow in additive attention mask
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2658
opened Jun 20, 2026 by
akoumpa
Contributor
fix(qwen3_5): route dense MTP through SDPA + block-causal mask for pack
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2656
opened Jun 20, 2026 by
akoumpa
Contributor
fix(deepseek-v4): restore batch axis for packed-sequence (THD) forward
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2651
opened Jun 20, 2026 by
akoumpa
Contributor
Adding support to training Ministral 3 as embedding model for visual document retrieval
#2648
opened Jun 19, 2026 by
gabrielspmoreira
Draft
3 tasks
feat(glm_moe_dsa): GLM-5.2 IndexShare DSA support
#2633
opened Jun 18, 2026 by
HuiyingLi
Contributor
ci: Update transformers to latest version 5.12.1
#2632
opened Jun 18, 2026 by
svcnvidia-nemo-ci
Contributor
fix(checkpoint): write consolidated safetensors without append
community-request
#2627
opened Jun 18, 2026 by
huahuajhu
3 tasks done
fix(checkpoint): super-49B consolidated reload and vllm_deploy
#2626
opened Jun 17, 2026 by
adil-a
Collaborator
fix(models): use bool sparse masks for sdpa
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2624
opened Jun 17, 2026 by
yuhezhang-ai
Contributor
feat(magi): honor AttnMaskSpec on the HF attention backend
#2622
opened Jun 17, 2026 by
HuiyingLi
Contributor
fix(loss): support THD/packed layout in FusedLinearCrossEntropy
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2615
opened Jun 17, 2026 by
akoumpa
Contributor
fix(gemma4_moe): re-tie lm_head to active embed_tokens on MoE path
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2601
opened Jun 16, 2026 by
Achyuthan-S
fix(models): audit fp32 protected tensors
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2598
opened Jun 16, 2026 by
yuhezhang-ai
Contributor
feat(retrieval): vl retrieval resolved dataset
#2596
opened Jun 16, 2026 by
yuhezhang-ai
Contributor
Draft
feat(dflash): add dpace loss
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2572
opened Jun 15, 2026 by
kashif
Contributor
ci: Update transformers to latest version 5.12.0
#2555
opened Jun 14, 2026 by
svcnvidia-nemo-ci
Contributor
feat: CP support for MiniMax M3
#2551
opened Jun 13, 2026 by
athitten
Contributor
2 of 3 tasks
feat(moe): mxfp4-resident MoE experts for DeepSeek-V4-Flash LoRA
community-request
waiting-on-customer
Waiting on the original author to respond
#2548
opened Jun 12, 2026 by
excepshenal
3 tasks done
fix(wandb): log different val datasets separately in wandb
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#2526
opened Jun 11, 2026 by
grgkovac
Contributor
3 tasks done