Pull requests: NVIDIA/Megatron-LM
Fix invalid academic example script placeholders
community-request
#5410
opened Jun 20, 2026 by
fallintoplace
M-LM - M-Bridge Data consolidation
Run functional tests
#5409
opened Jun 20, 2026 by
asolergi-nv
Contributor
•
Draft
1 of 6 tasks
fix: Harden Claude GitHub workflows
complexity: medium
#5408
opened Jun 20, 2026 by
chtruong814
Contributor
debug: Test Claude Review
community-request
Final Review
PR is in the "final review" stage
#5407
opened Jun 20, 2026 by
CharlieTruong
6 tasks
Fix merges_file kwarg name in HuggingFaceTokenizer
community-request
#5406
opened Jun 20, 2026 by
muyihao
3 of 6 tasks
Consistent oncall schedule
complexity: low
#5404
opened Jun 18, 2026 by
Phlip79
Member
1 task done
Rename CP batch helpers to describe balancing granularity
complexity: low
Final Review
PR is in the "final review" stage
#5403
opened Jun 18, 2026 by
deepakn94
Contributor
1 task done
chore: nightly sync main into dev (18_06_2026)
complexity: high
Run functional tests
Run MBridge tests
Attach this for testing this PR against MBridge main
#5402
opened Jun 18, 2026 by
svcnvidia-nemo-ci
[Fix] Fix MoE router z-loss compatibility with TE CUDA Graph capture.
community-request
#5401
opened Jun 18, 2026 by
Baibaifan
fix(optimizer): route GatedDeltaNet in_proj to Adam instead of orthogonalizing it (Muon)
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#5400
opened Jun 18, 2026 by
yuchenwang3
Fix fast-cache-load rank synchronization guard
community-request
waiting-on-customer
Waiting on the original author to respond
#5398
opened Jun 18, 2026 by
sandyhouse
1 task
Add RADIO vision encoder wrapper for MIMO example
complexity: medium
#5397
opened Jun 17, 2026 by
yashaswikarnati
Contributor
perf(gated_delta_net): fold q/k L2-norm into the gated_delta_rule kernel
community-request
#5396
opened Jun 17, 2026 by
yuchenwang3
•
Draft
fix(optimizer): skip grad-norm clipping for orthogonalizing (Muon) optimizers
community-request
#5395
opened Jun 17, 2026 by
yuchenwang3
fix: skip permute kernel launch when valid_tokens is zero (closes #4660)
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#5393
opened Jun 17, 2026 by
botbikamordehai2-sketch
[main] moe(perf): Refactor GDN A2A helper flow
complexity: medium
#5392
opened Jun 17, 2026 by
yuzhongw-nvidia
Contributor
1 of 6 tasks
[dev] Add experimental decoupled compact LayerWise DDP layout for Muon
complexity: medium
#5388
opened Jun 17, 2026 by
Wohox
Contributor
3 of 6 tasks
Add experimental Megatron-FSDP fully_shard implementation
complexity: medium
Final Review
PR is in the "final review" stage
MFSDPv2
Run tests
#5387
opened Jun 17, 2026 by
wujingyue
Contributor
Add DSA/DSv4 Indexer Replay for RL training stability
community-request
waiting-on-maintainers
Waiting on maintainers to respond
#5386
opened Jun 17, 2026 by
ParamThakkar123
route collectives through torchcomms
community-request
#5385
opened Jun 16, 2026 by
tushar00jain
•
Draft
Fix fused MLA down projection with tensor parallelism
complexity: low
Final Review
PR is in the "final review" stage
#5383
opened Jun 16, 2026 by
sraman-rgb
Contributor
6 tasks