
[graph_trainer] AutoParallel AOT FX Trace Backend Integration#2725

Open
sanketpurandare wants to merge 1 commit into sanketpurandare/stack/3 from
sanketpurandare/stack/4

Conversation

@sanketpurandare
Contributor

@sanketpurandare sanketpurandare commented Mar 27, 2026

Stacked PRs:


[graph_trainer] AutoParallel AOT FX Trace Backend Integration

The goal is to make AutoParallel a first-class `aot_fx_trace` integration with
two backend modes:

1. **Native GraphTrainer backend mode**
   - `--compile.mode aot_fx_trace --compile.autoparallel`
   - AutoParallel places the model.
   - GraphTrainer traces forward, loss, and backward with `make_fx`.
   - GraphTrainer uses its own aot-fx-trace graph passes and compile path.

2. **AutoParallel backend mode**
   - `--compile.mode aot_fx_trace --compile.autoparallel`
   - `--compile.inductor_compilation autoparallel_backend`
   - AutoParallel places the model.
   - GraphTrainer still traces forward, loss, and backward with `make_fx`.
   - GraphTrainer switches from its native pass stack to AutoParallel's backend
     policy helpers and full-Inductor compilation path.
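Tracing forward, loss, and backward into a single FX graph with `make_fx`, as both modes above do, can be sketched roughly as follows. The toy model and `train_step` function here are purely illustrative, not torchtitan's actual code:

```python
# Minimal sketch: capture forward + loss + backward as ONE FX graph
# using make_fx. The model (a single matmul) is a stand-in; GraphTrainer's
# real train step is much larger but follows the same shape.
import torch
from torch.fx.experimental.proxy_tensor import make_fx


def train_step(weight, x, target):
    pred = x @ weight                       # forward
    loss = ((pred - target) ** 2).mean()    # loss
    # Calling autograd.grad inside the traced function makes the backward
    # ops part of the same traced graph (a "joint" graph).
    (grad_weight,) = torch.autograd.grad(loss, weight)
    return loss, grad_weight


weight = torch.randn(4, 4, requires_grad=True)
x = torch.randn(2, 4)
target = torch.randn(2, 4)

# gm is a torch.fx.GraphModule containing forward, loss, and backward ops.
gm = make_fx(train_step)(weight, x, target)
loss, grad = gm(weight, x, target)
```

Once the train step is a plain `GraphModule`, either pass stack (native or AutoParallel's) can be applied to it before compilation.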

The key design point: both modes share the same AutoParallel model placement and
the same GraphTrainer training-step tracing. They differ only in which pass stack
and backend policy are applied after GraphTrainer has the traced train-step graph.
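The shared-frontend, swappable-backend shape described above can be sketched as below. All names here (`CompileConfig`, `compile_train_step`, the pass functions) are hypothetical stand-ins for illustration, not the actual torchtitan API:

```python
# Hypothetical sketch of the design: placement + tracing are shared upstream;
# only the pass stack / backend policy chosen here differs between modes.
from dataclasses import dataclass


@dataclass
class CompileConfig:
    # Mirrors the CLI flags from the description (names are illustrative).
    mode: str = "aot_fx_trace"
    autoparallel: bool = True
    inductor_compilation: str = "native"  # or "autoparallel_backend"


def native_passes(graph):
    # Stand-in for GraphTrainer's own aot-fx-trace pass stack.
    return f"native_passes({graph})"


def autoparallel_backend_passes(graph):
    # Stand-in for AutoParallel's backend policy helpers + full Inductor path.
    return f"autoparallel_backend_passes({graph})"


def compile_train_step(traced_graph, cfg: CompileConfig):
    # The traced train-step graph arrives here in both modes; the config
    # flag alone selects which backend policy is applied.
    if cfg.inductor_compilation == "autoparallel_backend":
        return autoparallel_backend_passes(traced_graph)
    return native_passes(traced_graph)
```

The design choice of branching only at this last stage is what keeps the two modes interchangeable from the user's point of view.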

@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/3 branch from 2d2fb54 to bd097f8 Compare March 27, 2026 00:49
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 9f9cc4c to b0f0cc3 Compare March 27, 2026 00:49
@meta-cla meta-cla Bot added the CLA Signed label Mar 27, 2026
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main March 27, 2026 01:13
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from b0f0cc3 to cd1af1a Compare March 27, 2026 01:14
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 March 27, 2026 01:14
Comment thread torchtitan/experiments/autoparallel/local_map_deepseek_v3/model.py Outdated
@sanketpurandare sanketpurandare marked this pull request as draft March 27, 2026 02:22
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main March 27, 2026 18:57
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from cd1af1a to 0d30cb5 Compare March 27, 2026 18:57
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 March 27, 2026 18:57
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main April 1, 2026 16:28
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 April 1, 2026 16:28
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main April 24, 2026 19:53
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 0d30cb5 to e00f2a7 Compare April 24, 2026 19:53
@sanketpurandare sanketpurandare changed the title Fix DeepSeekV3Model for Configurable build pattern Add DeepSeek V3 debugmodel_sdpa and 16B_sdpa config variants Apr 24, 2026
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 April 24, 2026 19:53
@sanketpurandare sanketpurandare marked this pull request as ready for review April 24, 2026 19:55
Comment thread torchtitan/models/deepseek_v3/config_registry.py Outdated
@sanketpurandare sanketpurandare marked this pull request as draft April 30, 2026 18:15
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main April 30, 2026 18:15
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from e00f2a7 to 89d4fa1 Compare April 30, 2026 18:15
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 April 30, 2026 18:16
@sanketpurandare sanketpurandare marked this pull request as ready for review April 30, 2026 18:16
@sanketpurandare sanketpurandare marked this pull request as draft April 30, 2026 18:41
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main April 30, 2026 18:41
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 89d4fa1 to 2c037b6 Compare April 30, 2026 18:41
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 April 30, 2026 18:41
@sanketpurandare sanketpurandare marked this pull request as ready for review April 30, 2026 18:41
Comment thread torchtitan/models/deepseek_v3/__init__.py Outdated
@sanketpurandare sanketpurandare marked this pull request as draft May 4, 2026 03:05
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main May 4, 2026 03:05
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 2c037b6 to 5adc1ea Compare May 4, 2026 03:05
@sanketpurandare sanketpurandare changed the title Add DeepSeek V3 debugmodel_sdpa and 16B_sdpa config variants [graph_trainer] AutoParallel AOT FX Trace Backend Integration May 4, 2026
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 May 4, 2026 03:05
@sanketpurandare sanketpurandare marked this pull request as ready for review May 4, 2026 03:06
stack-info: PR: #2725, branch: sanketpurandare/stack/4
@sanketpurandare sanketpurandare marked this pull request as draft May 4, 2026 04:38
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main May 4, 2026 04:38
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 5adc1ea to 28d5181 Compare May 4, 2026 04:38
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 May 4, 2026 04:38
@sanketpurandare sanketpurandare marked this pull request as ready for review May 4, 2026 04:39

Labels

ciflow/8gpu, CLA Signed

4 participants