[js/web] Forward WebGPU EP buffer cache mode options from JS #29017
ssam18 wants to merge 2 commits into
The native WebGPU EP already understands the storage, uniform, query resolve and default buffer cache mode options, but onnxruntime-web never forwarded them from executionProviders, so web users could not configure them. This adds the four fields to WebGpuExecutionProviderOption and passes them through to the EP like the existing validationMode option, with the same value validation the native side performs. For static shape models, setting storageBufferCacheMode to simple lets exact size buffers be reused across runs, which can cut peak GPU memory noticeably compared to the default bucket mode. Fixes microsoft#29016
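The validation and pass-through described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: `collectCacheModeEntries` and `toConfigKey` are hypothetical helpers, the accepted mode strings are an assumption about what the native side allows, and the `ep.webgpuexecutionprovider.` prefix is taken from the config key named in the PR description.

```typescript
// Assumption: these are the mode strings the native WebGPU EP accepts.
const BUFFER_CACHE_MODES = ['disabled', 'lazyRelease', 'simple', 'bucket'] as const;

// The four option names this PR adds to WebGpuExecutionProviderOption.
const CACHE_MODE_KEYS = [
  'storageBufferCacheMode',
  'uniformBufferCacheMode',
  'queryResolveBufferCacheMode',
  'defaultBufferCacheMode',
] as const;

// Hypothetical helper: validates each cache mode option that is present and
// returns the [key, value] pairs that would be handed to the EP. Options that
// are not cache modes (for example validationMode) are ignored here.
function collectCacheModeEntries(options: Record<string, unknown>): Array<[string, string]> {
  const entries: Array<[string, string]> = [];
  for (const key of CACHE_MODE_KEYS) {
    const value = options[key];
    if (value === undefined) {
      continue; // option not set, nothing to forward
    }
    const allowed = BUFFER_CACHE_MODES as readonly string[];
    if (typeof value !== 'string' || allowed.indexOf(value) === -1) {
      throw new Error(`Invalid ${key}: ${String(value)}`);
    }
    entries.push([key, value]);
  }
  return entries;
}

// Hypothetical helper: the config entry name the EP would read for a key,
// assuming the prefix shown in the PR description.
function toConfigKey(key: string): string {
  return `ep.webgpuexecutionprovider.${key}`;
}
```

Rejecting unknown strings on the JS side keeps a typo from being silently dropped before it ever reaches the native parser.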
Thanks for the quick fix. One concern: should The native side declares/parses It is parsed here together with the other buffer cache modes:
onnxruntime/onnxruntime/core/providers/webgpu/webgpu_provider_factory.cc, lines 243 to 247 in cf509d8
But when the main WebGPU
onnxruntime/onnxruntime/core/providers/webgpu/webgpu_context.cc, lines 148 to 151 in cf509d8
onnxruntime/onnxruntime/core/providers/webgpu/buffer_manager.cc, lines 490 to 495 in cf509d8
The factory also has no fourth/default cache mode parameter:
onnxruntime/onnxruntime/core/providers/webgpu/buffer_manager.cc, lines 635 to 637 in cf509d8
So exposing
The JS forwarding side of this PR already exposed defaultBufferCacheMode, but the native BufferManager hardcoded its default_cache_ to Disabled, so the option had no effect end to end. Extend BufferManager and BufferManagerFactory::Create to take a fourth mode, plumb config.buffer_cache_config.default_entry.mode from webgpu_context into the main BufferManager, and preserve current behavior for the initializer and per-graph managers by passing Disabled there.

Addresses popelenkow's review comment on microsoft#29017.

Signed-off-by: Samaresh Kumar Singh <ssam3003@gmail.com>
Description
The native WebGPU EP already supports the buffer cache mode options (`ep.webgpuexecutionprovider.storageBufferCacheMode` and friends), but onnxruntime-web never forwarded them from `executionProviders`, so they were unreachable from JS. This adds `storageBufferCacheMode`, `uniformBufferCacheMode`, `queryResolveBufferCacheMode` and `defaultBufferCacheMode` to `WebGpuExecutionProviderOption` and forwards them to the EP the same way `validationMode` is forwarded today, with the values validated against the set the native side accepts. The options ride the existing `SessionOptionsAppendExecutionProvider` path, which prefixes each key into exactly the config entry the EP reads, so no native changes are needed.

Motivation and Context

Fixes #29016. For static shape models, `storageBufferCacheMode: 'simple'` reuses exact size buffers across runs instead of allocating new bucket sized ones, which the issue's repro shows cutting peak WebGPU memory by about 27 percent. Verified locally with tsc builds of js/common and js/web, prettier and eslint, the js/common unit tests, and type level checks that the new options compile and invalid values are rejected.
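From the caller's side, opting a static shape model into the `simple` storage cache might look like the sketch below. `WebGpuOptionSketch` is a hypothetical local stand-in for `WebGpuExecutionProviderOption` so that the snippet compiles on its own, the union of mode strings is an assumption about the accepted values, and `model.onnx` is a placeholder path.

```typescript
// Assumed set of accepted cache mode strings.
type BufferCacheModeSketch = 'disabled' | 'lazyRelease' | 'simple' | 'bucket';

// Hypothetical local mirror of the option shape after this PR; the real type
// is WebGpuExecutionProviderOption and carries further fields.
interface WebGpuOptionSketch {
  name: 'webgpu';
  storageBufferCacheMode?: BufferCacheModeSketch;
  uniformBufferCacheMode?: BufferCacheModeSketch;
  queryResolveBufferCacheMode?: BufferCacheModeSketch;
  defaultBufferCacheMode?: BufferCacheModeSketch;
}

// Builds session options for a static shape model: only the storage buffer
// cache mode is overridden, the other three keep their native defaults.
function staticShapeSessionOptions(): { executionProviders: WebGpuOptionSketch[] } {
  return {
    executionProviders: [{ name: 'webgpu', storageBufferCacheMode: 'simple' }],
  };
}

// With an onnxruntime-web build that includes this change, the object would
// be passed to session creation along these lines:
//   const session = await ort.InferenceSession.create('model.onnx', staticShapeSessionOptions());
```

Leaving the other three modes unset keeps the change scoped to the buffers the issue's repro is concerned with.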