perf(throw_error): cheaper error construction and hook installation by sabify · Pull Request #4777 · leptos-rs/leptos

sabify · 2026-06-11T14:32:35Z

Two small optimizations to throw_error hot paths, with no API breakage or behavior change.

1. Single-allocation `Error::new` for concrete error types

The only way to construct an Error was the blanket From impl, which boxes the value into a Box<dyn Error> and then re-allocates it behind an Arc (the refcount header prevents reusing the box's allocation) — two allocations plus a memcpy per error.

This adds an additive inherent constructor that places a concrete error directly into the Arc in one allocation (~1.7x faster in isolated benches), and routes the two call sites that already hold a concrete type through it:

server_fn: From<ServerFnError> for Error (runs on every server-fn error conversion)
hydration_context: serialized error deserialization during hydration

The &str/String/anyhow paths keep using From (they need the box anyway).

2. Swap instead of clone in `ErrorHookFuture::poll`

ErrorHookFuture::poll cloned its captured hook Arc on every poll — two atomic refcount ops per poll on a future that wraps every Suspense body and is polled once per await-point resumption.

Since the future only needs to lend the hook to the thread-local slot for the duration of the poll, it now mem::swaps it in and back out via a panic-safe drop guard: zero atomic ops per poll (~4.1ns → ~1.7ns per poll in isolated benches). Observable semantics are unchanged; a regression test asserts the captured hook overrides the ambient hook during the poll and the ambient hook is restored afterwards.

…crete sites Error construction on the throw path performed two heap allocations and a memcpy where one allocation suffices. Problem ------- The only public constructor is the blanket conversion: impl<T> From<T> for Error where T: Into<Box<dyn error::Error + Send + Sync + 'static>>, { fn from(value: T) -> Self { Error(Arc::from(value.into())) } } For a concrete, sized error, `value.into()` first materialises a `Box<dyn Error + Send + Sync>` (allocation #1). `Arc::from(Box<T>)` for an unsized payload cannot reuse the box's allocation, because the `Arc` must prepend its strong/weak refcount header in front of the value: it allocates a fresh, larger block (allocation #2), memcpys the error out of the box, and frees the box. Net cost: 2 allocations + 1 free + 1 memcpy. `From` itself cannot be narrowed to `T: error::Error` without breaking `Error::from("str")` / `Error::from(String)` (neither implements `std::error::Error`); the two bounds overlap, so a second `From` impl would violate coherence. Profiling --------- Isolated bench, mean of 2 runs: fn bench<F: FnMut()>(name: &str, iters: u64, mut f: F) { for _ in 0..(iters / 10).max(1) { f(); } // warmup let t = Instant::now(); for _ in 0..iters { f(); } println!("{name:<44} {:>9.3} ns/op", t.elapsed().as_nanos() as f64 / iters as f64); } bench("Error::from(DummyErr) [Box->Arc realloc]", 5_000_000, || { black_box(&Error::from(black_box(DummyErr { _pad: [0; 32] }))); }); bench("Error::new(DummyErr) [single alloc]", 5_000_000, || { black_box(&Error::new(black_box(DummyErr { _pad: [0; 32] }))); }); Error::from(DummyErr) [Box->Arc realloc] ~35.3 ns/op Error::new(DummyErr) [single alloc] ~20.5 ns/op (~1.7x faster) The 2x-ish ratio is the fingerprint of the eliminated second allocation: if `Arc::from(Box)` reused the box's memory the two would land together. Solution -------- Add an additive, non-breaking inherent constructor that places a concrete error directly into the `Arc` (one allocation, no copy): pub fn new<E>(err: E) -> Self where E: error::Error + Send + Sync + 'static, { Error(Arc::new(err)) // Arc<E> coerces to Arc<dyn Error> in place } Route the hot construction sites that already hold a concrete error type through it: - server_fn `From<ServerFnError> for Error`: `ServerFnErrorWrapper` derives `std::error::Error`, so it can skip the box. Runs on every server-fn error conversion. - hydration_context error deserialization: `SerializedError` is a concrete `std::error::Error`. Runs per serialized error during hydration. The `&str` / `String` / `anyhow` paths keep using `From` (they are unsized on arrival and need the box anyway).

…:poll `ErrorHookFuture::poll` cloned the captured hook `Arc` on every single poll. Problem ------- The wrapper installs its captured hook into the thread-local slot for the duration of each poll: fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> { let this = self.project(); let _hook = ResetErrorHookOnDrop(ERROR_HOOK.with_borrow_mut(|cur| { std::mem::replace(cur, this.hook.clone()) // atomic inc every poll })); this.inner.poll(cx) // ...atomic dec on drop } Every poll clones `this.hook` (`Option<Arc<dyn ErrorHook>>`): one atomic increment to install it, then one atomic decrement when the swapped-in clone is dropped through the guard. That is two atomic RMW operations per poll even though the hook being installed is the value the future already owns in its own field — it only needs to lend it, not own a second copy. `leptos` wraps each `Suspense` body future in `ErrorHookFuture`, which is polled once per await-point resumption for the life of the suspense, so this is a genuinely repeated path. Under cross-thread sharing the atomics also risk cache-line contention beyond the single-thread cost. Profiling --------- opt-level=3 + lto, mean of 2 runs. `CloneWrapper`/`SwapWrapper` mirror the old and new poll bodies over a thread-local slot to isolate the atomic delta: // clone (old): let prev = SLOT.with_borrow_mut(|c| mem::replace(c, this.hook.clone())); let r = Pin::new(&mut this.inner).poll(cx); SLOT.with_borrow_mut(|c| *c = prev); // swap (new): SLOT.with_borrow_mut(|c| mem::swap(c, &mut this.hook)); let r = Pin::new(&mut this.inner).poll(cx); SLOT.with_borrow_mut(|c| mem::swap(c, &mut this.hook)); poll bare future [baseline] ~0.40 ns/op poll wrapper [clone hook] ~4.11 ns/op poll wrapper [swap hook] ~0.60 ns/op (~3.5 ns saved) poll ErrorHookFuture [real, swap] ~1.65 ns/op (clone was ~4.13) Solution -------- Lend the future's own hook to the slot with a swap (no clone) and swap it back with a panic-safe guard, so zero atomic refcount ops run per poll: ERROR_HOOK.with_borrow_mut(|cur| std::mem::swap(cur, this.hook)); struct Restore<'a>(&'a mut Option<Arc<dyn ErrorHook>>); impl Drop for Restore<'_> { fn drop(&mut self) { let _ = ERROR_HOOK.try_with(|cell| { if let Ok(mut cur) = cell.try_borrow_mut() { std::mem::swap(&mut *cur, self.0); } }); } } let _restore = Restore(this.hook); this.inner.poll(cx) `this.hook` and `this.inner` are disjoint projected fields, so holding `&mut this.hook` in the guard while polling `inner` is sound. Observable semantics are unchanged: during the poll the slot holds the captured hook (including the `None` case, which correctly clears the slot); after the poll the slot holds the previous ambient hook and `this.hook` is restored — exactly as the clone version, minus the atomics. The guard mirrors `ResetErrorHookOnDrop`'s best-effort, panic-safe restore. Adds a regression test asserting the captured hook overrides the ambient hook during the poll and is restored afterwards.

gbj · 2026-06-12T13:21:48Z

9cddc6c seems like micro-optimization churn. It's talking about saving nanoseconds in the context of doing async web request/response work...

sabify · 2026-06-13T14:45:49Z

Yes, a 2.41x faster optimization in hot path + panic-safe drop guard.

sabify added 2 commits June 7, 2026 13:16

sabify changed the title ~~## perf(throw_error): cheaper error construction and hook installation~~ perf(throw_error): cheaper error construction and hook installation Jun 11, 2026

gbj reviewed Jun 12, 2026

View reviewed changes

Comment thread any_error/src/lib.rs Outdated

fix(docs): concise doc message for Error::new in throw_error

a75e23a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf(throw_error): cheaper error construction and hook installation#4777

perf(throw_error): cheaper error construction and hook installation#4777
sabify wants to merge 3 commits into
leptos-rs:leptos_0.9from
sabify:perf/any_error-optimizations

sabify commented Jun 11, 2026

Uh oh!

Uh oh!

gbj commented Jun 12, 2026

Uh oh!

sabify commented Jun 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

sabify commented Jun 11, 2026

1. Single-allocation Error::new for concrete error types

2. Swap instead of clone in ErrorHookFuture::poll

Uh oh!

Uh oh!

gbj commented Jun 12, 2026

Uh oh!

sabify commented Jun 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Single-allocation `Error::new` for concrete error types

2. Swap instead of clone in `ErrorHookFuture::poll`