Summary
AsyncRedisSaver (and the sync RedisSaver) accept checkpoint_prefix / checkpoint_write_prefix so multiple deployments can share a Redis Stack instance without key collisions. JSON document keys ({prefix}:lg:checkpoint:..., {prefix}:lg:checkpoint_write:...) honor those prefixes correctly. But the "latest pointer" keys used by aget_tuple (when no checkpoint_id is supplied) are built as bare strings that ignore the configured prefix:
latest_pointer_key = f"checkpoint_latest:{storage_safe_thread_id}:{storage_safe_checkpoint_ns}"
This makes prefixing effectively useless for the "fetch latest checkpoint" path that the LangGraph runtime uses on every Pregel start.
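To make the mismatch concrete, here is a minimal sketch of the two key shapes. The helper names doc_key and latest_pointer_key are illustrative, not the library's functions, and the document-key layout is simplified; only the pointer format is copied from the line quoted above.

```python
def doc_key(prefix: str, thread_id: str, ns: str, checkpoint_id: str) -> str:
    # Simplified stand-in for a prefixed JSON document key (assumed layout).
    return f"{prefix}:{thread_id}:{ns}:{checkpoint_id}"


def latest_pointer_key(thread_id: str, ns: str) -> str:
    # Same shape as the f-string quoted above: the prefix is never an input.
    return f"checkpoint_latest:{thread_id}:{ns}"


a_doc = doc_key("env-a:lg:checkpoint", "t1", "__empty__", "cp-1")
b_doc = doc_key("env-b:lg:checkpoint", "t1", "__empty__", "cp-1")
a_ptr = latest_pointer_key("t1", "__empty__")
b_ptr = latest_pointer_key("t1", "__empty__")

print(a_doc != b_doc)  # document keys differ per deployment
print(a_ptr == b_ptr)  # pointer keys are identical across deployments
```

Because the pointer builder takes no prefix, two savers with different prefixes cannot avoid producing the same pointer key for the same thread and namespace.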
Affected sites (v0.4.1, also present on main)
langgraph/checkpoint/redis/aio.py:
- L410: read in aget_tuple
- L1010: write in aput
- L1996: write in aput's cancelled-fallback
- L2179: delete in aprune
- L2046–2048 / L2186–2189: pipeline.delete(latest_pointer_key) in adelete_thread / aprune
langgraph/checkpoint/redis/__init__.py (sync variants): L580 / L758 / L1602 / L1788.
The SET site in aput is unmistakable:
latest_pointer_key = f"checkpoint_latest:{storage_safe_thread_id}:{storage_safe_checkpoint_ns}"
await self._redis.set(latest_pointer_key, checkpoint_key)
No prefix is folded in.
Impact
On a Redis Stack hosting multiple deployments (e.g. staging + prod, or multi-tenant), every saver instance reads / writes the same global checkpoint_latest:{thread}:{ns} keyspace.
1. Cross-deployment overwrite. If two deployments use overlapping thread_id shapes, the last writer's pointer wins. The LangGraph runtime gives the application full control of thread_id, so UUIDs help but don't prevent the failure mode.
2. Silent decode failure. Once a pointer resolves to a doc key under the other env's prefix, the saver's pipeline JSON.GET checkpoint_key returns None (the doc exists, just under a different prefix). aget_tuple falls through to return None: the saver reports "no checkpoints exist" and the conversation silently forgets its history.
3. No log signal. The failure looks identical to a fresh thread, so the issue is hard to attribute. We initially mistook this for a Pregel bug.
We hit (2) in production after migrating an environment onto a Redis Stack that the other env was already using.
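Since nothing is logged, one way to spot the condition is to audit the pointers directly. The sketch below is a hypothetical helper, not part of the library. It assumes you have already collected the checkpoint_latest:* keys and their values (for example with SCAN and GET) into a dict of decoded strings, and that each value is a document key as in the reproduction output below.

```python
def find_foreign_pointers(pointers: dict[str, str], own_prefix: str) -> list[str]:
    """Return pointer keys whose target document lives outside own_prefix."""
    return sorted(
        key
        for key, target in pointers.items()
        if not target.startswith(own_prefix + ":")
    )


# Hypothetical snapshot of every checkpoint_latest:* key and its value.
snapshot = {
    "checkpoint_latest:t1:__empty__": "env-b:lg:checkpoint:t1:__empty__:cp-2",
    "checkpoint_latest:t2:__empty__": "env-a:lg:checkpoint:t2:__empty__:cp-7",
}

foreign = find_foreign_pointers(snapshot, "env-a:lg:checkpoint")
print(foreign)  # pointers that env-a can no longer resolve to its own docs
```

Any key this returns for a given deployment is a thread whose latest pointer has been taken over by another deployment.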
Reproduction
Local Redis Stack (docker run --rm -d -p 6379:6379 redis/redis-stack:latest), langgraph-checkpoint-redis==0.4.1:
import asyncio

from redis.asyncio import Redis
from langgraph.checkpoint.redis.aio import AsyncRedisSaver
from langgraph.checkpoint.base import empty_checkpoint


async def main() -> None:
    admin = Redis.from_url("redis://localhost:6379", decode_responses=False)
    await admin.flushdb()

    # Two deployments sharing one Redis, each with its own prefix.
    saver_a = AsyncRedisSaver(
        redis_client=Redis.from_url("redis://localhost:6379", decode_responses=False),
        checkpoint_prefix="env-a:lg:checkpoint",
        checkpoint_write_prefix="env-a:lg:checkpoint_write",
    )
    saver_b = AsyncRedisSaver(
        redis_client=Redis.from_url("redis://localhost:6379", decode_responses=False),
        checkpoint_prefix="env-b:lg:checkpoint",
        checkpoint_write_prefix="env-b:lg:checkpoint_write",
    )
    await saver_a.asetup()
    await saver_b.asetup()

    cfg = {"configurable": {"thread_id": "t1", "checkpoint_ns": ""}}
    cp_a = empty_checkpoint()
    cp_a["channel_values"] = {"owner": "A"}
    cp_b = empty_checkpoint()
    cp_b["channel_values"] = {"owner": "B"}
    await saver_a.aput(cfg, cp_a, {}, {})
    await saver_b.aput(cfg, cp_b, {}, {})  # overwrites A's bare pointer

    # List comprehensions are needed here: sorted() cannot consume an async generator.
    print("A keys:", sorted([k async for k in admin.scan_iter(match="env-a:*")]))
    print("B keys:", sorted([k async for k in admin.scan_iter(match="env-b:*")]))
    print("bare pointer:", await admin.get(b"checkpoint_latest:t1:__empty__"))

    tup_a = await saver_a.aget_tuple(cfg)
    print("A read:", tup_a)  # None: A's checkpoint is silently dropped


asyncio.run(main())
Output:
A keys: [b'env-a:lg:checkpoint:t1:__empty__:...'] # A's doc still exists under env-a prefix
B keys: [b'env-b:lg:checkpoint:t1:__empty__:...'] # B's doc under env-b prefix
bare pointer: b'env-b:lg:checkpoint:t1:__empty__:...' # global pointer now points at B's doc
A read: None # A's aget_tuple can't decode env-b's doc
Workaround
We shipped a wrapper that proxies the Redis client (and its pipeline()) passed into the saver. It intercepts the four ops the library does against checkpoint_latest:* (GET / SET / EXPIRE / DELETE) and rewrites the keys to live under our deployment's master prefix. A read-time fallback honours the legacy bare key only when the doc it points at lives under our env, so pre-deploy active threads continue working without leaking cross-env data.
This works, but it is ~120 lines of proxy code, 19 unit tests, a real-Redis smoke suite, and a one-shot migration script that we'd rather not maintain against future versions of this library. It also broke once because our initial proxy missed __aenter__ / __aexit__ on the pipeline wrapper, which redisvl/index/storage.py:awrite requires, so the workaround is fragile in ways checkpoint_prefix users probably don't expect to have to know about.
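For readers weighing the same workaround, the core idea can be sketched as below. This is not our production proxy: FakeRedis is an in-memory stand-in so the example runs without a server, PrefixingProxy is a hypothetical name, keys are assumed to be str rather than bytes, and only GET / SET / DELETE are shown.

```python
import asyncio
from typing import Optional

BARE = "checkpoint_latest:"


class FakeRedis:
    """In-memory stand-in for an async Redis client (illustration only)."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}

    async def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    async def set(self, key: str, value: str) -> bool:
        self.data[key] = value
        return True

    async def delete(self, *keys: str) -> int:
        return sum(1 for k in keys if self.data.pop(k, None) is not None)


class PrefixingProxy:
    """Rewrites checkpoint_latest:* keys to live under a deployment prefix."""

    def __init__(self, client: FakeRedis, master_prefix: str) -> None:
        self._client = client
        self._prefix = master_prefix

    def _rewrite(self, key: str) -> str:
        # Only pointer keys are touched; every other key passes through as-is.
        return f"{self._prefix}:{key}" if key.startswith(BARE) else key

    async def get(self, key: str) -> Optional[str]:
        return await self._client.get(self._rewrite(key))

    async def set(self, key: str, value: str) -> bool:
        return await self._client.set(self._rewrite(key), value)

    async def delete(self, *keys: str) -> int:
        return await self._client.delete(*(self._rewrite(k) for k in keys))


async def demo() -> tuple:
    shared = FakeRedis()
    env_a = PrefixingProxy(shared, "env-a")
    env_b = PrefixingProxy(shared, "env-b")
    pointer = "checkpoint_latest:t1:__empty__"
    await env_a.set(pointer, "env-a:lg:checkpoint:t1:__empty__:cp-1")
    await env_b.set(pointer, "env-b:lg:checkpoint:t1:__empty__:cp-1")
    return await env_a.get(pointer), sorted(shared.data)


a_value, stored_keys = asyncio.run(demo())
print(a_value)
print(stored_keys)
```

A usable proxy also has to cover EXPIRE, wrap pipeline() and its async context-manager methods, and handle the legacy bare-key fallback described above; those parts are omitted here, and they are where most of the maintenance cost sits.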
Suggested fix
Either:
1. Apply checkpoint_prefix to the latest-pointer keys too. The minimal change is to extract a small helper
   def _make_latest_pointer_key(self, thread_id: str, ns: str) -> str:
       # or any shape that's documented and namespace-scoped
       return f"{self.checkpoint_prefix}:latest:{thread_id}:{ns}"
   and call it from the five sites that currently inline the f-string. Existing deployments will need a one-shot migration of bare → prefixed pointers; this would be easy to include as a migrate_latest_pointers() utility on the saver that scans checkpoint_latest:* once and rewrites each.
2. Or, document that checkpoint_prefix is not sufficient to isolate multiple savers on a shared Redis Stack and recommend separate Redis databases (db=N) or separate instances per deployment. This is the cheaper docs-only fix but it's a footgun for anyone discovering checkpoint_prefix and assuming it does what it appears to.
Option 1 is what we'd prefer (the kwarg's existence implies isolation). Happy to send a PR if there's interest — let us know if you'd want it as a single bump (with migration helper) or split.
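As a rough illustration of what the migration step could look like, the sketch below only plans the renames and leaves the Redis I/O to the caller. plan_pointer_migration is a hypothetical name, the prefixed shape follows the {checkpoint_prefix}:latest:{thread_id}:{ns} example above, and the input is assumed to be a dict of decoded pointer keys and values gathered beforehand.

```python
BARE_PREFIX = "checkpoint_latest:"


def plan_pointer_migration(
    pointers: dict[str, str], checkpoint_prefix: str
) -> dict[str, str]:
    """Map each bare pointer owned by this deployment to its prefixed key."""
    plan: dict[str, str] = {}
    for old_key, target in pointers.items():
        if not old_key.startswith(BARE_PREFIX):
            continue
        # Skip pointers that currently reference another deployment's document.
        if not target.startswith(checkpoint_prefix + ":"):
            continue
        suffix = old_key[len(BARE_PREFIX):]  # "{thread_id}:{ns}"
        plan[old_key] = f"{checkpoint_prefix}:latest:{suffix}"
    return plan


# Hypothetical snapshot of the bare pointers on a shared instance.
bare_pointers = {
    "checkpoint_latest:t1:__empty__": "env-b:lg:checkpoint:t1:__empty__:cp-2",
    "checkpoint_latest:t2:__empty__": "env-a:lg:checkpoint:t2:__empty__:cp-7",
}

plan = plan_pointer_migration(bare_pointers, "env-a:lg:checkpoint")
print(plan)
```

Filtering on the target's prefix means a deployment only claims pointers that still reference its own documents. Pointers already overwritten by another deployment are left alone and would need to be rebuilt from that thread's documents instead.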
Environment
- langgraph-checkpoint-redis==0.4.1 (latest tag as of 2026-05-19), verified on main
- langgraph-checkpoint 2.x
- redis-py 5.x, redisvl 0.x
- Redis Stack 7.x
- Python 3.11 (prod container) and 3.14 (dev)