checkpoint_latest:* pointer keys ignore checkpoint_prefix, colliding across savers on a shared Redis Stack

## Summary

`AsyncRedisSaver` (and the sync `RedisSaver`) accept `checkpoint_prefix` / `checkpoint_write_prefix` so multiple deployments can share a Redis Stack instance without key collisions. JSON document keys (`{prefix}:lg:checkpoint:...`, `{prefix}:lg:checkpoint_write:...`) honor those prefixes correctly. **But the "latest pointer" keys used by `aget_tuple` (when no `checkpoint_id` is supplied) are built as bare strings that ignore the configured prefix:**

```python
latest_pointer_key = f"checkpoint_latest:{storage_safe_thread_id}:{storage_safe_checkpoint_ns}"
```

This makes prefixing effectively useless for the "fetch latest checkpoint" path that the LangGraph runtime uses on every Pregel start.
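To make the collision concrete, here is a minimal sketch that needs no Redis. Both helper functions are hypothetical stand-ins written for this issue, not the library's code; the document-key shape is assumed from the prefixed keys shown in the reproduction below.

```python
# Illustrative only: these two helpers are stand-ins, not the library's functions.

def doc_key(checkpoint_prefix: str, thread_id: str, ns: str, checkpoint_id: str) -> str:
    # Assumed shape, modelled on the prefixed document keys shown in this issue.
    return f"{checkpoint_prefix}:{thread_id}:{ns}:{checkpoint_id}"


def latest_pointer_key(thread_id: str, ns: str) -> str:
    # Same shape as the inlined f-string quoted above: the prefix never appears.
    return f"checkpoint_latest:{thread_id}:{ns}"


doc_a = doc_key("env-a:lg:checkpoint", "t1", "__empty__", "cp-1")
doc_b = doc_key("env-b:lg:checkpoint", "t1", "__empty__", "cp-1")
ptr_a = latest_pointer_key("t1", "__empty__")
ptr_b = latest_pointer_key("t1", "__empty__")

docs_collide = doc_a == doc_b        # False: prefixes keep the documents apart
pointers_collide = ptr_a == ptr_b    # True: both savers address one pointer key
print(docs_collide, pointers_collide)
```

Two savers with different prefixes and the same `thread_id` / namespace get distinct document keys but an identical pointer key.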

## Affected sites (v0.4.1, also present on `main`)

`langgraph/checkpoint/redis/aio.py`:
- L410 — read in `aget_tuple`
- L1010 — write in `aput`
- L1996 — write in `aput`'s cancelled-fallback
- L2179 — delete in `aprune`
- L2046–2048 / L2186–2189 — `pipeline.delete(latest_pointer_key)` in `adelete_thread` / `aprune`

`langgraph/checkpoint/redis/__init__.py` (sync variants): L580 / L758 / L1602 / L1788.

The SET site in `aput` is unmistakable:

```python
latest_pointer_key = f"checkpoint_latest:{storage_safe_thread_id}:{storage_safe_checkpoint_ns}"
await self._redis.set(latest_pointer_key, checkpoint_key)
```

— no prefix folded in.

## Impact

On a Redis Stack hosting multiple deployments (e.g. staging + prod, or multi-tenant), every saver instance reads / writes the same global `checkpoint_latest:{thread}:{ns}` keyspace.

1. **Cross-deployment overwrite.** If two deployments produce the same `thread_id` (the LangGraph runtime leaves `thread_id` entirely to the application; UUIDs make a clash unlikely but do not rule it out), the last writer's pointer wins.
2. **Silent decode failure.** Once a pointer resolves to a doc key under the other env's prefix, the saver's pipeline `JSON.GET checkpoint_key` returns `None` (the doc exists, just under a different prefix). `aget_tuple` falls through to `return None`: the saver reports "no checkpoints exist" and the conversation silently forgets its history.
3. **No log signal.** The failure looks identical to a fresh thread, so the issue is hard to attribute. We initially mistook this for a Pregel bug.

We hit (2) in production after migrating an environment onto a Redis Stack that the other env was already using.
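Because the failure leaves no log signal, a small check on the pointer's target can at least tell "fresh thread" apart from "pointer owned by someone else". The sketch below is an assumption-laden illustration: `classify_pointer` is a hypothetical helper, and it assumes the pointer value is the full document key (as the reproduction output below shows) and that the client decodes responses to `str`.

```python
from typing import Optional


def classify_pointer(pointer_target: Optional[str], checkpoint_prefix: str) -> str:
    """Hypothetical diagnostic: who owns the doc a latest-pointer refers to?"""
    if pointer_target is None:
        return "missing"   # no pointer at all: genuinely a fresh thread
    if pointer_target.startswith(f"{checkpoint_prefix}:"):
        return "own"       # pointer refers to a doc under our prefix
    return "foreign"       # pointer was written by another deployment


fresh = classify_pointer(None, "env-a:lg:checkpoint")
own = classify_pointer("env-a:lg:checkpoint:t1:__empty__:cp-1", "env-a:lg:checkpoint")
foreign = classify_pointer("env-b:lg:checkpoint:t1:__empty__:cp-1", "env-a:lg:checkpoint")
print(fresh, own, foreign)
```

Logging a warning on the `"foreign"` case would have pointed us at the shared keyspace instead of at Pregel.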

## Reproduction

Local Redis Stack (`docker run --rm -d -p 6379:6379 redis/redis-stack:latest`), `langgraph-checkpoint-redis==0.4.1`:

```python
import asyncio
from redis.asyncio import Redis
from langgraph.checkpoint.redis.aio import AsyncRedisSaver
from langgraph.checkpoint.base import empty_checkpoint


async def main() -> None:
    admin = Redis.from_url("redis://localhost:6379", decode_responses=False)
    await admin.flushdb()

    # Two deployments sharing one Redis, each with its own prefix.
    saver_a = AsyncRedisSaver(
        redis_client=Redis.from_url("redis://localhost:6379", decode_responses=False),
        checkpoint_prefix="env-a:lg:checkpoint",
        checkpoint_write_prefix="env-a:lg:checkpoint_write",
    )
    saver_b = AsyncRedisSaver(
        redis_client=Redis.from_url("redis://localhost:6379", decode_responses=False),
        checkpoint_prefix="env-b:lg:checkpoint",
        checkpoint_write_prefix="env-b:lg:checkpoint_write",
    )
    await saver_a.asetup()
    await saver_b.asetup()

    cfg = {"configurable": {"thread_id": "t1", "checkpoint_ns": ""}}

    cp_a = empty_checkpoint(); cp_a["channel_values"] = {"owner": "A"}
    cp_b = empty_checkpoint(); cp_b["channel_values"] = {"owner": "B"}
    await saver_a.aput(cfg, cp_a, {}, {})
    await saver_b.aput(cfg, cp_b, {}, {})  # overwrites A's bare pointer

    # sorted() cannot consume an async generator, so collect into a list first.
    print("A keys:", sorted([k async for k in admin.scan_iter(match="env-a:*")]))
    print("B keys:", sorted([k async for k in admin.scan_iter(match="env-b:*")]))
    print("bare pointer:", await admin.get(b"checkpoint_latest:t1:__empty__"))

    tup_a = await saver_a.aget_tuple(cfg)
    print("A read:", tup_a)  # ← None: A's checkpoint is silently dropped


asyncio.run(main())
```

Output:
```
A keys: [b'env-a:lg:checkpoint:t1:__empty__:...']   # A's doc still exists under env-a prefix
B keys: [b'env-b:lg:checkpoint:t1:__empty__:...']   # B's doc under env-b prefix
bare pointer: b'env-b:lg:checkpoint:t1:__empty__:...'  # global pointer now points at B's doc
A read: None                                        # A's aget_tuple can't decode env-b's doc
```

## Workaround

We shipped a wrapper that proxies the Redis client (and its `pipeline()`) passed into the saver. It intercepts the four ops the library does against `checkpoint_latest:*` (GET / SET / EXPIRE / DELETE) and rewrites the keys to live under our deployment's master prefix. A read-time fallback honours the legacy bare key only when the doc it points at lives under our env, so pre-deploy active threads continue working without leaking cross-env data.

This works but it's ~120 lines of proxy code + 19 unit tests + a real-Redis smoke suite + a one-shot migration script we'd rather not maintain against future versions of this library. It also broke once because our initial proxy missed `__aenter__` / `__aexit__` on the pipeline wrapper, which `redisvl/index/storage.py:awrite` requires — so this workaround is fragile in ways `checkpoint_prefix` users probably don't expect to need to know.
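For readers who need a stopgap, the core of the idea looks roughly like the sketch below. It is a trimmed illustration, not our production wrapper: `PrefixingProxy` and `FakeRedis` are hypothetical names, the in-memory fake exists only so the example runs without a server, and the legacy-key read fallback is left out.

```python
import asyncio
from typing import Any

BARE = "checkpoint_latest:"  # bare pointer namespace described in this issue


class PrefixingProxy:
    """Hypothetical wrapper: rewrites checkpoint_latest:* keys, forwards the rest."""

    def __init__(self, inner: Any, master_prefix: str) -> None:
        self._inner = inner
        self._master_prefix = master_prefix

    def _rewrite(self, key: Any) -> Any:
        is_bytes = isinstance(key, bytes)
        text = key.decode() if is_bytes else key
        if not (isinstance(text, str) and text.startswith(BARE)):
            return key  # not a pointer key: pass through untouched
        scoped = f"{self._master_prefix}:{text}"
        return scoped.encode() if is_bytes else scoped

    # The four operations the issue lists against checkpoint_latest:*.
    def get(self, key, *a, **kw):
        return self._inner.get(self._rewrite(key), *a, **kw)

    def set(self, key, value, *a, **kw):
        return self._inner.set(self._rewrite(key), value, *a, **kw)

    def expire(self, key, *a, **kw):
        return self._inner.expire(self._rewrite(key), *a, **kw)

    def delete(self, *keys):
        return self._inner.delete(*(self._rewrite(k) for k in keys))

    def pipeline(self, *a, **kw):
        # Pipelines queue the same commands, so they need the same wrapping.
        return PrefixingProxy(self._inner.pipeline(*a, **kw), self._master_prefix)

    # `async with` looks these up on the type, so __getattr__ alone is not enough.
    async def __aenter__(self):
        await self._inner.__aenter__()
        return self

    async def __aexit__(self, *exc):
        return await self._inner.__aexit__(*exc)

    def __getattr__(self, name: str) -> Any:
        return getattr(self._inner, name)  # everything else is forwarded as-is


class FakeRedis:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self) -> None:
        self.data: dict = {}

    async def get(self, key):
        return self.data.get(key)

    async def set(self, key, value):
        self.data[key] = value
        return True


async def demo():
    shared = FakeRedis()
    env_a = PrefixingProxy(shared, "env-a")
    env_b = PrefixingProxy(shared, "env-b")
    key = "checkpoint_latest:t1:__empty__"
    await env_a.set(key, "env-a:lg:checkpoint:t1:__empty__:cp-1")
    await env_b.set(key, "env-b:lg:checkpoint:t1:__empty__:cp-1")
    return await env_a.get(key), sorted(shared.data)


value_a, stored_keys = asyncio.run(demo())
print(value_a, stored_keys)
```

With the proxy in place, each deployment's pointer lands under its own master prefix, so env-b's write no longer replaces env-a's pointer.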

## Suggested fix

Either:

1. **Apply `checkpoint_prefix` to the latest-pointer keys too.** The minimal change is to extract a small helper

   ```python
   def _make_latest_pointer_key(self, thread_id: str, ns: str) -> str:
       return f"{self.checkpoint_prefix}:latest:{thread_id}:{ns}"
       # or any shape that's documented and namespace-scoped
   ```

   and call it from the five sites that currently inline the f-string. Existing deployments will need a one-shot migration of bare → prefixed pointers — easy to include as a `migrate_latest_pointers()` utility on the saver that scans `checkpoint_latest:*` once and rewrites each.

2. **Or, document that `checkpoint_prefix` is not sufficient to isolate multiple savers on a shared Redis Stack** and recommend separate Redis databases (`db=N`) or separate instances per deployment. This is the cheaper docs-only fix but it's a footgun for anyone discovering `checkpoint_prefix` and assuming it does what it appears to.

Option 1 is what we'd prefer (the kwarg's existence implies isolation). Happy to send a PR if there's interest — let us know if you'd want it as a single bump (with migration helper) or split.
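As a starting point for discussion, the one-shot migration for option 1 could look roughly like this. Everything here is a sketch under assumptions: `migrate_latest_pointers` is shown as a free function rather than a saver method, the prefixed key shape follows the helper proposed above, the client is assumed to return `str` (`decode_responses=True`), and `FakeRedis` is an in-memory stand-in exposing only the `scan_iter` / `get` / `set` / `delete` calls the function uses.

```python
import asyncio

BARE = "checkpoint_latest:"


async def migrate_latest_pointers(client, checkpoint_prefix: str) -> int:
    """Move bare pointers whose target doc lives under our prefix; return the count."""
    moved = 0
    async for bare_key in client.scan_iter(match=f"{BARE}*"):
        target = await client.get(bare_key)
        if target is None or not target.startswith(f"{checkpoint_prefix}:"):
            continue  # pointer refers to another deployment's doc: leave it alone
        # Reuse the "{thread_id}:{ns}" suffix verbatim instead of re-parsing it.
        suffix = bare_key[len(BARE):]
        await client.set(f"{checkpoint_prefix}:latest:{suffix}", target)
        await client.delete(bare_key)
        moved += 1
    return moved


class FakeRedis:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self, data: dict) -> None:
        self.data = dict(data)

    async def scan_iter(self, match: str):
        prefix = match.rstrip("*")
        # Iterate over a snapshot so deletes during the scan are safe here.
        for key in [k for k in self.data if k.startswith(prefix)]:
            yield key

    async def get(self, key):
        return self.data.get(key)

    async def set(self, key, value):
        self.data[key] = value

    async def delete(self, key):
        self.data.pop(key, None)


store = FakeRedis({
    "checkpoint_latest:t1:__empty__": "env-a:lg:checkpoint:t1:__empty__:cp-1",
    "checkpoint_latest:t2:__empty__": "env-b:lg:checkpoint:t2:__empty__:cp-9",
})
moved = asyncio.run(migrate_latest_pointers(store, "env-a:lg:checkpoint"))
print(moved, sorted(store.data))
```

Filtering on the target's prefix keeps one deployment's migration from claiming another's pointers. A bare pointer that was already overwritten by a different deployment cannot be recovered this way, since only the last writer's value survives.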

## Environment

- `langgraph-checkpoint-redis==0.4.1` (latest tag as of 2026-05-19), verified on `main`
- `langgraph-checkpoint` 2.x
- `redis-py` 5.x, `redisvl` 0.x
- Redis Stack 7.x
- Python 3.11 (prod container) and 3.14 (dev)
