feat(server): Adds a tag to RDB entry#6926

Draft
abhijat wants to merge 10 commits into main from abhijat/feat/tagged-chunk-rdb-format

Conversation


@abhijat abhijat commented Mar 19, 2026

WIP

On flush, the serializer prefixes an envelope (opcode, tag, size) to the chunk for baseline (key, value) entries.

The tag for baseline data is a monotonically increasing 4-byte stream id. A new id is assigned per SaveEntry call (i.e. per key).
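As a rough illustration, the envelope could look like this. This is a minimal sketch: the opcode value, the struct name, and the PrefixEnvelope helper are all hypothetical, not the PR's actual code, and fields are written in host byte order for simplicity.

```cpp
#include <cstdint>
#include <string>

// Hypothetical placeholder opcode introducing a tagged chunk.
constexpr uint8_t kOpTaggedChunk = 0xF0;

struct ChunkEnvelope {
  uint8_t opcode;      // opcode introducing a tagged chunk
  uint32_t stream_id;  // one id per SaveEntry call (i.e. per key)
  uint32_t size;       // payload bytes that follow the envelope
};

// Prefix the envelope to a serialized payload (host byte order).
std::string PrefixEnvelope(uint32_t stream_id, const std::string& payload) {
  ChunkEnvelope env{kOpTaggedChunk, stream_id,
                    static_cast<uint32_t>(payload.size())};
  std::string out;
  out.push_back(static_cast<char>(env.opcode));
  out.append(reinterpret_cast<const char*>(&env.stream_id),
             sizeof(env.stream_id));
  out.append(reinterpret_cast<const char*>(&env.size), sizeof(env.size));
  out += payload;
  return out;
}
```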

Journal entries and other non-baseline types use a sentinel value; they are not tagged, but they are still stashed.

To guard against interleaved entries (baseline/journal or baseline/baseline) caused by, e.g., a yield during a large SaveEntry, a stash is added to the serializer along with a new piece of state: current_stream_id_.

  • WriteJournalEntry sets the current stream id to the sentinel value up front.
  • Before the yield (consume_fun_), we store the current stream id.
  • After the yield, if the current stream id changed (a journal entry was added while the fiber was yielded, or another key was serialized), the memory buffer contents are stashed under the correct stream id.
  • When flushing, the stash (which already contains tagged chunks) is prefixed to the current data in the memory buffer (which is also tagged during flush).
  • The flush decision (size threshold) takes both the memory buffer size and the stash size into account.
  • These serializer operations are gated by the boolean flag from the serialization_tagged_chunks setting.
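The stash-around-yield flow in the bullets above can be sketched roughly like this. All names (SerializerSketch, YieldSafely, PendingBytes) are hypothetical; the real serializer is structured differently, but the invariant is the same: if another fiber wrote while we yielded, the pre-yield buffer contents must be stashed under the old stream id.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <utility>

struct SerializerSketch {
  uint32_t current_stream_id = 0;
  std::string mem_buf;                                 // in-progress chunk
  std::deque<std::pair<uint32_t, std::string>> stash;  // tagged chunks

  // Wraps the potential yield point (consume_fun_ in the description).
  template <typename YieldFn>
  void YieldSafely(YieldFn&& yield) {
    uint32_t id_before = current_stream_id;
    yield();  // another fiber may serialize here and change the stream id
    if (current_stream_id != id_before && !mem_buf.empty()) {
      // Buffer belongs to the pre-yield stream: stash it under the old id.
      stash.emplace_back(id_before, std::move(mem_buf));
      mem_buf.clear();
    }
  }

  // The flush threshold counts both the buffer and the stash.
  size_t PendingBytes() const {
    size_t n = mem_buf.size();
    for (const auto& [id, data] : stash) n += data.size();
    return n;
  }
};
```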

Compression of stashed entries is skipped. It might be possible to add it, but currently the number_of_chunks_ field is saved and restored across a stash, because the stash function uses PrepareFlush.

TODO in this PR:

  • tests

@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch from 41cce0b to b002506 Compare March 19, 2026 08:50

abhijat commented Mar 19, 2026

this is one alternative approach to #6906


abhijat commented Mar 19, 2026

will add more details in #6831


romange commented Mar 19, 2026

It's not true that we need it only for replication with stream_journal.

We may have interleaved streams if, for example, two fibers try to write values during a backup: one is serializing a huge value while another writes a different entry, which can slip in between the chunks of the huge value's serialization if we lift bi_value_m_.

I wrote explicitly that the acceptance criterion is that rdb_tests (snapshotting) should pass with the new format.

@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch from b002506 to d300a9c Compare March 19, 2026 10:49
@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 4 times, most recently from 541cb7e to cde5f4a Compare March 20, 2026 12:00

abhijat commented Mar 21, 2026

It might be better to tag on bucket ids (plus one fixed tag for journal entries) than to tag only baseline entries. For example:

  • Bucket X is being iterated by fiber A, which yields midway while saving a big entry.
  • An OnDbChange fires for bucket Y. Per https://github.com/dragonflydb/dragonfly/blob/main/docs/shard-serialization.md, this is a legal write to the serializer because bucket Y's state is NotCovered, so fiber B writes some data to the serializer and then yields mid-SaveEntry (e.g. a big hmap).
  • Fiber A resumes, writes more data to the serializer's memory buffer, and finishes.
  • Fiber B resumes, writes the rest of bucket Y to the serializer, and finishes.

Now the serializer data looks like [XXXXYYYXXX], which is corrupt even though all tags are correctly marked baseline. If we instead tag by bucket id (a monotonically increasing 4-byte int) and use the stashing system already in this PR, the receiver can reassemble the chunks for different buckets and apply them separately, as well as replay journal entries correctly. We would also need to mark the end of a bucket with something like a zero-length chunk tagged with the bucket id, written at the end of saving the bucket.

For this to work, the loader now has to maintain one loader per bucket (the maximum number of loaders at a time equals the maximum number of buckets in the stream at a time); the end marker tells when a bucket is done, so the loader can be torn down or reused.
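Receiver-side reassembly by tag, as described above, amounts to concatenating chunks per tag so that each bucket's bytes end up contiguous again. A minimal sketch with a hypothetical Reassemble helper:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Given (tag, payload) chunks in arrival order, concatenate per tag.
// For the [XXXXYYYXXX] example above, tag X's chunks rejoin into one
// contiguous byte sequence even though tag Y's chunk arrived in between.
std::map<uint32_t, std::string> Reassemble(
    const std::vector<std::pair<uint32_t, std::string>>& chunks) {
  std::map<uint32_t, std::string> by_tag;
  for (const auto& [tag, data] : chunks) by_tag[tag] += data;
  return by_tag;
}
```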


romange commented Mar 21, 2026

But how are bucket ids related to this? We just need to reassemble a single value, while a bucket holds multiple values. We can have a unique tag per key; a key itself can be such a tag, for example.


abhijat commented Mar 21, 2026

But how are bucket ids related to this? We just need to reassemble a single value, while a bucket holds multiple values. We can have a unique tag per key; a key itself can be such a tag, for example.

There are two reasons I thought of buckets:

  • In this PR I stash the contents of the serializer's memory buffer into a tagged chunk when the tag changes. So if the tag changes per bucket, and a bucket is serialized without yielding, it is sent as a single chunk with no stashing or tagging. With per-key tags there will be a chunk stashed for each key in each bucket, but it will still work.
  • I viewed the bucket commands produced by SerializeBucket as one stream sequence that is combined and applied together, but a key works the same way.

Looking at the loader side: because we already support incremental loading of keys via now_chunked_, it might be much easier to reassemble if the tag maps to the key (plus the db or any other information that might make it unique).

The main things that will change in the loader are:

  • Right now, LoadKeyValPair assumes that a key's content is contiguous in the stream.
  • With tagged chunks this is no longer true, so the LoadKeyValPair loop needs an additional stop condition: if the chunk size is exhausted, store the partially read key in the now_chunked_ map and process the rest of the stream.
  • Later, when a section of the stream for the same key is seen again, the key is picked from the map and appended to, which already happens in FromOpaque etc.

So with the key as the tag (or a counter that maps to the key), the end-chunk marker is also not needed; the incremental parsing in LoadConfig will already take care of that.


abhijat commented Mar 21, 2026

Also, if baseline entries are tagged by key, journal entries do not need to be tagged: they are already delimited by opcode and never split, so the loader will correctly treat them as distinct from key-tagged entries without a chunk header.


abhijat commented Mar 21, 2026

Rough steps for the loader changes to do next in this PR:

  • In the main loader loop, when RDB_OPCODE_TAGGED_CHUNK is seen, read the stream id and the payload size.
  • Fall through to loading the kv pair, checking the stream-state map (a new map keyed by stream id).
  • If the stream id is not in the stream-state map, load both the key and the value, cutting off the LoadKeyValPair loop at the payload size (updated using the mem buf input length?).
  • Store the partial entry state (pending bytes remaining, etc.) in the map.
  • If the stream id is in the state map, don't read the key; read the partial value (similar to LoadKeyValPair; extract a helper), using the same CreateObjectOnShard etc. for incremental parsing, with some changes needed to update the stream-state map.
  • Once the read finishes (after potentially many chunks), CreateObjectOnShard will save the entry; remove it from the stream state.
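The steps above can be sketched as follows. All names are hypothetical, and the is_last_chunk flag stands in for the incremental parser detecting that a value is complete (with key-derived tags there is no explicit end marker, as noted earlier).

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

struct PartialEntry {
  std::string key;           // parsed from the first chunk (elided here)
  std::string value_so_far;  // stands in for the opaque object state
};

struct LoaderSketch {
  std::map<uint32_t, PartialEntry> stream_state;

  // Returns the finished entry if this chunk completed it.
  std::optional<PartialEntry> OnTaggedChunk(uint32_t stream_id,
                                            const std::string& payload,
                                            bool is_last_chunk) {
    auto it = stream_state.find(stream_id);
    if (it == stream_state.end()) {
      // First chunk for this stream: real code would parse the key here,
      // then start reading the value, bounded by the payload size.
      PartialEntry e;
      e.value_so_far = payload;
      it = stream_state.emplace(stream_id, std::move(e)).first;
    } else {
      // Resume: append to the partially read value; no key to read.
      it->second.value_so_far += payload;
    }
    if (is_last_chunk) {
      PartialEntry done = std::move(it->second);
      stream_state.erase(it);
      return done;  // here CreateObjectOnShard would persist the entry
    }
    return std::nullopt;
  }
};
```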

@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 5 times, most recently from a5a81cb to 9efb38b Compare March 22, 2026 08:39
@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 5 times, most recently from cbc4da0 to b7d6925 Compare March 23, 2026 07:12

abhijat commented Mar 23, 2026

The lower-level read functions in rdb_load (ReadSet etc.) also need to stop at the payload size, not just after "n" items. We already have that condition in LoadKeyValPair etc.; the same condition needs to be pushed down into the lower layers too.
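Pushing the payload-size bound into a lower-level reader can be sketched like this. Names are hypothetical, and the real ReadSet works on the RDB stream rather than an in-memory vector; the point is the dual stop condition.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reads up to n_items items, but never past *payload_budget bytes:
// stops when either the item count or the chunk's byte budget runs out.
// Returns the number of items read; decrements the budget in place.
size_t ReadItemsBounded(const std::vector<std::string>& stream_items,
                        size_t n_items, size_t* payload_budget,
                        std::vector<std::string>* out) {
  size_t read = 0;
  for (const auto& item : stream_items) {
    if (read == n_items || item.size() > *payload_budget) break;
    *payload_budget -= item.size();
    out->push_back(item);
    ++read;
  }
  return read;
}
```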

@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 3 times, most recently from bb5d11f to ebb14b3 Compare March 23, 2026 16:16
@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 5 times, most recently from 0bb5174 to eee8dc4 Compare March 25, 2026 09:58

abhijat commented Mar 25, 2026

At this point, the one crucial thing left to add to this PR is tests that actually interleave data across keys/buckets, i.e. preempt and write.

@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 3 times, most recently from b56a290 to 28a50d1 Compare March 26, 2026 11:14

abhijat commented Mar 26, 2026

In the latest version:

  • The stream id is now optional: it is only used for entries that split on Push -> Flush -> Yield.
  • The serializer self-assigns a stream id on a potential split; the caller does not need to.
  • Entries that fit in one chunk and finish without reaching the flush boundary are not tagged.
  • Journal entries and other non-data entries (like the journal offset) are not tagged.
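The lazy self-assignment described above can be sketched as follows. TagPolicy and its members are hypothetical names; 0 stands in for "untagged".

```cpp
#include <cstdint>

struct TagPolicy {
  uint32_t next_id = 1;
  uint32_t current = 0;  // 0 == untagged

  // Called when a Push -> Flush -> Yield could split the current entry:
  // assigns a stream id lazily, only on the first potential split.
  uint32_t OnPotentialSplit() {
    if (current == 0) current = next_id++;
    return current;
  }

  // Entry finished: the next small entry starts untagged again.
  void OnEntryDone() { current = 0; }

  bool Tagged() const { return current != 0; }
};
```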

@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch from 1f3b918 to 1d38527 Compare March 27, 2026 07:51
@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch from 1d38527 to bf1687f Compare March 27, 2026 08:01
abhijat added 2 commits March 27, 2026 15:36
@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch from bf1687f to debd811 Compare March 27, 2026 10:06

abhijat commented Mar 27, 2026

The new test forcefully interleaves multiple keys and inserts journal entries and offset commands around each chunk of a key.

abhijat added 2 commits March 31, 2026 10:36
@abhijat abhijat force-pushed the abhijat/feat/tagged-chunk-rdb-format branch 4 times, most recently from ee4617d to d3e1760 Compare April 2, 2026 15:26
