Commit 469c606
[phase-31 4/4] PostgreSQL metastore — migration + compaction columns (#6245)
* feat: replace fixed MetricDataPoint fields with dynamic tag HashMap
* feat: replace ParquetField enum with constants and dynamic validation
* feat: derive sort order and bloom filters from batch schema
* feat: union schema accumulation and schema-agnostic ingest validation
* feat: dynamic column lookup in split writer
* feat: remove ParquetSchema dependency from indexing actors
* refactor: deduplicate test batch helpers
* lint
* feat(31): sort schema foundation — proto, parser, display, validation, window, TableConfig
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: rustdoc link errors — use backticks for private items
* feat(31): compaction metadata types — extend split metadata, postgres model, field lookup
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(31): wire TableConfig into sort path, add compaction KV metadata
Wire TableConfig-driven sort order into ParquetWriter and add
self-describing Parquet file metadata for compaction:
- ParquetWriter::new() takes &TableConfig, resolves sort fields at
construction via parse_sort_fields() + ParquetField::from_name()
- sort_batch() uses resolved fields with per-column direction (ASC/DESC)
- SS-1 debug_assert verification: re-sort and check identity permutation
- build_compaction_key_value_metadata(): embeds sort_fields, window_start,
window_duration, num_merge_ops, row_keys (base64) in Parquet kv_metadata
- SS-5 verify_ss5_kv_consistency(): kv_metadata matches source struct
- write_to_file_with_metadata() replaces write_to_file()
- prepare_write() shared method for bytes and file paths
- ParquetWriterConfig gains to_writer_properties_with_metadata()
- ParquetSplitWriter passes TableConfig through
- All callers in quickwit-indexing updated with TableConfig::default()
- 23 storage tests pass including META-07 self-describing roundtrip
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
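
The per-column direction sort and the SS-1 "re-sort and check identity permutation" check described above can be illustrated with a minimal sketch. It uses plain `Vec<i64>` rows rather than Arrow batches, and `Direction`, `sort_permutation`, and `sort_rows` are hypothetical names, not the ParquetWriter API.

```rust
use std::cmp::Ordering;

/// Sort direction for one column (hypothetical stand-in for a resolved sort field).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Direction {
    Asc,
    Desc,
}

/// Compares two rows column by column, honoring each column's direction.
fn compare_rows(a: &[i64], b: &[i64], keys: &[(usize, Direction)]) -> Ordering {
    for &(col, dir) in keys {
        let ord = match dir {
            Direction::Asc => a[col].cmp(&b[col]),
            Direction::Desc => a[col].cmp(&b[col]).reverse(),
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

/// Returns the row-index permutation that sorts `rows` by `keys`.
/// `sort_by` is stable, so rows that compare equal keep their relative order.
pub fn sort_permutation(rows: &[Vec<i64>], keys: &[(usize, Direction)]) -> Vec<usize> {
    let mut perm: Vec<usize> = (0..rows.len()).collect();
    perm.sort_by(|&a, &b| compare_rows(&rows[a], &rows[b], keys));
    perm
}

/// Sorts the rows, then verifies in debug builds that sorting the output
/// again is a no-op (the permutation is the identity).
pub fn sort_rows(rows: &[Vec<i64>], keys: &[(usize, Direction)]) -> Vec<Vec<i64>> {
    let perm = sort_permutation(rows, keys);
    let sorted: Vec<Vec<i64>> = perm.iter().map(|&i| rows[i].clone()).collect();
    debug_assert!(sort_permutation(&sorted, keys)
        .iter()
        .copied()
        .eq(0..sorted.len()));
    sorted
}
```

The identity check relies on the sort being stable: a correctly sorted input then maps every row to its own index.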
* feat(31): PostgreSQL migration 27 + compaction columns in stage/list/publish
Add compaction metadata to the PostgreSQL metastore:
Migration 27:
- 6 new columns: window_start, window_duration_secs, sort_fields,
num_merge_ops, row_keys, zonemap_regexes
- Partial index idx_metrics_splits_compaction_scope on
(index_uid, sort_fields, window_start) WHERE split_state = 'Published'
stage_metrics_splits:
- INSERT extended from 15 to 21 bind parameters for compaction columns
- ON CONFLICT SET updates all compaction columns
list_metrics_splits:
- PgMetricsSplit construction includes compaction fields (defaults from JSON)
Also fixes pre-existing compilation errors on upstream-10b-parquet-actors:
- Missing StageMetricsSplitsRequestExt import
- index_id vs index_uid type mismatches in publish/mark/delete
- IndexUid binding (to_string() for sqlx)
- ListMetricsSplitsResponseExt trait disambiguation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(31): close port gaps — split_writer metadata, compaction scope, publish validation
Close critical gaps identified during port review:
split_writer.rs:
- Store table_config on ParquetSplitWriter (not just pass-through)
- Compute window_start from batch time range using table_config.window_duration_secs
- Populate sort_fields, window_duration_secs, parquet_files on metadata before write
- Call write_to_file_with_metadata(Some(&metadata)) to embed KV metadata in Parquet
- Update size_bytes after write completes
metastore/mod.rs:
- Add window_start and sort_fields fields to ListMetricsSplitsQuery
- Add with_compaction_scope() builder method
metastore/postgres/metastore.rs:
- Add compaction scope filters (AND window_start = $N, AND sort_fields = $N) to list query
- Add replaced_split_ids count verification in publish_metrics_splits
- Bind compaction scope query parameters
ingest/config.rs:
- Add table_config: TableConfig field to ParquetIngestConfig
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
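
One plausible way to derive `window_start` from a batch's time range is to align the earliest timestamp down to a multiple of the window duration. This is a sketch under that assumption; the function name and the choice of `i64` seconds are illustrative, not taken from split_writer.rs.

```rust
/// Aligns `timestamp_secs` down to the start of its window.
/// Returns `None` when the window duration is not positive (windowing disabled).
pub fn window_start(timestamp_secs: i64, window_duration_secs: i64) -> Option<i64> {
    if window_duration_secs <= 0 {
        return None;
    }
    // rem_euclid keeps the remainder non-negative, so timestamps before the
    // epoch also align downward rather than toward zero.
    Some(timestamp_secs - timestamp_secs.rem_euclid(window_duration_secs))
}
```

For example, with a 3600-second window a batch whose earliest timestamp is 3725 falls in the window starting at 3600.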
* fix(31): final gap fixes — file-backed scope filter, META-07 test, dead code removal
- file_backed_index/mod.rs: Add window_start and sort_fields filtering
to metrics_split_matches_query() for compaction scope queries
- writer.rs: Add test_meta07_self_describing_parquet_roundtrip test
(writes compaction metadata to Parquet, reads back from cold file,
verifies all fields roundtrip correctly)
- fields.rs: Remove dead sort_order() method (replaced by TableConfig)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(31): correct postgres types for window_duration_secs and zonemap_regexes
Gap 1: Change window_duration_secs from i32 to Option<i32> in both
PgMetricsSplit and InsertableMetricsSplit. Pre-Phase-31 splits now
correctly map 0 → NULL in PostgreSQL, enabling Phase 32 compaction
queries to use `WHERE window_duration_secs IS NOT NULL` instead of
the fragile `WHERE window_duration_secs > 0`.
Gap 2: Change zonemap_regexes from String to serde_json::Value in
both structs. This maps directly to JSONB in sqlx, avoiding ambiguity
when PostgreSQL JSONB operators are used in Phase 34/35 zonemap pruning.
Gap 3: Add two missing tests:
- test_insertable_from_metadata_with_compaction_fields: verifies all 6
compaction fields round-trip through InsertableMetricsSplit
- test_insertable_from_metadata_pre_phase31_defaults: verifies pre-Phase-31
metadata produces window_duration_secs: None, zonemap_regexes: json!({})
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
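
The 0 → NULL mapping from Gap 1 can be sketched as a small conversion. The function name is hypothetical; with sqlx, an `Option<i32>` bound as `None` is stored as SQL NULL. Note that a later commit in this same squash changes `InsertableMetricsSplit.window_duration_secs` back to `i32`.

```rust
/// Maps the in-memory window duration to a nullable column value:
/// 0 (no window, e.g. pre-Phase-31 splits) becomes `None`, anything else `Some`.
pub fn window_duration_to_column(window_duration_secs: i32) -> Option<i32> {
    (window_duration_secs != 0).then_some(window_duration_secs)
}
```

Queries can then select windowed splits with `IS NOT NULL` instead of comparing against a sentinel value.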
* style: rustfmt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test(31): add metrics split test suite to shared metastore_test_suite! macro
11 tests covering the full metrics split lifecycle:
- stage (happy path + non-existent index error)
- stage upsert (ON CONFLICT update)
- list by state, time range, metric name, compaction scope
- publish (happy path + non-existent split error)
- mark for deletion
- delete (happy path + idempotent non-existent)
Tests are generic and run against both file-backed and PostgreSQL backends.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(31): read compaction columns in list_metrics_splits, fix cleanup_index FK
* fix(31): correct error types for non-existent metrics splits
- publish_metrics_splits: return NotFound (not FailedPrecondition) when
staged splits don't exist
- delete_metrics_splits: succeed silently (idempotent) for non-existent
splits instead of returning FailedPrecondition
- Tests now assert the correct error types on both backends
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
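
The error semantics described above (NotFound on publishing a missing split, silent success on deleting one) can be sketched with an in-memory store. All types here are hypothetical stand-ins, and the rule that only splits marked for deletion may be deleted is an assumption of the sketch, not something this commit message states.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SplitState {
    Staged,
    Published,
    MarkedForDeletion,
}

#[derive(Debug, PartialEq)]
pub enum MetastoreError {
    NotFound(String),
    FailedPrecondition(String),
}

#[derive(Default)]
pub struct Store {
    splits: HashMap<String, SplitState>,
}

impl Store {
    pub fn stage(&mut self, id: &str) {
        self.splits.insert(id.to_string(), SplitState::Staged);
    }

    pub fn mark_for_deletion(&mut self, id: &str) {
        if let Some(state) = self.splits.get_mut(id) {
            *state = SplitState::MarkedForDeletion;
        }
    }

    /// Publishing a split that does not exist is NotFound; publishing one
    /// that exists but is not staged is FailedPrecondition.
    pub fn publish(&mut self, ids: &[&str]) -> Result<(), MetastoreError> {
        for id in ids {
            match self.splits.get(*id) {
                None => return Err(MetastoreError::NotFound(id.to_string())),
                Some(SplitState::Staged) => {}
                Some(_) => return Err(MetastoreError::FailedPrecondition(id.to_string())),
            }
        }
        for id in ids {
            self.splits.insert(id.to_string(), SplitState::Published);
        }
        Ok(())
    }

    /// Deleting is idempotent: ids that do not exist are skipped silently.
    /// Assumption: existing splits must be marked for deletion first.
    pub fn delete(&mut self, ids: &[&str]) -> Result<(), MetastoreError> {
        for id in ids {
            match self.splits.get(*id) {
                None | Some(SplitState::MarkedForDeletion) => {}
                Some(_) => return Err(MetastoreError::FailedPrecondition(id.to_string())),
            }
        }
        for id in ids {
            self.splits.remove(*id);
        }
        Ok(())
    }
}
```

Both methods validate every id before mutating anything, so a failed call leaves the store unchanged.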
* style: rustfmt metastore tests and postgres
* fix(31): address PR review — align metrics_splits with splits table
- Migration 27: add maturity_timestamp, delete_opstamp, node_id columns
and publish_timestamp trigger to match the splits table (Paul's review)
- ListMetricsSplitsQuery: adopt FilterRange<i64> for time_range (matching
log-side pattern), single time_range field for both read and compaction
paths, add node_id/delete_opstamp/update_timestamp/create_timestamp/
mature filters to close gaps with ListSplitsQuery
- Use SplitState enum instead of stringly-typed Vec<String> for split_states
- StoredMetricsSplit: add create_timestamp, node_id, delete_opstamp,
maturity_timestamp so file-backed metastore can filter on them locally
- File-backed filter: use FilterRange::overlaps_with() for time range and
window intersection, apply all new filters matching log-side predicate
- Postgres: intersection semantics for window queries, FilterRange-based
SQL generation for all range filters
- Fix InsertableMetricsSplit.window_duration_secs from Option<i32> to i32
- Rename two-letter variables (ws, sf, dt) throughout
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
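
The intersection semantics used for time-range and window filtering can be illustrated with a simplified range type. `RangeFilter` below is a hypothetical stand-in built on `std::ops::Bound`; it is not Quickwit's `FilterRange`, whose exact definition is not shown in this commit message.

```rust
use std::ops::Bound;

/// A filter range with independently bounded ends (hypothetical stand-in).
pub struct RangeFilter {
    pub start: Bound<i64>,
    pub end: Bound<i64>,
}

impl RangeFilter {
    /// Returns true if the filter intersects the closed interval `[lo, hi]`.
    /// Assumes `lo <= hi`.
    pub fn overlaps_with(&self, lo: i64, hi: i64) -> bool {
        // The filter must begin at or before the interval's upper end...
        let start_ok = match self.start {
            Bound::Included(start) => start <= hi,
            Bound::Excluded(start) => start < hi,
            Bound::Unbounded => true,
        };
        // ...and finish at or after the interval's lower end.
        let end_ok = match self.end {
            Bound::Included(end) => end >= lo,
            Bound::Excluded(end) => end > lo,
            Bound::Unbounded => true,
        };
        start_ok && end_ok
    }
}
```

A split whose time range only touches an excluded end of the filter does not match, while an unbounded filter matches every split.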
* style: fix rustfmt nightly formatting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update quickwit/quickwit-parquet-engine/src/table_config.rs
Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
* Update quickwit/quickwit-parquet-engine/src/table_config.rs
Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
* style: rustfmt long match arm in default_sort_fields
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: make parquet_file field backward-compatible in MetricsSplitMetadata
Pre-existing splits were serialized before the parquet_file field was
added, so their JSON doesn't contain it. Adding #[serde(default)]
makes deserialization fall back to empty string for old splits.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: handle empty-column batches in accumulator flush
When the commit timeout fires and the accumulator contains only
zero-column batches, union_fields is empty and concat_batches fails
with "must either specify a row count or at least one column".
Now flush_internal treats empty union_fields the same as empty
pending_batches — resets state and returns None.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
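
The flush guard described above can be sketched without Arrow. `Batch` and `Accumulator` here are simplified, hypothetical types that track only column names and row counts; the point is the early return when either the pending batches or the union schema is empty.

```rust
/// Simplified batch: column names plus a row count (no actual column data).
#[derive(Clone, Debug, PartialEq)]
pub struct Batch {
    pub column_names: Vec<String>,
    pub num_rows: usize,
}

#[derive(Default)]
pub struct Accumulator {
    pending_batches: Vec<Batch>,
    union_fields: Vec<String>,
}

impl Accumulator {
    /// Adds a batch and extends the union schema with any new column names.
    pub fn push(&mut self, batch: Batch) {
        for name in &batch.column_names {
            if !self.union_fields.contains(name) {
                self.union_fields.push(name.clone());
            }
        }
        self.pending_batches.push(batch);
    }

    /// Resets the accumulator and returns the merged batch, or `None` when
    /// there is nothing to write: no batches, or only zero-column batches.
    pub fn flush(&mut self) -> Option<Batch> {
        let batches = std::mem::take(&mut self.pending_batches);
        let fields = std::mem::take(&mut self.union_fields);
        if batches.is_empty() || fields.is_empty() {
            return None;
        }
        Some(Batch {
            column_names: fields,
            num_rows: batches.iter().map(|batch| batch.num_rows).sum(),
        })
    }
}
```

Taking both fields before the check means state is reset on every path, including the early return.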
---------
Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7645703 commit 469c606
13 files changed
Lines changed: 1567 additions & 125 deletions
File tree
- quickwit
- quickwit-indexing/src/actors
- quickwit-metastore
- migrations/postgresql
- src
- metastore
- file_backed/file_backed_index
- postgres
- tests
- quickwit-parquet-engine/src
- index
- schema
- split
- storage
Lines changed: 11 additions & 8 deletions
Lines changed: 13 additions & 0 deletions
Lines changed: 40 additions & 0 deletions
Lines changed: 83 additions & 11 deletions