Update AggregateEntry to Fit OTLP Trace Metrics Keys#11719
Conversation
🟡 Java Benchmark SLOs — Performance SLO warning (near threshold)
PR vs. master results
Commit: Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion. |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3be9e267af
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| String[] snapshotNames = s.peerTagSchema == null ? null : s.peerTagSchema.names; | ||
| return httpStatusCode == s.httpStatusCode | ||
| && synthetic == s.synthetic | ||
| && contentEquals(origin, s.origin) |
There was a problem hiding this comment.
Avoid keying native stats by arbitrary origin
When the current/native metrics path is used, SerializingMetricWriter.add still serializes only the derived Synthetics boolean and never the full origin, so adding the full origin to the AggregateEntry match/hash makes non-synthetic origins such as rum, ciapp-test, or user-supplied _dd.origin values occupy separate aggregate slots even though they produce identical msgpack keys. A workload receiving many distinct origins can now exhaust maxAggregates and drop client stats; keep this split to the OTLP writer/table or normalize the native key back to the synthetics boolean.
Useful? React with 👍 / 👎.
|
I'm 75% through reworking this code. I'm trying to decide how best to slot this in. |
|
@dougqh Happy to get this PR to a mergable state and wait to merge after your changes go in. I took a quick look and it doesn't look like your PR introduces anything that would logically prevent my changes from working. Would you be able to confirm that? 🙇♂️ |
@mhlidd Yeah, after thinking it through, I'd like to land #11387 first -- precisely because it gives us a safety mechanism that is important there. Because origin is externally controlled, I want to be careful how we handle it. Right now, the DDCache can constantly churn, so an external actor can still cause constant UTF8BytesString allocation. #11387 introduces a cardinality limiter that allows us to clamp allocation, so blunt that attack vector. |
What Does This Do
This PR updates
AggregateEntryto properly be able to capture separate counts for OK and Error latencies, as well as properly capturing the entireoriginspan tag, rather than just the original boolean of whetherorigin == synthetic. As part of this refactor, all helper tests also account fororiginas a String object rather than a boolean trackingsynthetic.Native CSS path still only checks whether
origin == synthetic, while OTLP will use the full origin value.Motivation
Additional Notes
Contributor Checklist
type:and (comp:orinst:) labels in addition to any other useful labelsclose,fix, or any linking keywords when referencing an issueUse
solvesinstead, and assign the PR milestone to the issue/merge. You can also:/merge --commit-message "..."/merge -c/merge -f --reason "reason"; please use this judiciously, as some checks do not run at the PR-level (note: the PR still needs to be mergeable, this will only skip the pre-merge build)Jira ticket: [PROJ-IDENT]