Minor fixes to stop buffering #642

Merged
irees merged 2 commits into main from stop-buffer-fixes on Jun 2, 2026
@irees irees commented May 29, 2026

Summary

Fixes per-stop census geography attribution when resolving census geographies for multiple stops in a single batched query. Previously, the batched stop path relied on a buffer CTE that unioned all input stops into one polygon, which lost the link between each matched census tract and the individual stop that matched it. This PR adds a per-stop attribution mode so the rows of a single query can be grouped back to the correct requesting stop, while preserving the existing union behavior for routes and agencies.

Per-stop attribution in the buffer CTE

  • Adds a perStopAttribution flag to censusGeographySelectFields. It is caller-driven (not GraphQL/field-selection driven) and only affects how the stop_buffer location filter builds its buffer CTE.
  • When set, the buffer CTE emits one buffered row per stop carrying match_entity_id = gtfs_stops.id, instead of unioning all stops into a single polygon. Tracts that intersect multiple input stops are intentionally duplicated so callers get per-stop apportionment.
  • The select also projects buffer.match_entity_id when per-stop attribution is enabled, so the result rows can be bucketed back to the originating stop.
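The two buffer modes described above can be sketched roughly as follows. This is an illustration only, not the PR's code: the function name bufferCTE, the placeholders $1 (stop IDs) and $2 (radius), and the assumption that gtfs_stops.geometry is a PostGIS geography column are all hypothetical. Only the perStopAttribution flag, match_entity_id, and gtfs_stops.id come from the description.

```go
package main

import "fmt"

// bufferCTE returns the body of a hypothetical "buffer" CTE for the
// stop_buffer filter. Assumptions (not from the PR): stop IDs are bound
// as $1, the radius as $2, and gtfs_stops.geometry is a geography column.
func bufferCTE(perStopAttribution bool) string {
	// Default union path: one polygon for the whole stop set, so there is
	// no single originating stop and match_entity_id is emitted as 0.
	matchExpr := "0"
	bufferExpr := "ST_Union(ST_Buffer(gtfs_stops.geometry, $2)::geometry)"
	if perStopAttribution {
		// Per-stop path: one buffered row per stop, tagged with the stop ID.
		matchExpr = "gtfs_stops.id"
		bufferExpr = "ST_Buffer(gtfs_stops.geometry, $2)"
	}
	return fmt.Sprintf(
		"SELECT %s AS match_entity_id, %s AS buffer FROM gtfs_stops WHERE gtfs_stops.id = ANY($1)",
		matchExpr, bufferExpr,
	)
}

func main() {
	fmt.Println(bufferCTE(true))
	fmt.Println(bufferCTE(false))
}
```

In the per-stop form, a tract that intersects two stop buffers joins against two buffer rows and therefore appears twice, once per match_entity_id, which is the intentional duplication noted above.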

CensusGeographiesByEntityIDs grouping

  • For entityType == "stop", sets perStopAttribution and runs one batched query for all requested stops. Grouping back to each stop is handled by arrangeGroup via the SQL-provided match_entity_id, eliminating the previous loop-then-tag pass.
  • For routes/agencies, keeps the existing one-query-per-entity behavior with the unioned buffer, which avoids double-counting tracts hit by multiple stops in the entity's stop set. MatchEntityID is tagged after scan, because the unioned buffer emits match_entity_id = 0.
  • Clarifies that the existing default union path is also used by top-level aggregation queries.
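The grouping step for stops can be pictured with the sketch below. It is not the actual arrangeGroup helper, whose signature is not shown in this PR; censusGeography and groupByEntity are hypothetical stand-ins that only illustrate bucketing rows by the SQL-provided match_entity_id and returning them in request order.

```go
package main

import "fmt"

// censusGeography is a hypothetical stand-in for model.CensusGeography,
// reduced to the two fields needed for grouping.
type censusGeography struct {
	Geoid         string
	MatchEntityID int
}

// groupByEntity buckets rows by MatchEntityID and returns one slice per
// requested ID, in the same order as entityIDs. IDs with no matching
// rows get a nil (empty) slice.
func groupByEntity(entityIDs []int, rows []censusGeography) [][]censusGeography {
	byID := make(map[int][]censusGeography, len(entityIDs))
	for _, row := range rows {
		byID[row.MatchEntityID] = append(byID[row.MatchEntityID], row)
	}
	out := make([][]censusGeography, len(entityIDs))
	for i, id := range entityIDs {
		out[i] = byID[id]
	}
	return out
}

func main() {
	rows := []censusGeography{
		{Geoid: "tract-A", MatchEntityID: 1},
		{Geoid: "tract-B", MatchEntityID: 1},
		{Geoid: "tract-B", MatchEntityID: 2}, // shared tract, duplicated per stop
	}
	for i, group := range groupByEntity([]int{1, 2, 3}, rows) {
		fmt.Println(i, len(group))
	}
}
```

For routes and agencies the same bucketing would not work directly, since every row from the unioned buffer carries 0; that is why the description tags MatchEntityID after scan on that path.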

Test plan

  • Query census geographies for multiple stops via GraphQL and confirm each stop returns the tracts intersecting its own buffer (including tracts shared by multiple stops appearing under each relevant stop).
  • Query census geographies for a route and an agency and confirm results are the union over their stop sets with no double-counted tracts.
  • Compare batched stop results against per-stop queries to confirm attribution matches.
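The last check could be automated with a small comparison helper along these lines. This is a sketch under assumptions: sameAttribution is a hypothetical name, and results are reduced to a map from stop ID to geography identifiers, which is not necessarily how the project's tests represent them.

```go
package main

import (
	"fmt"
	"sort"
)

// sameAttribution reports whether two stop ID -> geography ID mappings
// agree, ignoring the order of geographies within each stop.
func sameAttribution(batched, perStop map[int][]string) bool {
	if len(batched) != len(perStop) {
		return false
	}
	for stopID, want := range perStop {
		got, ok := batched[stopID]
		if !ok || len(got) != len(want) {
			return false
		}
		// Sort copies so the caller's slices are left untouched.
		g := append([]string(nil), got...)
		w := append([]string(nil), want...)
		sort.Strings(g)
		sort.Strings(w)
		for i := range g {
			if g[i] != w[i] {
				return false
			}
		}
	}
	return true
}

func main() {
	batched := map[int][]string{1: {"tract-B", "tract-A"}, 2: {"tract-B"}}
	perStop := map[int][]string{1: {"tract-A", "tract-B"}, 2: {"tract-B"}}
	fmt.Println(sameAttribution(batched, perStop))
}
```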

@irees irees marked this pull request as ready for review June 2, 2026 10:13
Copilot AI review requested due to automatic review settings June 2, 2026 10:13

Copilot AI left a comment

Pull request overview

This PR adjusts census geography spatial querying so stop-based stop_buffer lookups can be done in a single batched query while still attributing each returned geography back to the originating stop.

Changes:

  • Add an internal perStopAttribution flag to preserve match_entity_id per stop (instead of unioning stop buffers).
  • Update the stop_buffer CTE generation to optionally emit one buffer per stop, enabling correct post-query grouping by stop ID.
  • Select buffer.match_entity_id when per-stop attribution is enabled so results can be arranged per requesting stop.
Comments suppressed due to low confidence (1)

server/finders/dbfinder/census.go:92

  • In the batched stop path, the SQL LIMIT is applied to the combined result set across all requested stops. With perStopAttribution duplicating rows per intersecting stop, this can under-fetch and cause some stops to return fewer than the requested limit (because truncation happens before results are grouped per stop). Consider scaling the query limit by len(entityIds) (or otherwise over-fetching) for entityType == "stop" so each stop can still receive up to limit rows after grouping.
		fields.perStopAttribution = true
		var ents []*model.CensusGeography
		pw := forStopids(entityIds)
		if err := dbutil.Select(ctx, f.db, censusDatasetGeographySelect(limit, pw, fields), &ents); err != nil {
			return nil, logErr(ctx, err)
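
One way to read the over-fetching suggestion above is sketched here. The helpers batchedLimit and capGroup are hypothetical and do not appear in the PR; they only illustrate scaling the query-wide limit by the number of requested stops and then trimming each stop's group back to the per-stop limit.

```go
package main

import "fmt"

// batchedLimit scales a per-stop limit to a query-wide limit for the
// batched stop path. Non-positive limits are passed through unchanged.
func batchedLimit(perStopLimit, stopCount int) int {
	if perStopLimit <= 0 || stopCount <= 1 {
		return perStopLimit
	}
	return perStopLimit * stopCount
}

// capGroup trims one stop's rows back to the per-stop limit after grouping.
func capGroup(group []string, perStopLimit int) []string {
	if perStopLimit > 0 && len(group) > perStopLimit {
		return group[:perStopLimit]
	}
	return group
}

func main() {
	fmt.Println(batchedLimit(100, 3))
	fmt.Println(capGroup([]string{"tract-A", "tract-B", "tract-C"}, 2))
}
```

Scaling the limit reduces under-fetching but does not strictly guarantee it away: if one stop matches far more rows than the others, it can still consume most of the combined limit depending on row order.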

@irees irees merged commit 72b593a into main Jun 2, 2026
7 checks passed
@irees irees deleted the stop-buffer-fixes branch June 2, 2026 10:31

2 participants