Speed up OSM processing some more by leonardehrenfried · Pull Request #7583 · opentripplanner/OpenTripPlanner

leonardehrenfried · 2026-05-04T09:31:34Z

Summary

Just as in my previous PR, I am rewriting some hot code paths in the OSM processing code.

The optimizations boil down to:

replacing Streams with some manual processing, thereby lowering allocations
checking for tagless entities early and returning the methods: this is particularly important for instances of OsmNode, where the vast majority doesn't have any tags at all.
packing the coordinates of a way's line string into a tight array rather than instances of Coordinate
making some methods final

All of this yields about a 20-25% speed up.

# Conflicts: # application/src/main/java/org/opentripplanner/osm/model/OsmEntity.java

leonardehrenfried · 2026-05-04T09:32:09Z

@VillePihlava @abyrd Could I interest you in reviewing this one as well?

VillePihlava · 2026-05-04T09:34:52Z

I can take a look

codecov · 2026-05-04T09:37:09Z

Codecov Report

❌ Patch coverage is 96.66667% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.54%. Comparing base (481d2ef) to head (bb2dc02).

Files with missing lines	Patch %	Lines
.../java/org/opentripplanner/osm/model/OsmEntity.java	93.33%	1 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             dev-2.x    #7583   +/-   ##
==========================================
  Coverage      72.54%   72.54%           
- Complexity     21266    21271    +5     
==========================================
  Files           2380     2380           
  Lines          86400    86427   +27     
  Branches        8486     8489    +3     
==========================================
+ Hits           62675    62701   +26     
+ Misses         20728    20727    -1     
- Partials        2997     2999    +2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

VillePihlava

Nice improvements. Should private final Map<String, String> tags; in OsmEntity be defined as non-nullable? This way the use of isTagless would always indicate a performance concern.

VillePihlava · 2026-05-05T14:24:38Z

+  @Nullable
  private final Map<String, String> tags;


Would it make sense to make this non-nullable? Constantly calling the isTagless function feels redundant. Something like a static default empty map that this gets initialized to when no tags are present. Then the isTagless function can be used in places where it is necessary for performance reasons, not for null checking.

This is a micro(?)-optimization. If you use a Map you still have to call toLowerCase and compute the hash of the string.

I know that using null is not super-idiomatic in our current code base (I'm its biggest hater) but in this case it's worth it for performance.

If it makes sense for performance reasons, I think its fine. Maybe a comment should be added every time it is used as a performance optimization?

Another way would be to create a custom map implementation that null checks behind the scenes. Do whatever you think makes sense

VillePihlava · 2026-05-07T06:49:08Z

+  @Nullable
  private final Map<String, String> tags;


If it makes sense for performance reasons, I think its fine. Maybe a comment should be added every time it is used as a performance optimization?

VillePihlava · 2026-05-07T06:51:40Z

+  @Nullable
  private final Map<String, String> tags;


Another way would be to create a custom map implementation that null checks behind the scenes. Do whatever you think makes sense

VillePihlava

LGTM! Lets consider at some point whether the tag.toLowerCase() calls are really necessary.

abyrd · 2026-05-13T14:36:41Z

-          geometry = GeometryUtils.getGeometryFactory().createLineString(
-            segmentCoordinates.toArray(new Coordinate[0])
-          );
+          geometry = GeometryUtils.makeLineString(segmentCoordinates.toArray());


If only Java had compound value types... any day now.

abyrd · 2026-05-13T15:07:19Z

+  /**
+   * Does the entity contain any tags at all?
+   */
+  final boolean isTagless() {


My understanding from other review comments is that the nullable tags field and this function exist largely to avoid String.toLowercase calls. As far as I know, OSM tags are case-sensitive and the vast majority of tag keys are lower case. Any non-lower-case version of a typically lower-case key would be considered a new nonstandard key and show up as such in the tag stats. The case insensitivity in OTP appears to be an adaptation to poor data quality in the early era of OTP, and may date all the way back to #500. I suspect OSM data quality has increased significantly in the intervening 15 years.

Would it be possible to performance-test a version using a canonical empty map instead of null and eliminating all toLowerCase calls and all null checks? I do wonder if it would be just as fast and significantly simpler.

Okay, this appears to bother both you and @VillePihlava quite a bit, so let me give a long form explanation.

For well over a decade null was used in the OSM code to mean "no tags". I started using the empty map in my previous OSM PR #7537 but later noticed that there was a performance problem with it. For this reason, I re-established the previous state of using null. I understand that it's not super-idiomatic to how we use collections nowadays. I don't have exact numbers but I could see a small speed up when switching back to null.

Besides, it's a well-encapsulated variable and the null never escapes this class.

Your point about lowercasing is correct, but when I tried switching it off I noticed that a test failed so I never pursued that avenue any further. There are a handful of tags we use like ref:IFOPT (https://www.openstreetmap.org/way/27558650) that are frequently written in the incorrect case and the lowercasing is still needed somewhere.

It could probably be removed in the hottest inner loop, but it requires some risk management and careful investigation.

In any case, that would be a functional change that I would like to keep separate from a performance optimization that aims to keep everything as it was before.

Given all of the above, I politely request that you let me keep the method/null value as-is.

If you feel that it ought to be done, I can do that in a follow-up.

There is one more aspect to isTagless: it is used in some complex methods that check a variety of tags to short-circuit the computation. This is a significant speed-up.

No, use of null references doesn't particularly bother me. My review was a comment, not a request for changes to your PR. I had the impression complexity was being introduced just to continue supporting case-insensitive behavior in a place that is by specification supposed to be case-sensitive, and if the case-insensitive behavior was unnecessary maybe the problem could be eliminated. Since you've already considered this and decided case-insensitive behavior still needed, no change is necessary.

leonardehrenfried added 10 commits May 4, 2026 11:07

Replace streams in ExactMatchSpecifier

09ec844

Cache some tag look ups

28fd5a6

# Conflicts: # application/src/main/java/org/opentripplanner/osm/model/OsmEntity.java

Speed up getMulitTagValues

f34d682

Speed up boarding locations

50af031

Fix after rebase

078e5e4

Skip some tag lookups

f60dfeb

Use null instead of empty map

fd19c8e

Add another shortcut

2ac5c55

Add nullable annotation, use TDoubleArrayList

e21b492

Make hot methods final

a89f768

leonardehrenfried requested a review from a team as a code owner May 4, 2026 09:31

leonardehrenfried added the !Optimization The feature is to improve performance. label May 4, 2026

leonardehrenfried added this to the 2.10 (next release) milestone May 4, 2026

VillePihlava self-assigned this May 4, 2026

VillePihlava self-requested a review May 4, 2026 09:35

Improve Javadoc

e1c070e

optionsome assigned abyrd May 5, 2026

optionsome requested a review from abyrd May 5, 2026 08:40

leonardehrenfried enabled auto-merge May 5, 2026 10:52

VillePihlava requested changes May 5, 2026

View reviewed changes

leonardehrenfried added 3 commits May 6, 2026 10:00

Review: Rename method

96ea12d

Small clean up

8d3c685

Review: Simplify if/else

5cd160d

VillePihlava reviewed May 7, 2026

View reviewed changes

leonardehrenfried added 2 commits May 7, 2026 10:36

Review: Remove else

31dc111

Add Javadoc

17bb76a

leonardehrenfried requested a review from VillePihlava May 7, 2026 08:38

Add more comments about isTagless

741db14

VillePihlava previously approved these changes May 8, 2026

View reviewed changes

Merge remote-tracking branch 'upstream/dev-2.x' into osm-speed-up

a4cdee8

leonardehrenfried dismissed VillePihlava’s stale review via a4cdee8 May 12, 2026 09:08

leonardehrenfried requested a review from VillePihlava May 12, 2026 09:08

Merge remote-tracking branch 'upstream/dev-2.x' into osm-speed-up

bb2dc02

VillePihlava approved these changes May 12, 2026

View reviewed changes

abyrd reviewed May 13, 2026

View reviewed changes

abyrd approved these changes May 13, 2026

View reviewed changes

leonardehrenfried added this pull request to the merge queue May 13, 2026

Merged via the queue into opentripplanner:dev-2.x with commit 030eabb May 13, 2026
8 checks passed

leonardehrenfried deleted the osm-speed-up branch May 13, 2026 17:30

t2gran pushed a commit that referenced this pull request May 13, 2026

Add changelog entry for #7583 [ci skip]

27ce7d1

Conversation

leonardehrenfried commented May 4, 2026

Summary

Uh oh!

leonardehrenfried commented May 4, 2026

Uh oh!

VillePihlava commented May 4, 2026

Uh oh!

codecov Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

VillePihlava left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

leonardehrenfried May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

VillePihlava left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented May 4, 2026 •

edited

Loading

leonardehrenfried May 6, 2026 •

edited

Loading