Performance stuff · Issue #416 · PyVRP/PyVRP · GitHub

Performance stuff #416


Closed
N-Wouda opened this issue Dec 11, 2023 · 14 comments
Labels
discussion Discussion thread performance Make things better and faster!

Comments

@N-Wouda
Member
N-Wouda commented Dec 11, 2023

Discussion thread for ideas to improve PyVRP's performance. Could be low-level stuff, like better code optimisation, smaller data types, and/or caching. Or a smarter data structure, better algorithm, etc. Anything's in scope.

Edit: there's also perhaps some stuff in #181 that we might want to implement as part of this.

@N-Wouda N-Wouda added the discussion Discussion thread label Dec 11, 2023
@N-Wouda
Member Author
N-Wouda commented Dec 11, 2023

One idea that might help a bit relates to how we skip move evaluation when nothing's changed since the last time we considered the client pairs $U$ and $V$:

if (lastModified[U->route()->idx()] > lastTestedNode
    || lastModified[V->route()->idx()] > lastTestedNode)
    // ...then re-evaluate the move between U and V.

This works fine, but any change in the routes of $U$ and $V$ is now reason to re-evaluate, even when that change happens many nodes after $U$ or $V$ and has no consequences for the evaluation.

We might make this more efficient by taking into account the size of our move operators (e.g., 1, 2, 3, etc. nodes), and checking whether there has been a recent modification that affected nodes sufficiently near either $U$ or $V$ to warrant a new evaluation.

I think I know how to implement this efficiently, but I'm lacking the time right now to pursue it. This will probably have to wait until PyVRP 0.8.0.
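A minimal sketch of this idea, with entirely hypothetical names (none of this matches PyVRP's internals): track a modification counter per *position* rather than per route, and only re-evaluate a move of a given size when a recent change landed close enough to $U$ or $V$ to matter.

```python
# Illustrative sketch only. `MoveFilter`, `record_change`, and
# `needs_retest` are made-up names; PyVRP's actual bookkeeping differs.

class MoveFilter:
    def __init__(self, num_positions: int):
        self.clock = 0                           # global modification counter
        self.modified_at = [0] * num_positions   # last change per position

    def record_change(self, pos: int):
        self.clock += 1
        self.modified_at[pos] = self.clock

    def needs_retest(self, u: int, v: int, op_size: int,
                     last_tested: int) -> bool:
        # Re-evaluate only if a modification newer than `last_tested`
        # touched a position within `op_size` of U or V.
        for center in (u, v):
            lo = max(0, center - op_size)
            hi = min(len(self.modified_at), center + op_size + 1)
            if any(t > last_tested for t in self.modified_at[lo:hi]):
                return True
        return False
```

Note that with time windows a change far from $U$ or $V$ can still affect feasibility, so a filter like this is a heuristic, not an exact equivalence.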

@wouterkool
Collaborator

This works fine, but any change in the routes of U and V is now reason to re-evaluate. Even when that change happens many nodes after U or V, which has no consequences for the evaluation.

I like the idea, but I'm not sure it is correct. At least with time windows, removing a node towards the end of the route may create slack that enables inserting a customer early in the route which would previously have incurred a large penalty, even if the affected parts of the route are far apart.

This situation may be hypothetical, or may not happen frequently, so we may decide heuristically to skip those options. But that is a functional change to the algorithm, not a more efficient implementation with the same result. Or am I missing something?

@N-Wouda
Member Author
N-Wouda commented Dec 13, 2023

This situation may be hypothetical, or may not happen frequently, so we may decide heuristically to skip those options. But that is a functional change to the algorithm, not a more efficient implementation with the same result. Or am I missing something?

Not really. This conclusion seems correct to me.

@N-Wouda
Member Author
N-Wouda commented Feb 29, 2024

Much more aggressive caching is also in scope here. Routes are queried at least two orders of magnitude more often than they are changed. Every time something between two indices is calculated, can we somehow store the result? Is that worthwhile?
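One way to picture this idea, as a hedged sketch with made-up names (`CachedRoute` and friends do not exist in PyVRP): memoise the computation between two indices and invalidate on any route change. The trade-off is the extra indirection of a cache lookup versus simply recomputing.

```python
# Hypothetical illustration of "aggressive caching" of per-route
# quantities. Here the cached quantity is just a sum of leg distances;
# real route data is richer.

class CachedRoute:
    def __init__(self, distances):
        self.distances = distances   # distances[i] = cost of leg i -> i + 1
        self.cache = {}

    def segment_cost(self, start: int, end: int) -> int:
        key = (start, end)
        if key not in self.cache:
            # The linear-time computation we would like to avoid repeating.
            self.cache[key] = sum(self.distances[start:end])
        return self.cache[key]

    def update(self, idx: int, new_dist: int):
        self.distances[idx] = new_dist
        self.cache.clear()           # any change invalidates everything
```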

@N-Wouda
Member Author
N-Wouda commented Mar 27, 2024

Also relevant stuff that @leonlan and I discussed earlier today:

  • Restarts: how often, and after what (no improvement over the local best in this restart run, or over the global best across all restarts)?
  • SREX: can we limit the time spent in the subsequent local search? (see also Heuristic for SREX subproblem #521)
  • Improve our neighbourhood strategy somehow? I know FILO has a much more advanced neighbourhood system, and we might be able to learn something from how that works exactly.
  • Really think about what route neighbours are in Route neighbours #126, and see if we can somehow use that as well.

@leonlan
Member
leonlan commented Mar 28, 2024

Improve our neighbourhood strategy somehow? I know FILO has a much more advanced neighbourhood system, and we might be able to learn something from how that works exactly.

Leaving out some details, FILO uses a granular neighborhood that is 1) node-based and 2) dynamic.

  1. Node-based means that the number of neighbors differs per node. Let $n_{\text{neighbors}}$ denote the maximum number of neighbors. The active number of neighbors lies in $[1, n_{\text{neighbors}}]$ and is updated dynamically.
  2. Dynamic means that the number of neighbors is updated based on whether local search moves involving a node are improving. This is done using a "sparsification factor" $\gamma_i \in [0, 1]$ for each node $i$, with $\gamma_i^0$ being the initial sparsification factor. The number of neighbors at any time is given by $\gamma_i \cdot n_{\text{neighbors}}$.
  • If node $i$ was involved in an improving move/solution, then reset $\gamma_i = \gamma_i^0$.
  • Otherwise, update $\gamma_i \leftarrow f \gamma_i$ with updating factor $f > 1$.

I don't yet really understand what "node involved in an improving move" means exactly, but these two improvements will likely improve the solver.
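The update rule above can be sketched as follows. This is a simplification of what FILO actually does, and the function names are mine, not the paper's:

```python
# Sketch of the sparsification update described above. `f` and the
# reset condition are simplified; consult FILO itself for the details.

def update_gamma(gamma: float, gamma0: float, improving: bool,
                 f: float = 2.0) -> float:
    """Return the new sparsification factor for one node."""
    if improving:
        return gamma0               # reset on an improving move
    return min(1.0, f * gamma)      # otherwise grow, capped at 1

def active_neighbours(gamma: float, n_neighbours: int) -> int:
    # At least one neighbour is always active.
    return max(1, round(gamma * n_neighbours))
```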

Implementation details:

  • LocalSearch.search should take a granular neighborhood as input rather than keeping it as part of its state.
  • We may need something like a NeighbourhoodManager that manages the granular neighborhood.

@N-Wouda N-Wouda added the performance Make things better and faster! label Mar 30, 2024
@N-Wouda
Member Author
N-Wouda commented Nov 17, 2024

Do people actually use add_node_operator and add_route_operator to add custom node/route operators? Or do they just use whatever default operators we have selected? I suspect the latter. If that's the case we might want to investigate how much it'd help the compiler optimise further if we bring all operators back into the LocalSearch object.

That is probably worth investigating regardless of whether we want to do it.

I'll have a look at this sometime soon.

@N-Wouda N-Wouda added this to the v0.11.0 milestone Nov 17, 2024
@N-Wouda
Member Author
N-Wouda commented May 7, 2025

I'll have a look at this sometime soon.

I did this a few months ago, and concluded that it doesn't matter much.

@N-Wouda
Member Author
N-Wouda commented May 7, 2025

Much more aggressive caching is also in scope here. Routes are queried at least two orders of magnitude more often than they are changed. Every time something between two indices is calculated, can we somehow store the result? Is that worthwhile?

Also tried this, and the extra indirection of drawing data from memory instead of just computing it again as needed seems to void any potential gains here.

@leonlan
Member
leonlan commented May 11, 2025

Have we ever tried caching SegmentBetweens for the duration segment? I don't recall so. Vidal et al. (2013) mention that they cache subsequences up to size 20. Could this help improve performance?


Vidal, T., Crainic, T. G., Gendreau, M., & Prins, C. (2013). A hybrid genetic algorithm with adaptive diversity management for a large class of vehicle routing problems with time-windows. Computers & Operations Research, 40(1), 475–489. https://doi.org/10.1016/j.cor.2012.07.018
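The idea from Vidal et al. (2013) can be pictured as precomputing concatenation data for every subsequence up to a maximum length, so that move evaluation becomes an O(1) lookup. A toy sketch, where the "segment data" is just total duration (PyVRP's DurationSegment tracks more than this):

```python
# Hypothetical bounded subsequence cache in the spirit of Vidal et al.
# (2013). Names are illustrative, not PyVRP's.

MAX_LEN = 20

def build_cache(durations: list[int]) -> dict[tuple[int, int], int]:
    """cache[(i, j)] = total duration of visits i..j (inclusive)."""
    n = len(durations)
    cache = {}
    for i in range(n):
        total = 0
        for j in range(i, min(n, i + MAX_LEN)):
            total += durations[j]
            cache[(i, j)] = total
    return cache
```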

@N-Wouda
Member Author
N-Wouda commented May 11, 2025

I have tried doing that lazily (cache when requested, reuse for future requests). That did not help much for the instances I tested. For instances with very long routes it might be helpful, but I don't know if it's worth it for us. The concatenation of SegmentBetween is typically fairly fast already, despite being linear time.

@N-Wouda N-Wouda removed this from the v0.11.0: Multi-trip VRP milestone May 14, 2025
@N-Wouda
Member Author
N-Wouda commented May 28, 2025

Now with #821 on main, I ran a small experiment with a VRPTW instance (1000 clients). We do about 30 million operator evaluations per second on my machine, but only 0.05% of those are improving moves. So, while we do a lot, and fast, the whole thing is also basically dumb as bricks.

I don't have high hopes we can improve that 0.05% massively, but I do think we can reduce the number of evaluations quite a bit by being a little smarter. Right now, for example, we reset the last modified and last tested numbers each time we call LocalSearch::search and LocalSearch::intensify. But we do this in a loop in LocalSearch::operator(), and nothing is invalidated between calls. So there is no reason to re-evaluate all those unchanged moves.

If we can somehow maintain the last modified and tested numbers between calls we should be able to filter out a lot of pointless move evaluations.
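The point about keeping the counters alive across calls can be illustrated with a toy sketch (all names hypothetical, not PyVRP's actual state): if the staleness bookkeeping survives between repeated search calls, unchanged $(U, V)$ pairs are filtered out instead of being re-evaluated from scratch each iteration.

```python
# Illustration only: a persistent last-modified / last-tested filter.
# Here "evaluation" is just a counter standing in for the real work.

class SearchState:
    def __init__(self, num_routes: int):
        self.clock = 0
        self.last_modified = [0] * num_routes
        self.last_tested = {}        # (u_route, v_route) -> clock value
        self.evaluations = 0

    def modify(self, route: int):
        self.clock += 1
        self.last_modified[route] = self.clock

    def maybe_evaluate(self, u_route: int, v_route: int):
        tested = self.last_tested.get((u_route, v_route), -1)
        if (self.last_modified[u_route] > tested
                or self.last_modified[v_route] > tested):
            self.evaluations += 1    # the (expensive) move evaluation
            self.last_tested[(u_route, v_route)] = self.clock
```

Because `last_tested` is never reset, calling `maybe_evaluate` again without an intervening `modify` is a no-op, which is exactly the filtering we would gain by not resetting the counters between `search` and `intensify` calls.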

@N-Wouda
Member Author
N-Wouda commented May 29, 2025

#215 tracks the neighbourhood structure. I think we can close this - a lot of it is already tackled via ILS in #533.

@N-Wouda N-Wouda closed this as completed May 29, 2025