Skip to content

perf(matrices): cache MatrixAccessor properties with cached_property#716

Open
MaykThewessen wants to merge 1 commit into
PyPSA:masterfrom
MaykThewessen:perf/matrix-accessor-cached-properties
Open

perf(matrices): cache MatrixAccessor properties with cached_property#716
MaykThewessen wants to merge 1 commit into
PyPSA:masterfrom
MaykThewessen:perf/matrix-accessor-cached-properties

Conversation

@MaykThewessen
Copy link
Copy Markdown
Contributor

Summary

Convert the remaining @property methods on MatrixAccessor (c, Q, sol, dual) to functools.cached_property. After #630, vlabels, clabels, A, b, sense, lb, ub, vtypes are already materialised once in __init__. The four remaining properties still recompute on every access, which shows up in solver paths that read the same attribute twice on one MatrixAccessor instance.

Also fix a small adjacent issue in expressions.BaseExpression._map_solution: m.matrices.sol followed by m.matrices.vlabels quietly constructed two MatrixAccessor instances (each running _build_vars + _build_cons). Binding to a local M = m.matrices reuses one.

Why this is safe

Each MatrixAccessor is short-lived: Model.matrices is itself a @property that returns a fresh instance on every access (linopy/model.py:350-352). So the cache lives at most as long as the local variable a caller holds, and the accessor never sees a mutated model. No new clean_cached_properties plumbing is needed.

Where the gain shows up

Multiple solver builders read the same property twice within one M = model.matrices:

  • solvers.Highs._build_solver_modelM.c at line 1278, M.Q at 1299; both can be re-read in user code that inspects the accessor after solve.
  • solvers.Gurobi._build_solver_modelM.Q at 1586, used again at 1587 in the QP branch.
  • solvers.Mosek._build_solver_modelM.Q at 2860, again at 2861.

Repeat reads of M.sol / M.dual are also common in post-solve introspection.

Benchmark

Model: 360,600 variables, 1,200 constraints, solved with HiGHS (Python 3.12, scipy 1.17, numpy 2.4):

cold (first access) warm (repeat access)
master 1.0 ms 1.1 ms (recomputed)
patch 1.2 ms 0.4 µs (~2500×)

Repro script:

import numpy as np, pandas as pd, xarray as xr, time
from linopy import Model

N = 600
m = Model()
idx = pd.RangeIndex(N); jdx = pd.RangeIndex(N)
x = m.add_variables(xr.DataArray(np.zeros((N, N)), coords=[idx, jdx]),
                    xr.DataArray(np.full((N, N), 100.0), coords=[idx, jdx]), name="x")
y = m.add_variables(0, 100, coords=[idx], name="y")
m.add_constraints(x.sum(dim="dim_1") + y >= 10)
m.add_constraints(x.sum(dim="dim_0") >= 1)
m.add_objective(x.sum() + 2 * y.sum())
m.solve(solver_name="highs", io_api="direct", log_to_console=False)

for attr in ["c", "sol"]:
    M = m.matrices
    t = time.perf_counter(); _ = getattr(M, attr); cold = time.perf_counter() - t
    t = time.perf_counter(); _ = getattr(M, attr); warm = time.perf_counter() - t
    print(f"M.{attr}: cold={cold*1e3:.2f} ms  warm={warm*1e6:.2f} us")

Tests

  • pytest test/test_matrices.py test/test_objective.py test/test_solution_lookup.py test/test_model.py test/test_constraints.py test/test_linear_expression.py test/test_quadratic_expression.py → 399 passed
  • pytest test/test_optimization.py -k "not modified_model and not netcdf" → 241 passed, 2 skipped

(Pre-existing master failures unrelated to this PR: test_modified_model[*] and test_model_to_netcdf_* — both fail on upstream/master with the same scipy/numpy stack.)

Files touched

  • linopy/matrices.py (+4 @cached_property, +1 import)
  • linopy/expressions.py (1 line: bind m.matrices to local before reading two attributes)

Convert the remaining `@property` methods on `MatrixAccessor` (`c`, `Q`, `sol`,
`dual`) to `@cached_property`. After PyPSA#630, vlabels/clabels/A/b/sense/lb/ub are
materialised once in `__init__`, but these four properties still recomputed on
every access. Each `MatrixAccessor` is short-lived (rebuilt on every
`model.matrices` call), so cache invalidation is moot: the instance never sees
a mutated model.

Also fix a latent double-build in `expressions.BaseExpression._map_solution`:
`m.matrices.sol` followed by `m.matrices.vlabels` constructed two accessors
(each running `_build_vars`+`_build_cons`). Bind once to `M`.

Benchmark on a 360k-var / 1.2k-constraint model (Python 3.12, scipy 1.17,
numpy 2.4):

           cold (first)   warm (repeat)
  master:    1.0 ms          1.1 ms      (recomputed)
  patch:     1.2 ms          0.4 us      (~2500x on warm)

Repeat access happens in `solvers.Highs._build_solver_model` (M.c / M.Q both
read at lines 1278/1299 and again in Gurobi/Mosek paths at 1586-1587, 2860-2861)
and anywhere a caller assigns `M = model.matrices` then reads multiple times.
@FabianHofmann
Copy link
Copy Markdown
Collaborator

@MaykThewessen thanks for the pr. if I am not mistaken we explicitly decided not using the cached property (@coroa) due to unexpected non-aligned result when underlying data changed. we would need to make sure that the cached is released properly. we can do this in #718 in the update function or alternatively make sure here we get rid of the the duplicated calls

@coroa
Copy link
Copy Markdown
Member

coroa commented Jun 1, 2026

yes, this very explicitly is not a good idea.

The main reason is not the complications with invalidations, but that this would store a full additional copy of the constraint data, which we do not want.

The best practice that i think we should push users to is to freeze constraints to CSRConstraints unless they explicitly need the old Constraint format, which makes the matrix accessor a ms concatenation.

EDIT: Sorry, i was dumb

Comment thread linopy/expressions.py
Comment on lines +966 to +967
M = m.matrices
sol = pd.Series(M.sol, M.vlabels)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the change that we would want to apply everywhere!

@coroa
Copy link
Copy Markdown
Member

coroa commented Jun 1, 2026

Ups .. sorry, should have properly read the PR. I just read Fabian's comment and made assumptions.

You are explicitly not trying to make matrices a cached_property, but only the objective wise components.

I don't care, that is a very minor issue as your benchmarking shows 1ms -> 0.4 micros. . That's not worth optimizing for.

But you are completely correct about the unrelated fix in map_solution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants