ASV Benchmarks Integration #209

Open
vchamarthi wants to merge 5 commits into IntelPython:main from vchamarthi:asv-benchmarks
@vchamarthi
Adds an ASV benchmark suite to track mkl_umath performance over time.

Benchmarks

micro/ - Single-ufunc timing benchmarks over the grid
dtype ∈ {float32, float64} × size ∈ {10K, 100K, 1M}.
Arrays are pre-allocated in setup() and reused across timing calls, so allocation cost is not part of the measured time.

| File | Ufuncs |
|---|---|
| bench_trig.py | sin, cos, tan, arcsin, arccos, arctan, arctan2, sinh, cosh, tanh |
| bench_exp_log.py | exp, exp2, expm1, log, log2, log10, log1p |
| bench_sqrt_misc.py | sqrt, cbrt, square, fabs, absolute, reciprocal |
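
A minimal sketch of how one of these micro benchmarks might be laid out, assuming the usual ASV conventions (`params`, `param_names`, `setup()`, and a `time_*` method). The name `_UFUNC_CONFIGS` appears in the diff, but its contents are not shown here, so the mapping below (ufunc plus an input range) and the class name `TimeUnaryUfunc` are hypothetical stand-ins:

```python
import numpy as np

# Hypothetical stand-in for _UFUNC_CONFIGS: name -> (ufunc, (low, high)).
# The real structure in the PR is not shown and may differ.
_UFUNC_CONFIGS = {
    "exp": (np.exp, (-10.0, 10.0)),
    "log": (np.log, (0.1, 100.0)),
    "sqrt": (np.sqrt, (0.0, 100.0)),
}


class TimeUnaryUfunc:
    # ASV runs the benchmark once per element of the Cartesian product.
    params = (
        sorted(_UFUNC_CONFIGS.keys()),
        ["float32", "float64"],
        [10_000, 100_000, 1_000_000],
    )
    param_names = ["ufunc", "dtype", "size"]

    def setup(self, name, dtype, size):
        # Runs outside the timed region: build input and output arrays once.
        self.func, (low, high) = _UFUNC_CONFIGS[name]
        rng = np.random.default_rng(0)
        self.x = rng.uniform(low, high, size).astype(dtype)
        self.out = np.empty_like(self.x)

    def time_ufunc(self, name, dtype, size):
        # Only the ufunc call is timed; out= avoids allocating per call.
        self.func(self.x, out=self.out)
```

Passing `out=` keeps the timed region free of allocation, which is one way to get the "pre-allocated and reused" behavior described above.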

npbench/ - 14 application-level workloads adapted from the npbench benchmark suite
(kernels are inlined, so there is no external dependency). Each runs at preset ∈ {M, L}.
All use setup_cache() so that expensive array initialization runs once per commit,
not once per timing repeat.
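
As an illustration of the setup_cache() pattern, here is a hedged sketch of a gemm-style workload. It relies on ASV's documented behavior that the value returned by `setup_cache()` is passed as the first argument to `setup()` and to each benchmark method, ahead of the parameters. The preset dimensions, the class name `TimeGemm`, and the scalar values are assumptions for illustration; the PR's actual M and L sizes are not shown here:

```python
import numpy as np

# Hypothetical preset sizes; the real M/L dimensions are not given above.
_PRESETS = {"M": 64, "L": 128}


class TimeGemm:
    params = ["M", "L"]
    param_names = ["preset"]

    def setup_cache(self):
        # Called once by ASV; the returned object is cached and handed
        # to setup() and to the timing method as their first argument.
        rng = np.random.default_rng(0)
        cache = {}
        for preset, n in _PRESETS.items():
            cache[preset] = {
                "alpha": 1.5,
                "beta": 1.2,
                "A": rng.random((n, n)),
                "B": rng.random((n, n)),
                "C": rng.random((n, n)),
            }
        return cache

    def setup(self, cache, preset):
        # Cheap per-run step: select the arrays for this preset.
        data = cache[preset]
        self.alpha, self.beta = data["alpha"], data["beta"]
        self.A, self.B, self.C = data["A"], data["B"], data["C"]

    def time_gemm(self, cache, preset):
        # Inlined kernel: C_out = alpha * A @ B + beta * C
        self.out = self.alpha * (self.A @ self.B) + self.beta * self.C
```

The kernel writes to a separate `self.out` rather than updating `C` in place, so repeated timing calls see identical inputs.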

Patch script

_patch_setup.py - Runs once per ASV worker process, at package import. It applies the
mkl_fft, mkl_random, and mkl_umath patches via their public APIs and hard-fails
with a descriptive RuntimeError if any patch does not take effect, so benchmarks
cannot silently fall back to stock NumPy.

Comment thread benchmarks/benchmarks/npbench/bench_cholesky2.py Outdated
Comment thread benchmarks/benchmarks/micro/bench_exp_log.py Outdated
Comment thread benchmarks/benchmarks/_patch_setup.py Outdated
Comment thread benchmarks/benchmarks/_patch_setup.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_k3mm.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_k2mm.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_gesummv.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_gemver.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_gemm.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_doitgen.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_correlation.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_covariance.py Outdated
Comment thread benchmarks/benchmarks/npbench/bench_deriche.py Outdated
params = (
sorted(_UFUNC_CONFIGS.keys()),
["float32", "float64"],
[10_000, 100_000, 1_000_000],
Author

@ndgrigorian Do you think these sizes are good enough?
On the PVC machine (Intel Xeon Platinum 8480+), the 1M case looks solidly L3-resident: the L3 cache on this machine is 210 MiB (2 instances).
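
A quick back-of-the-envelope check of the working-set sizes, as a sketch. It assumes a unary ufunc touches two arrays (input and output) and compares against the 210 MiB total quoted above; how that total is split between the two L3 instances is not modeled:

```python
ITEMSIZE = {"float32": 4, "float64": 8}  # bytes per element
SIZES = [10_000, 100_000, 1_000_000]
L3_TOTAL_MIB = 210  # figure quoted in the comment above
MIB = 1024 * 1024


def working_set_mib(n, dtype, n_arrays=2):
    """Bytes touched by one ufunc call, in MiB (input + output by default)."""
    return n * ITEMSIZE[dtype] * n_arrays / MIB


for dtype in ITEMSIZE:
    for n in SIZES:
        ws = working_set_mib(n, dtype)
        fits = ws < L3_TOTAL_MIB
        print(f"{dtype:8s} n={n:>9,d}  {ws:8.3f} MiB  fits in L3: {fits}")
```

Under these assumptions the largest case (1M float64) touches about 15.3 MiB, and a binary ufunc such as arctan2 (three arrays) about 22.9 MiB, so every configured size stays well inside L3. If a memory-bound regime is also of interest, a size whose working set exceeds the cache would be needed.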
