Benchmarks¶
Li ships a reproducible cross-language benchmark harness under benchmarks/. Every tier-2 physics simulation uses one shared C kernel per benchmark; cpp, rust, and julia labels run identical machine code, while li links the same .c file through lic (LI_EXTRA_C).
Modular scope (package-scoped runs)¶
Changing one workspace package should not rerun the full tier-1/2 matrix.
| Tool | Purpose |
|---|---|
benchmarks/manifest.toml | Maps packages/* → benchmark ids, composables, hooks |
benchmarks/harness/bench_scope.py | Resolve scope: --package, --changed, --print-benches |
./scripts/bench-package.sh | Run scoped verify, composables, optional --timing |
bench.py --only / --package / --changed | Merge timings for subset into latest.csv |
bench_sim.py | Sim registry + summaries + selective tier-2 smokes |
./scripts/bench-package.sh li-sim-scientific --write-summary
./scripts/bench-package.sh --changed --timing --runs 1
python3 benchmarks/harness/bench.py --tier 2 --only md_lennard_jones,heat_equation_2d
Sim agents: sim-agent-handoff.md.
Tiers¶
| Tier | Scope | CI gate |
|---|---|---|
| 0 | li-tests + verify.py tier-0 lic build + MD stability (strict) | ./scripts/ci.sh |
| 1 | Micro kernels (simd_dot, matmul, horner) | ./scripts/ci-bench.sh |
| 2 | Physics sims (MD, N-body, wave, heat, pendulum) | manual / weekly workflow |
| 3 | HTTP / li-httpd vs nginx, apache, node, … | benchmarks/scripts/run-full-benchmark-suite.sh |
| 4 | HTTP exploit grid (CVE-style harness) | same suite (SKIP_EXPLOITS=1 to skip) |
| 5 | Ecosystem: compile time, lip/lit smoke, security timing | deferred — RUN_TIER5_ECOSYSTEM=1 |
Stability vs speed¶
- Throughput rows use low-density lattice params (
params.toml[perf]). - Conservation rows use
md_stress.cat liquid density ρ=0.75. - Strict gates: harmonic oscillator (Swope) + momentum invariant.
- Advisory gates: Allen–Tildesley energy MSD + timestep halving (not CI-blocking).
./scripts/check-bench-harness-contract.sh # verify/timing contract + tier-0 smoke (needs lic)
python3 benchmarks/harness/verify.py # tier-0 only (same smokes as bench --tier 0)
python3 benchmarks/harness/stability.py
python3 benchmarks/harness/bench.py --tier 12 --runs 3 --skip-verify
# Optional when ecosystem benches are ready:
# RUN_TIER5_ECOSYSTEM=1 python3 benchmarks/harness/bench_ecosystem.py --runs 3
./scripts/plot_shareables.sh
Dashboard (li-langverse/benchmarks)¶
CI on lic dev/main uploads benchmarks/results/latest.csv (+ stability.csv, security.csv) and dispatches ingest. Add new benchmark ids to the org catalog.toml — see benchmarks-catalog-additions.toml for tier-3 rows (async, effects, security).
Published site: li-langverse.github.io/benchmarks.
See the benchmarks implementation plan for the full matrix and publication workflow.