```
cargo bench --bench run_criterion
+# run a particular criterion benchmark
+cargo bench --bench run_criterion -- <benchmark_name>
+
# to run iai, you need valgrind installed and to install iai-callgrind-runner
# at the same version as is in Cargo.toml:
cargo install iai-callgrind-runner --version 0.7.3
## Setup
-`setup.rs` contains the setup code for the actual benchmarks, which are run
-using the `Benches` struct. `fn benches()` at the top of the file is where the
-benchmarks are defined.
+`setup.rs` contains the setup code to run benchmarks. `fn prolog_benches()` at
+the top of the file is where the benchmarks are defined.
-Benchmarks are organized around running queries against one prolog module file.
-Before any runs start, `Benches::new()` reads the module files and initializes a
-new `scryer_prolog::machine::Machine` for each file; multiple queries can be
-declared to be benchmarked in the context of that module/machine instance.
+Benchmarks are organized around running queries against a prolog module file.
+Before a benchmark starts, `benchmark.setup()` is called, which reads the module
+file and initializes a new `scryer_prolog::machine::Machine`.
Each benchmark measurement is done by running a query against the machine. In
the case of criterion each query is run many times, in the case of iai it's run
* The goal of benchmarking is to know if a library or engine change improved
performance or not.
-* Once a benchmark is defined and named, don't change it's definition. If a
- benchmark needs to change to be more useful, give the new definition a new
- name. This will prevent charts from showing wild changes in performance just
- because the definition changed (see previous).
-* Aim for queries to execute in about 0.1-0.5s realtime. Longer runtimes make it
+* Once a benchmark is defined and named, avoid changing its definition. In
+ general, if a benchmark needs to change to be more useful, give the new
+ definition a new name. This will prevent charts from showing wild changes in
+ performance just because the definition changed (see previous).
+* Aim for queries to execute in less than 0.5s real time. Longer runtimes make it
easier for humans to see big differences, but benchmarks either run 10x slower
(iai) or execute repeatedly to attain statistical significance (criterion) and
- in both cases queries that take 5+ seconds quickly become unweildly.
+ in both cases queries that take longer become cumbersome to run.
* Consider that the library runtime actually parses the text output of the top
- level. So keep the output small and don't use custom outputs or it will fail
- to parse.
+ level. So don't use custom outputs or it will fail to parse. Also keep the
+ output small so it doesn't just benchmark the output parsing code.
* DO test the output of the benchmark run, we don't want to count broken
benchmarks.
* Because a query may run against the same machine multiple times, don't
- [ ] Currently, the execution time to load a module is not benchmarked. It
would be nice to have at least one benchmark for loading a module (probably a
big one).
-- [ ] Adjust the benchmark execution strategy to allow queries to modify the
- engine state (`assertz` etc).
- [ ] Write a new action that consumes the test and benchmark results and plots
them over time and publishes a report (github pages?).
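The guideline about queries running against the same machine more than once can be illustrated with a small sketch. Everything here is hypothetical: `FakeMachine` is a stand-in that only records asserted facts, and the reading of `Strategy::Fresh` (new machine per run) versus `Strategy::Reuse` (one machine shared across runs) is an assumption, since the diff does not show how the strategy is consumed.

```rust
// Hypothetical stand-in for a Prolog machine: it only stores asserted facts.
#[derive(Default)]
struct FakeMachine {
    facts: Vec<String>,
}

impl FakeMachine {
    // Mimics a state-changing query such as `assertz/1` and returns the
    // number of facts known after the call.
    fn assert_fact(&mut self, fact: &str) -> usize {
        self.facts.push(fact.to_string());
        self.facts.len()
    }
}

// Assumed meaning: Fresh builds a new machine for every run,
// Reuse keeps a single machine for all runs.
#[derive(Clone, Copy)]
enum Strategy {
    Fresh,
    Reuse,
}

// Run `query` `times` times under the given strategy and collect the results.
fn run_with<F>(strategy: Strategy, times: usize, mut query: F) -> Vec<usize>
where
    F: FnMut(&mut FakeMachine) -> usize,
{
    let mut shared = FakeMachine::default();
    (0..times)
        .map(|_| match strategy {
            Strategy::Fresh => query(&mut FakeMachine::default()),
            Strategy::Reuse => query(&mut shared),
        })
        .collect()
}

fn main() {
    let reused = run_with(Strategy::Reuse, 3, |m| m.assert_fact("edge(a, b)"));
    let fresh = run_with(Strategy::Fresh, 3, |m| m.assert_fact("edge(a, b)"));
    println!("reuse: {:?}", reused); // prints "reuse: [1, 2, 3]"
    println!("fresh: {:?}", fresh); // prints "fresh: [1, 1, 1]"
}
```

Under reuse, a state-changing query returns a different result on every run, so a fixed set of expected bindings would only match the first one. A fresh machine keeps the result stable, at the price of paying the setup cost on each run.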
use std::{collections::BTreeMap, fs, path::Path};
-use criterion::{black_box, Criterion};
-
use maplit::btreemap;
use scryer_prolog::machine::{
parsed_results::{QueryMatch, QueryResolution, Value},
Machine,
};
-pub fn benches() -> Benches {
- Benches::new(&[
+pub fn prolog_benches() -> BTreeMap<&'static str, PrologBenchmark> {
+ [
+ (
+ "count_edges", // name of the benchmark
+ "benches/edges.pl", // name of the prolog module file to load
+ "independent_set_count(aa, Count).", // query to benchmark in the context of the loaded module
+ Strategy::Reuse,
+ btreemap! { "Count" => Value::try_from("211954906".to_string()).unwrap(), }, // list of expected bindings
+ ),
(
- "benches/edges.pl", // name of the prolog module file to load
- &[
- (
- "count_edges_short", // name of the benchmark
- "independent_set_count(ky, Count).", // query to benchmark in the context of the loaded module
- btreemap! { "Count".to_string() => Value::try_from("2869176".to_string()).unwrap() }, // List of expected bindings
- ),
- (
- "count_edges", // multiple benchmark queries can be defined per module
- "independent_set_count(aa, Count).", // consider making the query adjustable to tune the runtime
- btreemap! { "Count".to_string() => Value::try_from("211954906".to_string()).unwrap(), },
- ),
- ],
+ "count_edges_short",
+ "benches/edges.pl", // use the same file in multiple benchmarks
+ "independent_set_count(ky, Count).", // consider making the query adjustable to tune the run time to ~0.1s
+ Strategy::Reuse,
+ btreemap! { "Count" => Value::try_from("2869176".to_string()).unwrap() },
),
(
+ "numlist_short",
"benches/numlist.pl",
- &[(
- "numlist_short",
- "run_numlist(1000000, Head).",
- btreemap! { "Head".to_string() => Value::try_from("1".to_string()).unwrap()},
- )],
+ "run_numlist(1000000, Head).",
+ Strategy::Reuse,
+ btreemap! { "Head" => Value::try_from("1".to_string()).unwrap()},
),
- ])
+ ]
+ .map(|b| {
+ (
+ b.0,
+ PrologBenchmark {
+ name: b.0,
+ filename: b.1,
+ query: b.2,
+ strategy: b.3,
+ bindings: b.4,
+ },
+ )
+ })
+ .into()
}
-pub struct Benches {
- machines: Vec<Machine>,
- runs: BTreeMap<String, Run>,
+pub enum Strategy {
+ #[allow(dead_code)]
+ Fresh,
+ Reuse,
}
-pub struct Run {
- machine_idx: usize,
- name: &'static str,
- query: &'static str,
- bindings: BTreeMap<String, Value>,
+pub struct PrologBenchmark {
+ pub name: &'static str,
+ pub filename: &'static str,
+ pub query: &'static str,
+ pub strategy: Strategy,
+ pub bindings: BTreeMap<&'static str, Value>,
}
-// Required for using a mutex. It doesn't actually send anything across threads,
-// and this is just a benchmark, so it Should Be Fine(tm). ¯\_(ツ)_/¯
-unsafe impl Send for Benches {}
-
-impl Benches {
- #[allow(clippy::type_complexity)]
- pub fn new(
- benches: &[(
- &'static str,
- &[(&'static str, &'static str, BTreeMap<String, Value>)],
- )],
- ) -> Self {
- let mut machines = vec![];
- let mut runs = BTreeMap::new();
+impl PrologBenchmark {
+ pub fn setup(&self) -> impl FnMut() {
+ let program = fs::read_to_string(self.filename).unwrap();
+ let module_name = Path::new(self.filename)
+ .file_stem()
+ .and_then(|s| s.to_str())
+ .unwrap();
- for b in benches {
- let content = fs::read_to_string(b.0).unwrap();
- let name = Path::new(b.0).file_stem().unwrap().to_str().unwrap();
- let mut machine = Machine::new_lib();
- machine.load_module_string(name, content);
- machines.push(machine);
- let idx = machines.len() - 1;
- runs.extend(b.1.iter().cloned().map(|r| {
- (
- r.0.to_string(),
- Run {
- machine_idx: idx,
- name: r.0,
- query: r.1,
- bindings: r.2,
- },
- )
- }));
- }
+ let mut machine = Machine::new_lib();
+ machine.load_module_string(module_name, program);
- Benches { machines, runs }
- }
+ let benchmark_name = self.name;
+ let query = self.query;
+ let expected = QueryResolution::Matches(vec![QueryMatch::from(self.bindings.clone())]);
- #[allow(dead_code)]
- pub fn run_all_criterion(&mut self, c: &mut Criterion) {
- for (_, runner) in self.runs.iter() {
- let machine = &mut self.machines[runner.machine_idx];
- c.bench_function(runner.name, |b| {
- b.iter(|| {
- Self::run(machine, runner);
- })
- });
+ move || {
+ use criterion::black_box;
+ let result = black_box(machine.run_query(black_box(query.to_string())));
+ match result {
+ Ok(r) => assert_eq!(&r, &expected),
+ Err(e) => panic!("benchmark {} failed with: {}", benchmark_name, e),
+ }
}
}
-
- #[allow(dead_code)]
- pub fn run_once(&mut self, name: &str) {
- let runner = &self.runs[name];
- let machine = &mut self.machines[runner.machine_idx];
- Self::run(machine, runner);
- }
-
- fn run(machine: &mut Machine, runner: &Run) {
- assert_eq!(
- black_box(machine.run_query(black_box(runner.query.to_string()))),
- Ok(QueryResolution::Matches(vec![QueryMatch::from(
- runner.bindings.clone()
- )]))
- );
- }
}
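The shape of the new `setup()` method, which does the expensive preparation once and returns a closure that performs only the measured work, can be sketched without the real dependencies. This is a minimal illustration under stated assumptions: `SketchBenchmark` and `time_runs` are hypothetical names, summing a vector stands in for running a query, and a plain `Instant` loop stands in for the criterion or iai harness.

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-in for a benchmark definition; not the real PrologBenchmark.
struct SketchBenchmark {
    name: &'static str,
    input_len: u64,
    expected: u64,
}

impl SketchBenchmark {
    // Prepare the input once, then hand back a closure that only performs
    // the measured work and checks its result.
    fn setup(&self) -> impl FnMut() {
        // Stand-in for reading the module file and building a machine.
        let data: Vec<u64> = (1..=self.input_len).collect();
        let name = self.name;
        let expected = self.expected;
        move || {
            let sum: u64 = data.iter().sum();
            assert_eq!(sum, expected, "benchmark {} produced a wrong result", name);
        }
    }
}

// Time `iterations` calls of an already prepared closure.
fn time_runs(mut run: impl FnMut(), iterations: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        run();
    }
    start.elapsed()
}

fn main() {
    let bench = SketchBenchmark {
        name: "sum_short",
        input_len: 1_000,
        expected: 500_500,
    };
    // The setup cost is paid here, outside the timed region.
    let run = bench.setup();
    let elapsed = time_runs(run, 100);
    println!("{}: 100 runs took {:?}", bench.name, elapsed);
}
```

Because the closure owns everything it needs, the harness only has to call it. The result check inside the closure is part of what gets timed, which is also true of the `assert_eq!` in the diff above.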