Instead of making two passes over all candidates with `map` followed by
`filter`, this uses `filter_map` so the candidates are traversed only once.
This alone is a significant improvement of ~7%.
Output of `./scripts/bench 0.x`:
Benchmark 1: 0.x
Time (mean ± σ): 2.373 s ± 0.138 s [User: 10.617 s, System: 1.697 s]
Range (min … max): 2.124 s … 2.577 s 10 runs
Benchmark 2: HEAD
Time (mean ± σ): 2.206 s ± 0.133 s [User: 10.061 s, System: 1.811 s]
Range (min … max): 1.940 s … 2.433 s 10 runs
Summary
HEAD ran
1.08 ± 0.09 times faster than 0.x
-------------------------------------
The percentage difference is -7.00%
-------------------------------------
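As a rough illustration of the single-pass change described above, here is a
minimal sketch. The `score` function is a hypothetical stand-in for the real
matcher, and `two_pass` assumes the earlier code materialised an intermediate
collection between the two steps; neither is taken from the actual sources.

```rust
/// Hypothetical scorer (not the real matcher): counts how many characters of
/// `pattern` appear in `candidate`, in order.
fn score(pattern: &str, candidate: &str) -> i64 {
    let mut remaining = pattern.chars().peekable();
    let mut matched = 0;
    for c in candidate.chars() {
        if remaining.peek() == Some(&c) {
            remaining.next();
            matched += 1;
        }
    }
    matched
}

/// Assumed "before" shape: score everything into an intermediate Vec, then
/// walk that Vec a second time to drop the low scores.
fn two_pass<'a>(pattern: &str, candidates: &[&'a str], minimum: i64) -> Vec<(&'a str, i64)> {
    let scored: Vec<(&'a str, i64)> = candidates
        .iter()
        .copied()
        .map(|c| (c, score(pattern, c)))
        .collect();
    scored.into_iter().filter(|(_, s)| *s > minimum).collect()
}

/// "After" shape: score and filter each candidate in one traversal.
fn single_pass<'a>(pattern: &str, candidates: &[&'a str], minimum: i64) -> Vec<(&'a str, i64)> {
    candidates
        .iter()
        .copied()
        .filter_map(|c| {
            let s = score(pattern, c);
            // Keep the candidate only when it clears the threshold.
            (s > minimum).then_some((c, s))
        })
        .collect()
}
```

Both functions return the same matches; the second simply avoids building and
re-walking the intermediate list.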
- Update the provided `minimum_score` in `sorter::Option::new` to match
what was being used in `sort_strings`
- Use the `minimum_score` value instead of a hardcoded number
This seems like functionality that was either intended and never added, or
added and then partly removed. Either way the performance impact is
minimal and it's a nice idea.
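A minimal sketch of the idea of threading a configurable threshold through
instead of comparing against a literal. The `Options` type, the toy `score`
function, the simplified `sort_strings`, and the default value are all
hypothetical placeholders here, not the real definitions.

```rust
/// Arbitrary placeholder; the real default is not stated in this message.
const DEFAULT_MINIMUM_SCORE: i64 = 2;

/// Hypothetical options type, loosely modelled on the sorter options.
pub struct Options {
    pub pattern: String,
    pub minimum_score: i64,
}

impl Options {
    pub fn new(pattern: String) -> Self {
        Self { pattern, minimum_score: DEFAULT_MINIMUM_SCORE }
    }
}

/// Toy scorer (an assumption): counts candidate characters that also occur
/// anywhere in the pattern.
fn score(pattern: &str, candidate: &str) -> i64 {
    candidate.chars().filter(|c| pattern.contains(*c)).count() as i64
}

/// Simplified stand-in for `sort_strings`: filter by the configured
/// threshold, then order by descending score.
pub fn sort_strings(options: &Options, candidates: Vec<String>) -> Vec<(String, i64)> {
    let mut matches: Vec<(String, i64)> = candidates
        .into_iter()
        .filter_map(|c| {
            let s = score(&options.pattern, &c);
            // Compare against the option value rather than a hardcoded number.
            (s > options.minimum_score).then_some((c, s))
        })
        .collect();
    matches.sort_unstable_by(|a, b| b.1.cmp(&a.1));
    matches
}
```

With this shape, callers can loosen or tighten the cut-off by setting
`minimum_score` on the options value, and the default lives in one place.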
- For completeness, but also for additional performance when there are
extremely large numbers of results, use `par_sort_unstable_by()` for
sorting the results. For most sane result sets this will not represent
a significant speedup (for the Kubernetes benchmark it's around 1%)
but as the set to be sorted grows the impact would be larger.
- Use `into_par_iter()` before setting out to calculate scores and then
filter by them
This represents a more efficient parallelism approach, with no mutex
or global state at top level.
ivy_files(kubernetes) time: [4.5800 ms 4.6121 ms 4.6467 ms]
change: [-55.056% -54.570% -54.133%] (p = 0.00 < 0.05)
Performance has improved.
ivy_match(file.lua) time: [1.1514 µs 1.1599 µs 1.1694 µs]
change: [+0.4116% +2.0753% +3.6710%] (p = 0.01 < 0.05)
Change within noise threshold.
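For the sorting bullet above, a minimal sketch of an unstable sort by
descending score. To stay dependency-free it uses the standard library's
`sort_unstable_by`; rayon's `par_sort_unstable_by` accepts the same kind of
comparator once `rayon::prelude::*` is in scope. The `Match` type and the
tie-break on content are assumptions made for this illustration.

```rust
use std::cmp::Ordering;

/// Hypothetical scored result.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Match {
    pub score: i64,
    pub content: String,
}

/// Highest score first. Ties are broken by content so the output is
/// deterministic even though an unstable sort may reorder equal elements.
pub fn by_score_desc(a: &Match, b: &Match) -> Ordering {
    b.score
        .cmp(&a.score)
        .then_with(|| a.content.cmp(&b.content))
}

pub fn sort_matches(matches: &mut [Match]) {
    // Sequential version; the parallel variant from rayon would be called
    // with this same comparator.
    matches.sort_unstable_by(by_score_desc);
}
```

An unstable sort avoids the extra allocation a stable sort needs, which is
why an explicit tie-break is worth having if result order must be repeatable.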
The relevant processes are so fast, mutexes and mutex locks are so
expensive, and iterators so efficient, that it's actually faster to run
single-threaded across all the data than to spin up a bunch of threads
and have them basically spinlock waiting for the global mutex involved
either directly or in a channel.
ivy_files(kubernetes) time: [10.209 ms 10.245 ms 10.286 ms]
change: [-36.781% -36.178% -35.601%] (p = 0.00 < 0.05)
Performance has improved.
ivy_match(file.lua) time: [1.1626 µs 1.1668 µs 1.1709 µs]
change: [+0.2131% +1.5409% +2.9109%] (p = 0.02 < 0.05)
Change within noise threshold.
- Use an asynchronous (i.e. unbounded buffer) MPSC channel instead of an
`Arc<Mutex<Vec>>` for storing the scored matches in Sorter
- Use `Arc<Matcher>` instead of `Arc<Mutex<Matcher>>` for the matcher, as
it's not mutated and appears to be thread-safe.
This cuts average iteration time (on the benchmarked machine) from
25.98 ms to 16.08 ms for the ivy_files benchmark.
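A minimal sketch of the two changes listed above, using only the standard
library: workers share a read-only matcher through a plain `Arc` and push
results into an unbounded `mpsc::channel` rather than locking a shared `Vec`.
The `Matcher` type, its toy scoring, and `score_in_threads` are hypothetical
names for illustration, not the real Sorter code.

```rust
use std::sync::{mpsc, Arc};
use std::thread;

/// Hypothetical read-only matcher. `score` takes `&self`, so sharing it
/// across threads needs only an `Arc`, not a `Mutex`.
pub struct Matcher {
    pattern: String,
}

impl Matcher {
    pub fn new(pattern: &str) -> Self {
        Self { pattern: pattern.to_string() }
    }

    /// Toy scoring (an assumption): counts candidate characters that occur
    /// anywhere in the pattern.
    pub fn score(&self, candidate: &str) -> i64 {
        candidate.chars().filter(|c| self.pattern.contains(*c)).count() as i64
    }
}

pub fn score_in_threads(
    pattern: &str,
    candidates: Vec<String>,
    workers: usize,
    minimum: i64,
) -> Vec<(String, i64)> {
    let matcher = Arc::new(Matcher::new(pattern));
    // `mpsc::channel` is the unbounded channel: `send` never blocks.
    let (tx, rx) = mpsc::channel::<(String, i64)>();

    let workers = workers.max(1);
    let chunk_size = ((candidates.len() + workers - 1) / workers).max(1);
    let mut handles = Vec::new();
    for chunk in candidates.chunks(chunk_size) {
        let chunk = chunk.to_vec();
        let matcher = Arc::clone(&matcher);
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            for candidate in chunk {
                let s = matcher.score(&candidate);
                if s > minimum {
                    tx.send((candidate, s)).expect("receiver dropped");
                }
            }
        }));
    }
    // Drop the original sender so the receiver sees the channel close once
    // every worker has finished and dropped its clone.
    drop(tx);

    let mut matches: Vec<(String, i64)> = rx.iter().collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    // Arrival order depends on thread scheduling, so sort before returning.
    matches.sort_unstable_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    matches
}
```

Results arrive in whatever order the workers produce them, so the final sort
is what makes the output stable from run to run.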