Instead of making two passes over all candidates with a `map` followed by
a `filter`, this uses `filter_map` so the candidates are traversed only once.
This alone is a significant improvement of roughly 7%.
Output of `./scripts/bench 0.x`:
Benchmark 1: 0.x
  Time (mean ± σ):     2.373 s ± 0.138 s    [User: 10.617 s, System: 1.697 s]
  Range (min … max):   2.124 s … 2.577 s    10 runs

Benchmark 2: HEAD
  Time (mean ± σ):     2.206 s ± 0.133 s    [User: 10.061 s, System: 1.811 s]
  Range (min … max):   1.940 s … 2.433 s    10 runs

Summary
  HEAD ran
    1.08 ± 0.09 times faster than 0.x
-------------------------------------
The percentage difference is -7.00%
-------------------------------------
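
As a rough illustration of the `map` + `filter` to `filter_map` change described above, here is a minimal sketch. The `score` function and the `two_step` / `single_step` names are hypothetical stand-ins, not the project's actual matcher or call sites; the only assumption is that scoring a candidate yields an `Option`.

```rust
/// Hypothetical scorer: `Some(byte offset)` when `query` occurs in
/// `candidate`, `None` otherwise. Stands in for the real matcher.
fn score(query: &str, candidate: &str) -> Option<usize> {
    candidate.find(query)
}

/// Two-step shape: score everything with `map`, then drop the misses
/// with `filter` and unwrap what is left.
fn two_step<'a>(query: &str, candidates: &[&'a str]) -> Vec<(&'a str, usize)> {
    candidates
        .iter()
        .map(|&c| (c, score(query, c)))
        .filter(|(_, s)| s.is_some())
        .map(|(c, s)| (c, s.unwrap()))
        .collect()
}

/// Single-step shape: `filter_map` scores and discards in one closure,
/// so a non-matching candidate never produces an intermediate value.
fn single_step<'a>(query: &str, candidates: &[&'a str]) -> Vec<(&'a str, usize)> {
    candidates
        .iter()
        .filter_map(|&c| score(query, c).map(|s| (c, s)))
        .collect()
}
```

Both functions return the same matches in the same order; only the shape of the iterator chain differs.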
The work involved is so fast, mutex locking is so expensive, and
iterators are so efficient, that it is actually faster to run
single-threaded across all the data than to spin up a bunch of threads
and have them essentially spin waiting on the global mutex, whether that
mutex is used directly or inside a channel.
ivy_files(kubernetes)   time:   [10.209 ms 10.245 ms 10.286 ms]
                        change: [-36.781% -36.178% -35.601%] (p = 0.00 < 0.05)
                        Performance has improved.
ivy_match(file.lua)     time:   [1.1626 µs 1.1668 µs 1.1709 µs]
                        change: [+0.2131% +1.5409% +2.9109%] (p = 0.02 < 0.05)
                        Change within noise threshold.
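
To make the trade-off described above concrete, here is a minimal sketch of the two shapes being compared. It is not the project's code: `is_match`, `threaded`, and `single_threaded` are hypothetical names, and the threaded variant assumes workers push results into one shared `Mutex<Vec<_>>`.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical cheap per-candidate check, standing in for the real matcher.
fn is_match(query: &str, candidate: &str) -> bool {
    candidate.contains(query)
}

/// Threaded shape: each worker scans a chunk and takes the shared lock
/// for every match it finds, so workers contend on a single mutex.
fn threaded<'a>(query: &str, candidates: &[&'a str], workers: usize) -> Vec<&'a str> {
    let results = Mutex::new(Vec::new());
    let workers = workers.max(1);
    // Ceiling division, clamped to 1 because `chunks(0)` panics.
    let chunk_len = ((candidates.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for chunk in candidates.chunks(chunk_len) {
            let results = &results;
            s.spawn(move || {
                for &c in chunk {
                    if is_match(query, c) {
                        results.lock().unwrap().push(c);
                    }
                }
            });
        }
    });
    results.into_inner().unwrap()
}

/// Single-threaded shape: one iterator chain, no locks, no thread start-up.
fn single_threaded<'a>(query: &str, candidates: &[&'a str]) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|c| is_match(query, c))
        .collect()
}
```

The threaded variant returns matches in whatever order the workers acquire the lock, while the single-threaded one preserves input order. Which shape is faster depends on how cheap the per-item work is relative to locking and thread start-up; the benchmark above reports the measured outcome for this project.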