Instead of making two passes over the candidates with a `map` followed
by a `filter`, this uses `filter_map` so we only make a single pass over
the candidates. This alone is a significant improvement of ~7%.
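A minimal sketch of the shape of the change; `score` here is a
hypothetical stand-in for the real matcher, not the project's actual
function:

```rust
// Hypothetical stand-in for the real scoring function.
fn score(candidate: &str) -> i64 {
    candidate.len() as i64
}

fn collect_matches(candidates: &[String]) -> Vec<(i64, String)> {
    // Before: map every candidate to a score, then filter the results.
    // candidates.iter()
    //     .map(|c| (score(c), c.clone()))
    //     .filter(|(s, _)| *s > 0)
    //     .collect()

    // After: score and filter in a single `filter_map` step.
    candidates
        .iter()
        .filter_map(|c| {
            let s = score(c);
            (s > 0).then(|| (s, c.clone()))
        })
        .collect()
}
```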
Output of `./scripts/bench 0.x`
Benchmark 1: 0.x
Time (mean ± σ): 2.373 s ± 0.138 s [User: 10.617 s, System: 1.697 s]
Range (min … max): 2.124 s … 2.577 s 10 runs
Benchmark 2: HEAD
Time (mean ± σ): 2.206 s ± 0.133 s [User: 10.061 s, System: 1.811 s]
Range (min … max): 1.940 s … 2.433 s 10 runs
Summary
HEAD ran
1.08 ± 0.09 times faster than 0.x
-------------------------------------
The percentage difference is -7.00%
-------------------------------------
Move some of the iteration into Lua and access the values by index to
reduce the number of loops we need to do to get items into the results
buffer.
Currently the flow is:
1) Filter and sort the candidates in Rust
2) Convert to a string and pass it to Lua
3) Split the string and add the pieces as lines in a buffer in Lua
Now the flow is:
1) Filter and sort the candidates in Rust
2) Loop over an iterator in Lua
3) Pass each item to Lua as a pointer, accessed by index
This removes quite a bit of the work needed to get the data into Lua as
a table. First, we remove the loop that joins the results vector into
one string. Then we remove the copy of that string into Lua. Finally, we
remove the loop that splits the string and builds a table from it in
Lua. All of this adds up to a 12% speed-up.
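A minimal sketch of the new shape, assuming the `mlua` crate and
hypothetical names (`register_results`, `ivy_result_count`,
`ivy_result_at`); the real bindings may differ:

```rust
use mlua::{Lua, Result};

fn register_results(lua: &Lua, results: Vec<String>) -> Result<()> {
    // Tell Lua how many results there are so it can drive the loop.
    lua.globals().set("ivy_result_count", results.len())?;
    // Hand Lua an accessor that returns the item at a 1-based index;
    // each call moves a single line across the boundary instead of
    // joining, copying, and re-splitting the whole result set.
    let get = lua.create_function(move |_, idx: usize| {
        Ok(idx.checked_sub(1).and_then(|i| results.get(i)).cloned())
    })?;
    lua.globals().set("ivy_result_at", get)?;
    Ok(())
}
```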
Output of `./scripts/bench 0.x`
Benchmark 1: HEAD
Time (mean ± σ): 2.667 s ± 0.065 s [User: 8.537 s, System: 1.420 s]
Range (min … max): 2.588 s … 2.767 s 10 runs
Benchmark 2: 0.x
Time (mean ± σ): 2.337 s ± 0.150 s [User: 9.564 s, System: 1.648 s]
Range (min … max): 2.161 s … 2.529 s 10 runs
Summary
HEAD ran
1.14 ± 0.08 times faster than 0.x
-------------------------------------
The percentage difference is -12.00%
-------------------------------------
`once_cell` has now been merged into the Rust standard library. This
removes the `lazy_static` dependency and migrates to the built-in
`OnceLock`. It's always good to remove dependencies where possible, and
this also establishes a preference for the built-in `OnceLock`.
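A sketch of the migration pattern, using an illustrative `Regex` global
rather than the crate's actual statics:

```rust
use regex::Regex;
use std::sync::OnceLock;

// Before, with lazy_static:
//
// lazy_static! {
//     static ref WORD: Regex = Regex::new(r"\w+").unwrap();
// }

// After: the same lazily-initialized global using only the standard library.
fn word_re() -> &'static Regex {
    static WORD: OnceLock<Regex> = OnceLock::new();
    WORD.get_or_init(|| Regex::new(r"\w+").expect("valid regex"))
}
```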
```
Benchmark 1: chore: add benchmark for `set_items`
Time (mean ± σ): 6.327 s ± 0.199 s [User: 15.316 s, System: 1.323 s]
Range (min … max): 6.087 s … 6.712 s 10 runs
Benchmark 2: refactor: remove lazy_static
Time (mean ± σ): 6.171 s ± 0.251 s [User: 15.223 s, System: 1.382 s]
Range (min … max): 5.910 s … 6.776 s 10 runs
Summary
'refactor: remove lazy_static' ran
1.03 ± 0.05 times faster than 'chore: add benchmark for `set_items`'
```
This adds dotfiles to the finder. We add overrides to the `ignore`
crate's walker, which can later be used to add custom ignore directories
passed in as settings.
We also add a new `ivy_cwd` function to libivy to get the current
working directory, due to the limitations of Lua.
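A sketch of the walker setup with the `ignore` crate; the root path and
the example override glob are illustrative, not the project's actual
configuration:

```rust
use ignore::{overrides::OverrideBuilder, WalkBuilder};

fn walk_with_dotfiles(root: &str) -> Result<(), ignore::Error> {
    // Overrides are the hook for user-supplied ignore directories later;
    // in override globs, a `!` prefix means "ignore this path".
    let mut overrides = OverrideBuilder::new(root);
    overrides.add("!.git/")?; // illustrative: still skip the .git directory
    let walker = WalkBuilder::new(root)
        .hidden(false) // include dotfiles instead of skipping them
        .overrides(overrides.build()?)
        .build();
    for entry in walker.flatten() {
        println!("{}", entry.path().display());
    }
    Ok(())
}
```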
Fixes-issue: #16
- Update the provided `minimum_score` in `sorter::Option::new` to match
what was being used in `sort_strings`
- Use the `minimum_score` value instead of a hardcoded number
This seems like functionality that was either intended but never added,
or added and then partly removed. Either way, the performance impact is
minimal and it's a nice idea.
- For completeness, but also for additional performance when there are
extremely large numbers of results, use `par_sort_unstable_by()` for
sorting the results. For most sane result sets this will not represent
a significant speedup (for the Kubernetes benchmark it's around 1%),
but the impact grows as the set to be sorted grows.
- Use `into_par_iter()` before calculating scores and then filtering by
them
This represents a more efficient parallelism approach, with no mutex
or global state at the top level.
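A sketch of the shape of this, assuming rayon and a hypothetical
`fuzzy_score` helper in place of the real matcher:

```rust
use rayon::prelude::*;

// Hypothetical stand-in for the real matcher's scoring.
fn fuzzy_score(query: &str, candidate: &str) -> i64 {
    candidate.matches(query).count() as i64
}

fn sort_candidates(
    candidates: Vec<String>,
    query: &str,
    minimum_score: i64,
) -> Vec<(i64, String)> {
    // Score and filter in parallel; no mutex or shared state is needed
    // because each item is handled independently and collected at the end.
    let mut scored: Vec<(i64, String)> = candidates
        .into_par_iter()
        .filter_map(|c| {
            let score = fuzzy_score(query, &c);
            (score >= minimum_score).then(|| (score, c))
        })
        .collect();
    // Parallel unstable sort: equal scores may reorder, but it's faster,
    // and the win grows with the size of the result set.
    scored.par_sort_unstable_by(|a, b| b.0.cmp(&a.0));
    scored
}
```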
ivy_files(kubernetes) time: [4.5800 ms 4.6121 ms 4.6467 ms]
change: [-55.056% -54.570% -54.133%] (p = 0.00 < 0.05)
Performance has improved.
ivy_match(file.lua) time: [1.1514 µs 1.1599 µs 1.1694 µs]
change: [+0.4116% +2.0753% +3.6710%] (p = 0.01 < 0.05)
Change within noise threshold.
The relevant operations are so fast, mutex locks so expensive, and
iterators so efficient, that it's actually faster to run single-threaded
across all the data than to spin up a bunch of threads and have them
effectively spinlock waiting on the global mutex involved, whether
directly or through a channel.
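A sketch of the single-threaded shape described here, with the matcher
passed as a plain closure (names are illustrative):

```rust
fn score_all(
    items: &[String],
    score: impl Fn(&str) -> Option<i64>, // the matcher, as a plain closure
) -> Vec<(i64, String)> {
    // One thread, one iterator chain: nothing to lock and no channel to
    // wait on between scoring an item and pushing it into the results.
    let mut scored: Vec<(i64, String)> = items
        .iter()
        .filter_map(|item| score(item).map(|s| (s, item.clone())))
        .collect();
    scored.sort_unstable_by(|a, b| b.0.cmp(&a.0));
    scored
}
```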
ivy_files(kubernetes) time: [10.209 ms 10.245 ms 10.286 ms]
change: [-36.781% -36.178% -35.601%] (p = 0.00 < 0.05)
Performance has improved.
ivy_match(file.lua) time: [1.1626 µs 1.1668 µs 1.1709 µs]
change: [+0.2131% +1.5409% +2.9109%] (p = 0.02 < 0.05)
Change within noise threshold.
- Use an async (i.e. unlimited-buffer) MPSC channel instead of an
`Arc<Mutex<Vec>>` for storing the scored matches in Sorter
- Use `Arc<Matcher>` instead of `Arc<Mutex<Matcher>>` for the matcher, as
it's not mutated and appears to be thread-safe.
This cuts average iteration time (on the benchmarked machine) from
25.98ms to 16.08ms for the ivy_files benchmark.
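A sketch of the channel-based collection under these assumptions; the
`Matcher` stub and the chunking are illustrative, not the crate's real
types:

```rust
use std::sync::{mpsc, Arc};
use std::thread;

// Illustrative stand-in for the real matcher; it is never mutated, so it
// can be shared as Arc<Matcher> with no Mutex around it.
struct Matcher {
    query: String,
}

impl Matcher {
    fn score(&self, item: &str) -> Option<i64> {
        item.contains(&self.query).then(|| item.len() as i64)
    }
}

fn score_all(items: Vec<String>, matcher: Arc<Matcher>) -> Vec<(i64, String)> {
    // std's plain `channel()` is the asynchronous (unbounded) MPSC
    // channel: senders never block, so workers never queue up on a lock.
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for chunk in items.chunks(1024).map(<[String]>::to_vec) {
        let tx = tx.clone();
        let matcher = Arc::clone(&matcher);
        handles.push(thread::spawn(move || {
            for item in chunk {
                if let Some(score) = matcher.score(&item) {
                    let _ = tx.send((score, item)); // fails only if rx is gone
                }
            }
        }));
    }
    drop(tx); // drop our sender so the receive loop can terminate
    let scored: Vec<(i64, String)> = rx.iter().collect();
    for handle in handles {
        let _ = handle.join();
    }
    scored
}
```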