It'd also have been interesting to see some overall profiling data for the initial program, and some discussion of which optimisations to investigate based on that data.
When investigating performance issues, it's often very helpful to run with profiling instrumentation enabled and to start with some top-down "cumulative" (inclusive-time) profiler output. That big-picture view of which functions/phases consume most of the running time tells you where it's worth spending effort.
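For example, something roughly like this with perf (./my_program is just a placeholder here, and the binary needs to be built with debug info for readable symbols):

    # Sample the whole run, capturing call stacks via DWARF unwinding
    perf record --call-graph dwarf ./my_program

    # The "Children" column gives cumulative (inclusive) time per function
    perf report --children --sort symbol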
Getting familiar with Linux's perf [1] tool is also helpful, both for interpreting the summary statistics from perf stat (instructions per cycle, page faults, cache misses, etc.), which can give clues about what to focus on, and for annotating the source line by line with time spent.
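Roughly along these lines (again, ./my_program and my_hot_function are placeholders):

    # High-level counters: cycles, instructions (IPC), page faults,
    # branch misses, ...; -d adds cache-related events
    perf stat -d ./my_program

    # Sample, then annotate a hot function line by line with time spent
    perf record ./my_program
    perf annotate --source my_hot_function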
I'm not familiar with Rust, but e.g. the rustc compiler dev guide has a tutorial on how to profile rustc using perf [2].
[1] Brendan Gregg's Linux perf examples page is an excellent place to start: https://www.brendangregg.com/perf.html

[2] https://rustc-dev-guide.rust-lang.org/profiling/with_perf.ht...