> for simple string literals it will definitely lose to Hyperscan and Rust regex since they have a high effort left-to-right SIMD algorithm that we cannot easily use
I think "simple string literals" undersells it. That description works for engines like RE2 or Go's regex engine, but not for Hyperscan or Rust regex. (And I would put Hyperscan in a different category from even Rust regex.) Granted, it is arguably difficult to be succinct here, since it's a heuristic with difficult-to-predict failure points. But something like: "patterns from which a small number of string literals can be extracted."
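To make "patterns from which a small number of string literals can be extracted" a bit more concrete, here is a toy sketch in Rust. The names `extract_literals` and `find_candidate` are hypothetical and are not the regex crate's API. The extraction only handles a plain top-level alternation, and the scan uses `str::find` as a naive stand-in for a SIMD multi-literal searcher such as Teddy. Real engines extract prefixes, suffixes and inner literals from much richer patterns.

```rust
/// Toy literal extraction (hypothetical helper): only handles a top-level
/// alternation of plain literals such as "foo|bar|quux". Returns None when
/// any branch is empty or contains a regex metacharacter, i.e. when this
/// simplistic heuristic gives up and the full engine would have to run.
fn extract_literals(pattern: &str) -> Option<Vec<&str>> {
    const META: &[char] = &[
        '.', '*', '+', '?', '(', ')', '[', ']', '{', '}', '^', '$', '\\',
    ];
    let branches: Vec<&str> = pattern.split('|').collect();
    if branches.iter().any(|b| b.is_empty() || b.contains(META)) {
        return None;
    }
    Some(branches)
}

/// Naive prefilter (hypothetical helper): byte offset of the leftmost
/// occurrence of any literal. A real engine would use a vectorized
/// multi-substring search here and then confirm the match with the
/// regex engine proper.
fn find_candidate(haystack: &str, literals: &[&str]) -> Option<usize> {
    literals
        .iter()
        .filter_map(|lit| haystack.find(lit))
        .min()
}
```

With this sketch, `foo|bar|quux` yields three literals and the prefilter can skip straight to a candidate offset, while something like `fo+|bar` makes the toy extractor bail out. That cliff between "literals found" and "no literals found" is the hard-to-predict failure point mentioned above, just in miniature.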
yes, that is correct. also, Rust's engine treats each full Unicode code point as an individual character, whereas .NET's works on UTF-16 code units and will sometimes chop an emoji into two, so Rust is at a disadvantage here.
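A minimal Rust sketch of why the emoji gets chopped in two: a character outside the Basic Multilingual Plane is a single Unicode scalar value but takes two UTF-16 code units (a surrogate pair). The helper names below are made up for illustration, and this only counts units. It does not model either regex engine.

```rust
/// Number of Unicode scalar values in `s`, the unit a code-point-aware
/// matcher would treat as one character.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

/// Number of UTF-16 code units in `s`, the unit an engine operating on
/// UTF-16 strings would step over.
fn utf16_unit_count(s: &str) -> usize {
    s.encode_utf16().count()
}
```

For `"\u{1F600}"` (a grinning-face emoji) `scalar_count` gives 1 and `utf16_unit_count` gives 2, whereas for `"\u{E9}"` (é) both give 1.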
something i've also been wondering is how Harry (https://ieeexplore.ieee.org/document/10229022) compares to the Teddy algorithm. it's written by some of the same authors, and i wonder if it's used in any engines outside of Hyperscan today.