This part:
> On top of that, we integrate a dataset powered by man pages and command outputs to ensure you get the correct command first.
It also makes it sound like they're "providing that dataset" rather than generating it on the user's computer. Wouldn't that mean a potential mismatch between the dataset and whichever version of the software is actually installed? Not to mention that some OSes ship different versions of some software than others. How does it deal with those situations if they're shipping a dataset?
There is no way it is not generated on the user's computer.
"get the correct command first" and "shipping [an external] dataset" are incompatible.