Looking at the English keyboard and the English digraphs, the coverage doesn't seem that well optimized. Adjacent key pairs currently capture 8.65% of the digraph weight, but covering just the top five digraphs would account for over 5% by itself.
I also feel like distance travelled is the wrong (or an incomplete) metric. Change in direction seems like a good proxy for mental or physical effort. To take it to an extreme, I'd be very satisfied with a keyboard that had me move my thumb in a circle as on the original iPod, provided it just read my mind and inputted the right text. That's extreme distance but little effort.
https://pi.math.cornell.edu/%7Emec/2003-2004/cryptography/su...
See also: https://en.wikipedia.org/wiki/Typewise
+---------+---------------+-----------+-------------------------------------+
| Digraph | Frequency (%) | Adjacent? | Position on Keyboard                |
+---------+---------------+-----------+-------------------------------------+
| TH      | 1.52          | Yes       | T is right of H                     |
| HE      | 1.28          | No        | Separated by O and [Space]          |
| IN      | 0.94          | Yes       | I is top-left of N                  |
| ER      | 0.94          | Yes       | E is below R                        |
| AN      | 0.82          | No        | A is bottom-center; N is top-right  |
| RE      | 0.68          | Yes       | R is above E                        |
| ND      | 0.63          | No        | N is top-right; D is bottom-right   |
| AT      | 0.59          | No        | Separated by [Space] and S          |
| ON      | 0.57          | No        | Separated by H and T                |
| NT      | 0.56          | Yes       | N is top-right of T                 |
| HA      | 0.56          | No        | Separated by [Space]                |
| ES      | 0.56          | No        | Separated by [Space]                |
| ST      | 0.55          | Yes       | S is below T                        |
| EN      | 0.55          | No        | N/E are on opposite sides           |
| ED      | 0.53          | No        | E is center-left; D is bottom-right |
| TO      | 0.52          | No        | Separated by H                      |
| IT      | 0.50          | Yes       | I is above T                        |
| OU      | 0.50          | Yes       | O is below U                        |
| EA      | 0.47          | Yes       | E is top-left of A                  |
| HI      | 0.46          | Yes       | H is below-left of I                |
| IS      | 0.46          | No        | Separated by T                      |
| OR      | 0.43          | Yes       | O is below R                        |
| TI      | 0.34          | Yes       | T is below I                        |
| AS      | 0.33          | Yes       | A is below-left of S                |
| TE      | 0.27          | No        | Separated by H and [Space]          |
| ET      | 0.19          | No        | Separated by H and [Space]          |
| NG      | 0.18          | Yes       | N is above G                        |
| OF      | 0.16          | Yes       | O is below F                        |
| AL      | 0.09          | Yes       | A is right of L                     |
| DE      | 0.09          | No        | E/D are distant                     |
+---------+---------------+-----------+-------------------------------------+
I agree that distance is not a great metric. The maximum travel distance on a smartphone screen is already tiny. I'd say the best metric is accuracy, or lack of ambiguity: something like the average confidence that a given swipe means one particular word and not another. (This assumes swipe-based word entry, which I much prefer to anything tap-based.)
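For anyone who wants to check the coverage numbers against the digraph table upthread, here is a minimal sketch. The rows are copied from that table as posted; the names `ROWS`, `adjacent_coverage`, and `top_n_weight` are just illustrative and not from any library.

```python
# (digraph, frequency in %, adjacent on the layout?) copied from the table upthread.
ROWS = [
    ("TH", 1.52, True),  ("HE", 1.28, False), ("IN", 0.94, True),
    ("ER", 0.94, True),  ("AN", 0.82, False), ("RE", 0.68, True),
    ("ND", 0.63, False), ("AT", 0.59, False), ("ON", 0.57, False),
    ("NT", 0.56, True),  ("HA", 0.56, False), ("ES", 0.56, False),
    ("ST", 0.55, True),  ("EN", 0.55, False), ("ED", 0.53, False),
    ("TO", 0.52, False), ("IT", 0.50, True),  ("OU", 0.50, True),
    ("EA", 0.47, True),  ("HI", 0.46, True),  ("IS", 0.46, False),
    ("OR", 0.43, True),  ("TI", 0.34, True),  ("AS", 0.33, True),
    ("TE", 0.27, False), ("ET", 0.19, False), ("NG", 0.18, True),
    ("OF", 0.16, True),  ("AL", 0.09, True),  ("DE", 0.09, False),
]


def adjacent_coverage(rows):
    """Sum the frequency of every digraph whose keys are adjacent."""
    return round(sum(freq for _, freq, adjacent in rows if adjacent), 2)


def top_n_weight(rows, n):
    """Sum the frequency of the n most common digraphs, adjacent or not."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return round(sum(freq for _, freq, _ in ranked[:n]), 2)


print(adjacent_coverage(ROWS))  # 8.65
print(top_n_weight(ROWS, 5))    # 5.5
```

So the 8.65% is exactly the sum of the "Yes" rows, and the five most frequent digraphs in the table (TH, HE, IN, ER, AN) add up to 5.5% on their own, even though two of them are not adjacent on this layout.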
> distance travelled is the wrong (or an incomplete) metric.
Indeed, most of these keyboard-optimization algorithms rely on metrics that are only plausibly useful and on text that is only plausibly realistic. How many designs account for the fact that you make typos and need to correct them? Is the backspace location accounted for? What about symbols?