> a hand tuned input loop written in C that takes <<1ms
Yes, I would certainly expect much less than 1ms. Perhaps 1µs should be the goal?