Take a look at https://chatjimmy.ai/ -- it's running against Taalas' "hardcore" silicon model, i.e. a dedicated, ASIC-like chip.
Wow - actually pretty astonishing how fast their inference is. So fast it feels fake?