streamer45 • yesterday at 5:20 PM

All good, cheers!

Per the RAM comment, you may be able to get it to run locally with two tweaks:

1) Free the T5 text encoder as soon as the text is encoded, so you reclaim its GPU RAM

2) Manual layer offloading: move each layer off the GPU once it's done being used, to free up space for the remaining layers + activations
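Rough sketch of both tweaks, in case it helps. This is not taken from any particular repo: `encode_then_offload`, `run_with_offload`, and the `on_offload` hook are hypothetical names, and it only assumes PyTorch-style modules, i.e. objects with an in-place `.to(device)` that can be called on an input.

```python
def encode_then_offload(encoder, tokens, device="cuda",
                        offload_device="cpu", on_offload=None):
    """Tweak 1: run the text encoder once, then move it off the GPU.

    Hypothetical helper. `tokens` is assumed to already live on `device`.
    """
    encoder.to(device)          # weights occupy GPU memory only for this call
    embeddings = encoder(tokens)
    encoder.to(offload_device)  # weights leave the GPU; embeddings stay
    if on_offload is not None:
        on_offload()            # optional hook, e.g. an allocator cache flush
    return embeddings


def run_with_offload(layers, x, device="cuda",
                     offload_device="cpu", on_offload=None):
    """Tweak 2: keep only the layer that is currently running on the GPU.

    Hypothetical helper. Trades extra host<->device copies for a lower
    peak memory footprint.
    """
    for layer in layers:
        layer.to(device)          # bring this layer's weights onto the GPU
        x = layer(x)              # activations stay on the GPU between layers
        layer.to(offload_device)  # make room for the next layer
        if on_offload is not None:
            on_offload()
    return x
```

With real PyTorch modules you would likely call these under `torch.no_grad()` and pass `on_offload=torch.cuda.empty_cache` if you want the freed blocks handed back to the driver. Expect it to be slower, since every layer gets copied over on each forward pass.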
