You're supposed to use a cheap ChatGPT subscription to run optimization loops over llama.cpp flags with a self-contained, reproducible benchmark script, and just let it burn for hours or days until it's fully optimized ))))
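For anyone wondering what such a loop could look like, here is a rough sketch, not a definitive setup. It swaps the LLM-driven part for a plain grid search, so it only shows the "benchmark script plus loop" half of the idea. The flag values, the `model.gguf` path, and the helper names (`candidates`, `run_llama_bench`, `optimize`) are all made up for illustration. The runner assumes a `llama-bench` binary on your PATH that accepts `-o json` and reports an `avg_ts` (tokens per second) field per row; check that against your own build before trusting it.

```python
import itertools
import json
import subprocess

# Hypothetical search space: flags and values are assumptions, adjust to your build.
SEARCH_SPACE = {
    "-t": [4, 8, 16],     # CPU threads
    "-ngl": [0, 20, 99],  # layers offloaded to the GPU
    "-b": [512, 2048],    # batch size
}


def candidates(space):
    """Yield every flag combination as a dict (plain grid search)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def run_llama_bench(config, binary="llama-bench", model="model.gguf"):
    """Run one benchmark and return generation tokens/sec.

    Assumption: the binary prints a JSON list of rows, each with 'avg_ts',
    and generation rows have 'n_gen' > 0.
    """
    cmd = [binary, "-m", model, "-o", "json"]
    for flag, value in config.items():
        cmd += [flag, str(value)]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    rows = json.loads(out)
    gen_rows = [r for r in rows if r.get("n_gen", 0) > 0] or rows
    return max(r["avg_ts"] for r in gen_rows)


def optimize(space, benchmark):
    """Try every config, skip the ones that crash, return the best plus a history."""
    best_config, best_score, history = None, float("-inf"), []
    for config in candidates(space):
        try:
            score = benchmark(config)
        except Exception as exc:  # e.g. out of memory or an unsupported flag
            history.append({"config": config, "error": str(exc)})
            continue
        history.append({"config": config, "score": score})
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, history


if __name__ == "__main__":
    best, score, _ = optimize(SEARCH_SPACE, run_llama_bench)
    print(f"best: {best} at {score:.1f} t/s")
```

The benchmark is passed in as a function, so the loop can be dry-run with a fake scorer before pointing it at real hardware. Keeping the `history` list around is what makes a multi-hour run worth it: you can dump it to a file and hand it to whatever is proposing the next round of flags.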