Anything that doesn't touch the model parameters once the model is frozen. For example, in streaming ASR with an encoder-decoder, you can gain accuracy just by improving the orchestration between encoder and decoder: their ratio, the frequency of forward passes, and dynamically adjusting the length of the rolling window (if using full attention). Prompting is part of this too, including few-shot examples. So is the decoding strategy (greedy, top-k, nucleus, speculative decoding, or anything else). You can also apply signal processing, or any other kind of processing, to the input before it reaches the model, or to the output afterwards. There is a lot you can do.
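To make the decoding-strategy point concrete, here is a minimal sketch of top-k plus nucleus (top-p) sampling applied to raw logits. Everything here happens after the forward pass, so no parameters are touched; the function name and defaults are illustrative, not from any particular library.

```python
import numpy as np

def sample_next_token(logits, top_k=50, top_p=0.9, temperature=1.0, rng=None):
    """Sample the next token id from raw logits using top-k, then nucleus
    (top-p) filtering. Pure post-processing of the model's output."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # Top-k: mask out everything below the k-th highest logit.
    if top_k is not None and top_k < logits.size:
        kth = np.sort(logits)[-top_k]
        logits = np.where(logits < kth, -np.inf, logits)

    # Softmax over the surviving tokens.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Nucleus: keep the smallest set of tokens whose cumulative mass >= top_p.
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1
    keep = order[:cutoff]
    mask = np.zeros_like(probs)
    mask[keep] = probs[keep]
    mask /= mask.sum()

    return int(rng.choice(logits.size, p=mask))
```

Greedy decoding falls out as the limit of this (temperature near zero, or top_k=1); swapping strategies is a pure inference-time knob.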
Also think about the program-synthesis approach proposed by Poetiq.ai: Python programs are generated and evaluated against previous examples, then in-context learning is done programmatically via prompt concatenation. If you can score the working and non-working examples online, you have a very strong reward signal.
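The details of Poetiq.ai's pipeline aren't spelled out here, so the following is only a generic sketch of the scoring-plus-prompt-concatenation loop, with hypothetical names (`solve`, `score_program`, `refine_prompt`) standing in for whatever the real system uses:

```python
def score_program(src, examples):
    """Reward signal: fraction of (input, expected) pairs a candidate
    program gets right. `src` is assumed to define a function solve(x)."""
    ns = {}
    try:
        exec(src, ns)  # caution: only for trusted/sandboxed candidates
        solve = ns["solve"]
        hits = sum(1 for x, y in examples if solve(x) == y)
        return hits / len(examples)
    except Exception:
        return 0.0

def refine_prompt(base_prompt, scored_candidates):
    """Programmatic in-context learning: concatenate scored candidates onto
    the prompt so the next generation round sees what worked and what didn't."""
    parts = [base_prompt]
    for src, score in scored_candidates:
        tag = "WORKS" if score == 1.0 else f"FAILS (score={score:.2f})"
        parts.append(f"# {tag}\n{src}")
    return "\n\n".join(parts)
```

In a full loop, a model would generate new candidate programs from the refined prompt, each round's scores would filter the candidates, and the prompt would accumulate the evidence; the score acts as the online reward the note describes.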