Maybe look into model finetuning/distilation. Unsloth [1] has great guides and provides everything you need to get started on Google Colab for free. [1] https://unsloth.ai/