This is possible, though not for training from scratch; rather, for fine-tuning existing open-source models.
This can be mainstream, and then custom model fine-tuning becomes the new “software development”.
Please check out this new fine-tuning method for LLMs from MIT and ETH Zurich teams, which used a single NVIDIA H200 GPU [1], [2], [3].
Full fine-tuning of all of the model's parameters was performed using the Hugging Face TRL library.
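For readers unfamiliar with TRL, a full-parameter supervised fine-tuning run looks roughly like the sketch below. The model id, dataset, and hyperparameters are illustrative assumptions, not the paper's actual setup, and running this requires a GPU and network access:

```python
# Hypothetical sketch of full-parameter SFT with Hugging Face TRL.
# Model id ("Qwen/Qwen2.5-0.5B") and dataset ("trl-lib/Capybara")
# are placeholders, not the paper's configuration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="full-sft-demo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # helps full fine-tuning fit on one GPU
    bf16=True,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # TRL loads the model with all parameters trainable
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

Unlike LoRA-style adapter methods, no `peft_config` is passed here, so every weight in the model is updated.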
[1] MIT's new fine-tuning method lets LLMs learn new skills without losing old ones (news):
https://venturebeat.com/orchestration/mits-new-fine-tuning-m...
[2] Self-Distillation Enables Continual Learning (paper):
https://arxiv.org/abs/2601.19897
[3] Self-Distillation Enables Continual Learning (code):