What's your workflow like? I use AI Chat. I load Qwen2.5-1.5B-Instruct with llama.cpp server, f...

XMasterrrr • 01/21/2025 • 0 replies • view on HN

What's your workflow like? I use AI Chat. I load Qwen2.5-1.5B-Instruct with llama.cpp server, fully offloaded to the CPU, and then I config AI Chat to connect to the llama.cpp endpoint.

alt Hacker News