What about using a Modelfile for ollama that tweaks the context window size? I seem to remember parameters for that in the ollama GitHub docs.
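For reference, here is a minimal sketch of what such a Modelfile could contain, rendered from Python so the limit can be filled in per model. `num_ctx` is the Modelfile parameter for the context window size; the helper name `render_modelfile` and the base model name `llama3` are placeholders for illustration, not anything taken from this project.

```python
def render_modelfile(base_model: str, num_ctx: int) -> str:
    """Return Modelfile text that sets the context window for base_model."""
    if num_ctx <= 0:
        raise ValueError("num_ctx must be positive")
    # FROM picks the base model; PARAMETER num_ctx sets the context size in tokens.
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


if __name__ == "__main__":
    # "llama3" is only an example base model name (assumption).
    print(render_modelfile("llama3", 4096), end="")
```

Saving the output as `Modelfile` and running `ollama create <name> -f Modelfile` should then produce a model variant with that context size.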
For now I've added a pre-filled table of context limits, with 4096 as the default. Users can also set a higher or lower limit directly from the UI. I also added chunked and recursive summarization.
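Roughly, the idea can be sketched as below. This is not the actual implementation: the table contents, the function names, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration, and `summarize` is whatever callable sends text to the model.

```python
from typing import Callable, Optional

DEFAULT_LIMIT = 4096

# Hypothetical pre-filled table; the entry below is a placeholder, not a real model.
CONTEXT_LIMITS = {"example-model": 8192}


def context_limit(model: str, override: Optional[int] = None) -> int:
    """A limit set by the user (e.g. from the UI) wins, then the table, then the default."""
    if override is not None:
        return override
    return CONTEXT_LIMITS.get(model, DEFAULT_LIMIT)


def chunk_words(text: str, limit: int) -> list[str]:
    """Split text into chunks of at most `limit` words (words approximate tokens)."""
    words = text.split()
    return [" ".join(words[i:i + limit]) for i in range(0, len(words), limit)]


def summarize_recursive(text: str, summarize: Callable[[str], str], limit: int) -> str:
    """Summarize each chunk, join the partial summaries, repeat until they fit in one call."""
    if limit < 1:
        raise ValueError("limit must be positive")
    if len(text.split()) <= limit:
        return summarize(text)
    combined = " ".join(summarize(chunk) for chunk in chunk_words(text, limit))
    # Stop instead of looping forever if the summarizer fails to shrink its input.
    if len(combined.split()) >= len(text.split()):
        raise ValueError("summarizer did not shrink the text")
    return summarize_recursive(combined, summarize, limit)
```

A real version would count tokens with the model's tokenizer and leave headroom for the prompt and the generated summary.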