
RicoElectrico · 10/11/2024 · 4 replies

I've found that, for the most part, the articles I want summarized are the ones that only fit the largest-context models such as Claude. Otherwise I can just skim-read the article, possibly in reader mode for legibility.

Is llama 2 a good fit considering its small context window?


Replies

tcsenpai · 10/11/2024

Personally I use llama3.1:8b or mistral-nemo:latest, which have decent context windows (even if smaller than the commercial ones, usually). I am also working on a token calculator / content-division method, but it is very early.
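A token calculator along these lines can be sketched with the common ~4 characters/token heuristic for English text. This is an illustrative assumption, not tcsenpai's actual implementation; real counts vary by tokenizer.

```python
# Rough token estimate via the ~4 chars/token heuristic (an assumption;
# actual tokenizers differ per model).
def estimate_tokens(text: str) -> int:
    """Approximate the token count of text."""
    return max(1, len(text) // 4)

def fits_context(text: str, num_ctx: int = 8192, reserve: int = 512) -> bool:
    """Check whether text likely fits the window, reserving room for the
    generated summary."""
    return estimate_tokens(text) <= num_ctx - reserve
```

If the article fails the check, it would be split and summarized in stages instead.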

reissbaker · 10/12/2024

I don't think this is intended for Llama 2? The Llama 3.1 and 3.2 series have very long context windows (128k tokens).

tempodox · 10/12/2024

What about using a Modelfile for ollama that tweaks the context window size? I seem to remember parameters for that in the ollama GitHub docs.
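A minimal Modelfile in this spirit might look like the following; `FROM` and `PARAMETER num_ctx` are the directives documented in ollama's Modelfile reference, while the base model and the window size here are just example choices:

```
FROM llama3.1:8b
PARAMETER num_ctx 32768
```

It would then be built and used with something like `ollama create summarizer -f Modelfile` followed by `ollama run summarizer`. Note that a larger `num_ctx` increases memory use.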

htrp · 10/13/2024

Do multi-stage summarization?
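Multi-stage (map-reduce style) summarization could be sketched against a local ollama server as below. The endpoint and request shape follow ollama's REST API (`/api/generate` with `model`, `prompt`, `stream`); the model name, chunk size, and prompts are illustrative assumptions.

```python
import json
import urllib.request

def chunk_text(text: str, max_chars: int = 4000) -> list[str]:
    """Naive split into pieces small enough for a limited context window."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def ollama_generate(prompt: str, model: str = "llama3.1:8b") -> str:
    """Call the local ollama /api/generate endpoint, non-streaming."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def summarize(text: str) -> str:
    # Map stage: summarize each chunk independently.
    partials = [ollama_generate(f"Summarize this passage:\n\n{c}")
                for c in chunk_text(text)]
    # Reduce stage: merge the partial summaries into one.
    return ollama_generate("Combine these partial summaries into one "
                           "coherent summary:\n\n" + "\n\n".join(partials))
```

The trade-off is that detail can be lost at the reduce stage, but it lets a small-context model handle arbitrarily long articles.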
