I've found that, for the most part, the articles I want summarized are the ones that only fit the largest-context models such as Claude; otherwise I can just skim-read the article, possibly in reader mode for legibility.
Is llama 2 a good fit considering its small context window?
I don't think this is intended for Llama 2? The Llama 3.1 and 3.2 series have very long context windows (128k tokens).
What about using a Modelfile for ollama that tweaks the context window size? I seem to remember parameters for that in the ollama GitHub docs.
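If I remember the format right, it's the num_ctx parameter, so something like this (the model name and context size here are just examples; check the current Modelfile docs for your ollama version):

    # Modelfile
    FROM llama3.1:8b
    PARAMETER num_ctx 32768

    # Build and run the larger-context variant:
    #   ollama create llama3.1-32k -f Modelfile
    #   ollama run llama3.1-32k

Worth noting that raising num_ctx also raises memory use, so how far you can push it depends on your hardware.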
Personally I use llama3.1:8b or mistral-nemo:latest, which have a decent context window (even if it is less than the commercial ones usually). I am working on a token calculator / content-splitting method too, but it's still very early.
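Nothing polished to share yet, but the splitting idea is roughly this (a sketch only, assuming a ~4 characters per token estimate; the helper names and ratio are placeholders, not my actual code):

    def estimate_tokens(text: str) -> int:
        # Rough heuristic: ~4 characters per token for English prose.
        # A real tokenizer (ideally the model's own) will give different counts.
        return len(text) // 4

    def chunk_text(text: str, max_tokens: int = 8000) -> list[str]:
        """Split text into paragraph-aligned chunks that fit a model's context window."""
        chunks, current, current_tokens = [], [], 0
        for para in text.split("\n\n"):
            para_tokens = estimate_tokens(para)
            # Start a new chunk when adding this paragraph would exceed the budget.
            # (A single paragraph larger than the budget still becomes its own chunk.)
            if current and current_tokens + para_tokens > max_tokens:
                chunks.append("\n\n".join(current))
                current, current_tokens = [], 0
            current.append(para)
            current_tokens += para_tokens
        if current:
            chunks.append("\n\n".join(current))
        return chunks

Each chunk then gets summarized separately and the partial summaries get combined in a final pass.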