Hacker News

int_19h 01/20/2025

In what sense is Mistral a copy of LLaMA, specifically?


Replies

rvnx 01/21/2025

https://x.com/arthurmensch/status/1752737462663684344?s=46

This is a message from one of Mistral's founders, posted after they accidentally leaked a work-in-progress version that was a fine-tune of LLaMA, and there are a few hints pointing that way.

Like:

> What is the architectural difference between Mistral and Llama? HF Mistral seems the same as Llama except for sliding window attention.

So even their “trained from scratch” models like 7B aren’t that impressive if they just pick the dataset and tweak a few parameters.
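
For anyone unfamiliar with the sliding window attention mentioned in the quote, here is a rough sketch of how it differs from plain causal attention at the mask level. This is only an illustration, not Mistral's or Hugging Face's actual code: the function names and the `window` parameter are made up here, and I'm assuming the window counts the current token.

```python
def causal_mask(seq_len):
    """Plain causal mask: position i may attend to every position j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sliding_window_mask(seq_len, window):
    """Causal mask limited to the last `window` positions.

    Assumption for this sketch: the window includes position i itself,
    so i may attend to j only when 0 <= i - j < window.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the allowed (x) and blocked (.) positions for a short sequence.
    for row in sliding_window_mask(6, 3):
        print("".join("x" if allowed else "." for allowed in row))
```

When `window` is at least as large as the sequence length, the two masks are identical, which is why the difference only shows up on longer inputs.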
