credit_guy, today at 1:11 AM

Here's what's important about this paper: it is written by AMD researchers, which shows AMD is investing in AI research. Is this the same level of achievement as DeepSeek 3.2? Most likely not. Do they have novel ideas? Difficult to say; there are hundreds of new ideas being tried in this space. Is this worthless? Most certainly not. To make progress in this domain (as in any other), you first need to get your feet wet. You need to play with the various components and see how they fit together. The idea in this paper is that you can somehow combine SSMs (like Mamba) with LLMs (like Llama). The examples they give are absolute toys compared to DeepSeek 3.2 (the largest is 8 billion parameters, while DeepSeek 3.2 has 671 billion parameters). The comparison you are trying to make simply does not apply. The good news for all of us is that AMD is working in this space.