Well, I don’t have 8xH100s, but if I did, I’m probably not gonna donate them to a VC-funded company. Remember “Open”AI?
> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.
So, your garden-variety $0.5M desktop PC, then.
Cool, cool.
[1] https://viperatech.com/shop/nvidia-dgx-h100-p4387-system-640...
me: Oh cool, a project like Folding@Home but for AI compute, maybe I'll contribute as we-
> Decentralized training of INTELLECT-1 currently requires 8x H100 SXM5 GPUs.
me: and for that reason, I'm out
Also, they state that they will later add the ability to contribute your own compute, but how will they solve the problem of having to back-propagate across all of the remote nodes contributing to the project without egregiously slow training times?
Not exactly what I would call decentralized training. More like distributed through multiple data centers.
Decentralized training would be when you can use consumer GPUs. That's not likely to work with backpropagation directly, but it might with one of the backpropagation-approximating algorithms.
But I can already train with 30 different vendors distributed across the US, so why do I need a “decentralized” training system? Decentralized inference makes more sense, as that is where things can be censored.
> solve decentralized training step-by-step to ensure AGI will be open-source, transparent, and accessible
One hell of an uncited leap from "we're multiplying a lot of numbers" to "AGI", as if it is a given
This is cool work. I’ve been watching the slow evolution of this space for a couple of years, and it feels like a good way to ensure AI is owned by and accessible to everyone.
For some purposes a decentrally trained, open source LLM could be just fine? E.g. you want a stochastic parrot that is trained on a large, general purpose corpus of genuine public domain / creative commons content. Having such a tool widely available is still a quantum leap versus Lorem Ipsum. Up to a point, you can take your time. There is no manic race to capitalize on any hype. "slow open AI" instead of "fast closed AGI". Helpfully, the nature of the target corpus does not change every day. You can imagine, e.g., annual revisions, trained and rolled out leisurely. Both costs and benefits get widely distributed.
My initial reaction was quite negative, but having thought it through, I can see the logic in this. Having open models is better than closed models. That said, this page seems like a joke. Someone drank a little too much AI-koolaid methinks.
Decentralised but very high entry barrier.
The main benefit of this type of decentralization seems to be minimizing the node cost. One can rent the cheapest nodes to use in the system. Even the temporary instances can be replaced with others. It’s also easy for system owners to donate time.
So, mostly cost reduction mixed with some cloud vendor diversity.
So just spitballing here but this is likely a souped-up reverse engineered DisTrO [0] under the hood, right? Or could it be something else?
> We quantize the pseudo-gradients to int8, reducing communication requirements by 400x.
Can someone explain whether this reduces the model quality overall?
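Not an answer on model quality, but here is a minimal sketch of what int8 quantization does to a tensor, so the size of the rounding error is concrete. Everything in it is my assumption: the function names are made up, the symmetric per-tensor scale is just the simplest scheme, and the project may well quantize differently. Note that int8 versus float32 is only about 4x on its own, so the rest of the quoted 400x would have to come from somewhere else, presumably from synchronizing only once every many local steps (the quote doesn't say).

```python
import random

def quantize_int8(values):
    """Symmetric per-tensor quantization (an assumed scheme, not the project's)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Map the integers back to approximate float values."""
    return [x * scale for x in q]

random.seed(0)
# Stand-in for a pseudo-gradient: small values centered on zero.
pseudo_grad = [random.gauss(0.0, 0.01) for _ in range(10_000)]

q, scale = quantize_int8(pseudo_grad)
restored = dequantize_int8(q, scale)

max_err = max(abs(a - b) for a, b in zip(pseudo_grad, restored))
bytes_fp32 = 4 * len(pseudo_grad)   # 4 bytes per float32 value
bytes_int8 = len(q) + 4             # 1 byte per value plus one float32 scale

print(f"scale per step:     {scale:.6g}")
print(f"max rounding error: {max_err:.6g}")
print(f"size reduction:     {bytes_fp32 / bytes_int8:.2f}x")
```

The per-element error is bounded by half a quantization step (`scale / 2`). Whether that noise hurts the final model is an empirical question this toy can't settle.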
Yea, come back when you can do this on BOINC.
> Prime Intellect
Ah, yes, Prime Intellect, the AGI that went foom and genocided the universe because it was commanded to preserve human civilization without regard for human values. A strong contender for the least evil hostile superintelligence in fiction. What a wonderful thing to name your AI startup after. What's next, creating the Torment Nexus?
(my position on the book as a whole is more complex, but... really? Really?)
A lot of comments are sneering at various aspects of this press release, and yeah, there's some cringeworthy stuff.
But the technical aspects are pretty cool:
- Fault-tolerant training where nodes can be added and removed mid-run without interrupting the other nodes.
- Sending quantized gradients during the synchronization phase.
- (In the OpenDiLoCo article) Async synchronization.
They're also mentioning potential trustless systems where everyone can contribute compute, which would make this a truly decentralized open platform. Overall it'll be pretty interesting to see where this goes!
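For anyone wondering how the sync phase can tolerate nodes coming and going, here is a toy sketch of the general idea as I understand it: each worker trains locally for a while, sends back the difference between the shared weights and its own (a "pseudo-gradient"), and the shared weights move by the average over whoever showed up. It is a one-parameter model with hypothetical function names and a plain averaging outer step, so treat it as an illustration and not as the project's actual algorithm.

```python
def local_steps(w, target, lr=0.1, steps=10):
    """Inner loop: plain gradient descent on the worker's loss (w - target)^2."""
    for _ in range(steps):
        grad = 2.0 * (w - target)
        w -= lr * grad
    return w

def outer_round(w_global, targets, outer_lr=1.0):
    """One sync: average the pseudo-gradients of the workers present this round."""
    pseudo_grads = [w_global - local_steps(w_global, t) for t in targets]
    avg = sum(pseudo_grads) / len(pseudo_grads)
    return w_global - outer_lr * avg

w = 0.0
workers = [1.0, 2.0, 3.0]  # each worker's data pulls w toward a different target

for round_idx in range(20):
    if round_idx == 5:
        workers = workers[:-1]   # a node drops out mid-run
    if round_idx == 10:
        workers.append(3.0)      # a node with the same data rejoins
    w = outer_round(w, workers)

print(f"final shared weight: {w:.6f}")
```

Because the average is taken over however many workers are present, a dropout just changes the denominator for that round and nobody else has to stop. The hard parts in practice (stragglers, untrusted contributors, bandwidth) are exactly what this toy leaves out.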