Hacker News

chriskanan · last Thursday at 8:05 PM · 7 replies

The most salient thing in the document is that it puts export controls on releasing the weights of models trained with more than 10^26 operations. There may be some errors in my math, but I think that corresponds to training a model on over 70,000 H100s for a month.
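
A rough sanity check of that estimate (a sketch only; the H100 peak throughput and the utilization figure are my assumptions, not from the rule):

```python
# Back-of-envelope: does 70,000 H100s for a month reach ~10^26 operations?
# Assumed figures: H100 dense BF16 peak ~989 TFLOP/s, ~50% utilization (MFU).
peak_flops_per_gpu = 989e12      # FLOP/s, approximate H100 dense BF16 peak
utilization = 0.5                # assumed fraction of peak actually achieved
num_gpus = 70_000
seconds_per_month = 30 * 24 * 3600

total_ops = peak_flops_per_gpu * utilization * num_gpus * seconds_per_month
print(f"{total_ops:.1e}")        # ~9.0e+25, i.e. on the order of 10^26
```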

I personally think the regulation is misguided, as it assumes we won't identify better algorithms or architectures. There's no reason to assume that a given level of training compute is what produces the capabilities the rule is worried about.

Moreover, given the current emphasis on test-time compute, and given that many companies seem to have hit a wall trying to scale LLMs at train time, I don't think this regulation is especially meaningful.


Replies

parsimo2010 · last Thursday at 8:59 PM

Traditional export controls are applied to advanced hardware because the US doesn't want its adversaries to have access to things that erode the US military advantage. But most hardware is only controlled at the high end of the market; once a technology is commoditized, the low-end stuff is usually widely proliferated. Night vision goggles are an example: only the latest-generation technology is controlled, and low-end gear can be bought online and shipped worldwide.

Applying this to your thoughts about AI: as training efficiency improves, the ability to train models becomes commoditized, and those models would no longer be considered advantageous and would not need to be controlled. So setting the export control based on the number of operations may be a good idea; it naturally allows efficiently trained models to be exported, since they wouldn't be hard to train in other countries anyway.

As computing power scales, the 10^26 limit may need to be revised, but setting the limit based on the scale of training is a good idea because it's actually measurable. You couldn't realistically set the limit based on the capability of the model, since benchmarks seem to become irrelevant every few months due to contamination.
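
For a sense of what model scale that threshold corresponds to, here's a sketch using the common 6·N·D approximation for dense-transformer training FLOPs (N = parameters, D = training tokens); the model sizes and token counts below are illustrative assumptions, not from the rule:

```python
# Rough training-compute estimate via the common 6*N*D approximation
# (dense transformer, forward + backward pass). Illustrative numbers only.
def training_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

# A ~70B-parameter model trained on 15T tokens stays well under 10^26...
print(f"{training_flops(70e9, 15e12):.1e}")   # ~6.3e+24

# ...while a ~1T-parameter model on 15T tokens sits right around it.
print(f"{training_flops(1e12, 15e12):.1e}")   # ~9.0e+25
```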

thorum · last Thursday at 9:18 PM

The practical problem I see is that unless US AI labs have perfect security (against both cyber attacks and physical espionage), which they don’t, there is no way to prevent foreign intelligence agencies from just stealing the weights whenever they want.

etiam · last Thursday at 9:04 PM

Some artificial pressure to use more efficient algorithms could be nice, though. The current game of just throwing in more data centers and power plants may be convenient for those who can afford it, but it's also intellectually embarrassing.

logicchains · yesterday at 6:37 AM

> The most salient thing in the document is that it puts export controls on releasing the weights of models trained with more than 10^26 operations.

Does this affect open source? If so, it'll be absolutely disastrous for the US in the longer term: eventually China will be able to train open-weights models with more than that many operations, and everyone using open-weights models will switch to Chinese ones, since they won't be artificially gimped like the US-aligned ones. China already has the best open-weights models available, and regulation like this will only extend that advantage.

permo-w · last Thursday at 9:52 PM

This is like saying that regulating automatic weapons is misguided because someone might invent a gun that's equally dangerous without being automatic.

HeatrayEnjoyer · last Thursday at 8:57 PM

We can't let the perfect be the enemy of the good; regulations can be updated. Capping FLOPs is a decent starter regulation.
