This is a bit of a wider discussion, but how do you all feel about letting a program use your computer to do whatever it wants without your knowledge? I know LLMs aren't overly capable right now, but if you applied this same mindset to an AGI, you'd probably very quickly run into paperclip-maximizer problems where it starts hacking into other systems or similar. It's akin to running experiments on contagious bacteria in your backyard: not something your neighbors would appreciate.
Programs can’t want things; it’s no different from running any other program as your user.
Try asking the latest Claude models about self-replicating software and see what happens...
(GPT recently changed its attitude on this subject too, which is very interesting.)
The most interesting part is that you will be given the option to downgrade the conversation to an older model, implying that there was a step change in capability on this front in recent months.
The point of TFA is that you are not letting it do whatever it wants; you are restricting it to just the subset of files and capabilities that you mount into the VM.
I run mine in a Docker container, and they get read-only access to most things; roughly the setup sketched below.
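A minimal sketch of that kind of setup, assuming a hypothetical agent image named `my-agent-image` and placeholder host paths (none of this is a specific tool's invocation): the project tree is mounted read-only, a single scratch directory is writable, and the network is cut off entirely.

```sh
# Hypothetical image name and host paths; adjust for your own setup.
# --network none blocks all network access; the :ro suffix makes a bind
# mount read-only, so the agent can read the code but not modify it.
docker run --rm \
  --network none \
  -v "$HOME/projects:/workspace:ro" \
  -v "$HOME/agent-scratch:/scratch" \
  my-agent-image
```

Anything the agent writes ends up under the one writable mount, so the blast radius is confined to a directory you can inspect and delete afterwards.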
Don't you have the same issue when you hire an employee and give them access to your systems? If the AI seems capable of avoiding harm and motivated to avoid it, then the risk of giving it access is probably outweighed by the expected benefit. Employees are also paperclip maximizers in a sense: they want to make as much money as possible. By that measure, the AI may actually be more aligned with my goals than a potential employee.