I would agree if it weren't for the fact that extracting that volume of data from a properly secured corporate network should be hard. It should raise some flags if such a high volume of data is downloaded to a user's local machine from the training or production environments.
There are sooooo many exfil methods, including ones for air-gapped systems that are off-network.
Not at all beyond the capabilities of any of the top ~9 or so state actors.
Edit: To answer your question, very easily on the 20TB.
One crude method works well with a simple device: clone the monitor output and pass it through over HDMI. Then just cat the directory in encrypted chunks to something like a USB key connected to the passthrough. 4TB USB keys are out there. A week of that gets you 20TB.
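For a rough sense of scale, here's a quick back-of-the-envelope check of those figures in Python. It assumes decimal terabytes (1 TB = 10^12 bytes) and continuous copying for a full week; the variable names are just for illustration.

```python
# Back-of-the-envelope check of the 20TB / 4TB / one-week figures.
# Assumption: decimal terabytes, i.e. 1 TB = 10**12 bytes.
TOTAL_BYTES = 20 * 10**12            # 20TB, the total mentioned in the thread
DRIVE_BYTES = 4 * 10**12             # capacity of one 4TB USB key
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # assumption: copying runs nonstop

# Ceiling division: how many 4TB keys are needed to hold 20TB.
drives_needed = -(-TOTAL_BYTES // DRIVE_BYTES)

# Average throughput needed to move 20TB in exactly one week.
sustained_mb_per_s = TOTAL_BYTES / SECONDS_PER_WEEK / 10**6

print(f"drives needed: {drives_needed}")
print(f"sustained rate over one week: {sustained_mb_per_s:.1f} MB/s")
```

So the claim works out to five full keys and roughly 33 MB/s sustained around the clock. That is a steady, week-long transfer, which is exactly the kind of volume the earlier comment argues should raise flags.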
I have no proof one way or the other whether Anthropic or OpenAI have "properly secured corporate networks". Both seem like fast-changing places with lots of servers and workers. It seems most likely to me that someone somewhere made a mistake or missed something amid all the change, and that their networks are not 100% secure.
But even if their networks are secure, I think spies who are willing to coerce people, trick people, and go in person to data centers or offices could find a way to get those models and other things.