> 1. For most people using multiple SCMs is just a huge and easily-avoidable mistake. Most people can just mandate a single SCM for a project and then all these problems are moot.
You talk about SCMs, we're talking about VCSs. Where it's not just source code under control, or even source code with a handful of binary assets. Imagine dealing with a VCS that has to handle 15 years of history and a few petabytes of binary assets. Or individual files that are multiple gigabytes and change several times per day. Can git do that gracefully just by itself? Or SVN? Even Perforce struggled with something like that back in the day.
>Very unclear why anyone needs this. There's no standard way to code-review a binary diff (it depends what the blob is that you're diffing) so how would it help if you had this standard way to represent the diff?
A standard way of handling the binary data doesn't mean understanding the binary data. You can leave that up to specific tools. What you need, though, is a way to package up and describe those binary diffs well enough that you can transport the diff data and pick the right tool to show you the actual differences.
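To make that concrete: a minimal sketch of what such an envelope could look like. Everything here is hypothetical (the field names, the framing, the function names are my invention, not any existing standard); the point is just that the envelope describes the diff without interpreting it, so a client can route the opaque payload to the right external tool.

```python
import json
import struct

def pack_diff(payload: bytes, media_type: str, base_hash: str, target_hash: str) -> bytes:
    """Wrap an opaque binary diff in a self-describing envelope.

    The envelope never interprets the payload; it only records what kind
    of blob it is (media_type) and which two revisions it connects, so a
    receiver can transport it and later dispatch a type-specific tool
    (an image differ, a PSD viewer, ...) to render the actual change.
    """
    header = json.dumps({
        "media_type": media_type,  # e.g. "image/vnd.adobe.photoshop" -- picks the viewer
        "base": base_hash,         # content hash of the old revision
        "target": target_hash,     # content hash of the new revision
        "length": len(payload),    # payload size in bytes
    }).encode("utf-8")
    # Framing: 4-byte big-endian header length, then JSON header, then raw diff bytes.
    return struct.pack(">I", len(header)) + header + payload

def unpack_diff(envelope: bytes):
    """Split an envelope back into (metadata dict, opaque payload bytes)."""
    (hlen,) = struct.unpack(">I", envelope[:4])
    meta = json.loads(envelope[4:4 + hlen])
    payload = envelope[4 + hlen:4 + hlen + meta["length"]]
    return meta, payload
```

The VCS on either end only needs to understand the framing; everything inside `payload` stays the problem of whatever tool claims that media type.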
> This goes away if you just use one SCM for a project which you should anyway for everyone's sanity.
And if wishes were fishes, I'd never be hungry again. If you have a lot of history, a lot of data, and a lot of workflows and tools built up around multiple VCSs, then consolidating onto just one VCS is going to be a massive undertaking. And not every VCS can handle all of the kinds of data that might get checked into it. Some are going to be good at text data, some might handle binary assets better. Some might have a commit model that makes sense for one type of workflow but not for another.

For example, you might be dealing with binary assets where only one person can work on a specific file at a time, because there's no real way to merge changes from multiple people, so they need to lock it. For text assets, though, you might be able to have multiple people work on a file concurrently. To support both workflows, your VCS now needs to not only support both locking modes, but be aware of the specific content to know which kind of locking to apply to which files.
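For what that content-aware locking looks like in practice, Git LFS's file locking is a reasonable illustration: a `.gitattributes` file marks which paths are lockable binaries and which are ordinary mergeable text (the specific patterns below are just an example, not from the discussion above):

```
# Binary art assets: stored via LFS, never textually merged, must be locked to edit
*.psd filter=lfs diff=lfs merge=lfs -text lockable
*.fbx filter=lfs diff=lfs merge=lfs -text lockable

# Plain source files: normal concurrent editing and three-way merging
*.c text
*.h text
```

Even here, note that the knowledge of which files need which mode has to be maintained by hand, per pattern; the VCS doesn't infer it from the content itself.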
The world doesn't always fit into the nice little models that the most popular VCSs provide. So if you're trying to not limit your product to supporting just those handful of popular VCSs, you can't just assume everything will fit into one of those models.
> Imagine dealing with a VCS that has to handle 15 years and a few petabytes of binary assets.
Part of the problem is that you're fabricating imaginary problems that no one is actually experiencing, just to argue that the solution to these imaginary problems is a file format.
Does this sound reasonable to you?