At this point I'd expect any camera ASIC to be able to incorporate this logic and still process plenty fast, or else to apply it when writing out the image file, after the image has been acquired to a buffer.
Your raw-image idea is interesting. I'm curious how the arrangement of the photosites would play into this.