It's funny seeing this play out, because in my personal life, any time I share a sensitive document where someone needs to see part of it but not the rest, I first block out the text I don't want them to see (covering it, using a redacting highlighter, etc.), then screenshot the page and turn that image into a PDF.
I always felt paranoid (without any real evidence, just a guess) that anything done in software could be reversed somehow.
I learned that a long time ago as a student, when I wanted to submit a PDF generated by a trial version of some software as an assignment and tried to be clever by covering the "unregistered" watermark with a white box.
When I opened the file on my slow computer, I could watch the watermark render in slow motion until the white box popped up on top of the text.
I'll just send an image and not bother with a PDF.
(Note that a PDF also carries other metadata, which you may not want your recipient to see either.)
Maybe the person tasked with the redacting didn't agree so they chose the worst possible way to do it.
It's absolutely bewildering how ridiculous everything has been so far in terms of competence, and this really is the cherry on top, near Christmas too.
How much lower can they go?!
Personally, I only trust an image manipulation tool to put down solid-colored blocks, or anything else where the redacted pixel's value does not depend on the source pixels. Formats like PDF are just too complicated to trust.
And even being this careful, if the opacity is slightly off, it could be undone.
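To make the opacity point concrete, here is a minimal sketch in plain Python. It assumes ordinary alpha compositing on a single row of grayscale values; the function names and the sample row are made up for illustration and do not come from any particular image editor.

```python
from __future__ import annotations


def blend(src: int, overlay: int, alpha: float) -> int:
    """Composite an overlay value onto a source value (0-255) at the given opacity."""
    return round(alpha * overlay + (1.0 - alpha) * src)


def redact(pixels: list[int], alpha: float, overlay: int = 0) -> list[int]:
    """Cover every pixel in a row with an overlay (black by default)."""
    return [blend(p, overlay, alpha) for p in pixels]


def stretch_contrast(pixels: list[int]) -> list[int]:
    """Rescale so the darkest value becomes 0 and the brightest 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0] * len(pixels)  # flat region: nothing left to recover
    return [round(255 * (p - lo) / (hi - lo)) for p in pixels]


# Hypothetical row of pixels: 255 = paper, 0 = ink.
secret = [255, 0, 255, 255, 0, 0, 255, 0]

opaque = redact(secret, alpha=1.0)   # fully opaque black box
almost = redact(secret, alpha=0.9)   # looks black, but is 10% transparent

recovered = stretch_contrast(almost)
print(opaque)     # every value is identical, whatever the source was
print(recovered)  # the original pattern comes back after a contrast stretch
```

With a fully opaque block the output no longer depends on the source at all, which is the property described above. At 90% opacity the block still looks black to the eye, but a simple contrast stretch brings the pattern back.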
The one that was crazy to me is undoing a blur effect (based on its algorithm), so yeah, I also layer and then screenshot.
This is what I do when sharing such images. I crop out those parts first and then take another screenshot. I do not even risk painting over them and then taking another screenshot. I have been doing this forever.
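In practical terms, a more convenient way to achieve this is just printing the document to a PDF, which, depending on the tool, flattens or rasterises the visible layer into what the printer would see. Most PDF tools support this, though it is worth checking afterwards that the covered text can no longer be selected.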
I then convert the image to grayscale only. Then I apply a filter so that only 16 colors are used. And I then adjust brightness/contrast so that "white is really white". It's all scripted: "screenshot to PDF". One of my oldest shell scripts.
16 shades of grey (not 50) is plenty enough for text to still be smooth.
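For anyone curious what those three steps look like per pixel, here is a rough sketch in plain Python. It is not the shell script mentioned above: the BT.601 luma weights, the `white_point` threshold of 200, and all function names are assumptions picked for illustration.

```python
from __future__ import annotations


def to_gray(rgb: tuple[int, int, int]) -> int:
    """Convert an RGB pixel to a 0-255 gray value (BT.601 luma weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def clean_pixel(rgb: tuple[int, int, int],
                white_point: int = 200,
                levels: int = 16) -> int:
    """Grayscale, push near-white to pure white, then snap to `levels` shades."""
    gray = to_gray(rgb)
    # Anything at or above white_point becomes pure white.
    scaled = min(gray, white_point) / white_point   # 0.0 .. 1.0
    step = round(scaled * (levels - 1))             # 0 .. levels - 1
    return round(step * 255 / (levels - 1))


def clean_image(rows: list[list[tuple[int, int, int]]]) -> list[list[int]]:
    """Apply clean_pixel to every pixel of an image stored as rows of RGB tuples."""
    return [[clean_pixel(px) for px in row] for row in rows]


# Hypothetical 1x3 scan: off-white paper, mid-gray smudge, black ink.
scan = [[(230, 228, 220), (128, 128, 128), (0, 0, 0)]]
print(clean_image(scan))
```

Off-white paper ends up as pure white and ink stays black, so faint show-through and scanner noise near the paper color get flattened out. Anything darker than the threshold still survives, just quantized.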
I do it for several reasons. One of them is that I often take handwritten notes on official documents (which infuriates my wife, btw), but then sometimes I need to scan the documents and send them (local IRS / notary / bank / whatever). So I just scan, then fill white rectangles over my handwritten notes. Another reason is double-sided printing: if the paper is thin or the ink is thick, the other side sometimes shows through in the scan.
I wonder how that'd work vs adversarial inputs: never really thought about it.
If it's not done properly, and at any point in the chain you put black blocks on an already compressed image (and PDFs do compress their internal images), you leak some bits of information in the shadow cast by the compression algorithm. (Self-plug: https://github.com/unrealwill/jpguncrop )
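A toy illustration of that shadow, in plain Python. This is not how jpguncrop works internally, just a one-dimensional stand-in for JPEG-style block coding under simplifying assumptions: a single 8-sample block, an orthonormal DCT, and one uniform quantization step `q` (a made-up value). All names are hypothetical.

```python
import math

N = 8  # block size, as in JPEG's 8x8 blocks (one dimension only here)


def _scale(k):
    """Orthonormal DCT scaling factor for coefficient k."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(block):
    """Forward DCT-II of an 8-sample block."""
    return [_scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                            for n, x in enumerate(block))
            for k in range(N)]


def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
                for k, c in enumerate(coeffs))
            for n in range(N)]


def lossy(block, q=40):
    """Compress and decompress: quantize DCT coefficients, then clamp to 0-255."""
    coeffs = [round(c / q) * q for c in dct(block)]
    return [max(0, min(255, round(v))) for v in idct(coeffs)]


def redact_left(block):
    """Black out the first four samples with a fully opaque block."""
    return [0] * 4 + block[4:]


background = [255, 255, 255, 255]   # plain paper next to the secret
secret_a = [0, 255, 0, 255]         # two hypothetical secrets
secret_b = [255, 255, 0, 0]

# Compress first, redact afterwards: the visible half still depends on the secret.
leak_a = redact_left(lossy(secret_a + background))[4:]
leak_b = redact_left(lossy(secret_b + background))[4:]
print(leak_a, leak_b)

# Redact first: both secrets give exactly the same block, so nothing leaks.
safe_a = redact_left(secret_a + background)
safe_b = redact_left(secret_b + background)
print(safe_a == safe_b)
```

The visible half was identical plain paper in both cases before compression. Afterwards it carries slightly different ringing depending on what sat next to it in the same block, and those differences are the leaked bits.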