logoalt Hacker News

The woes of sanitizing SVGs

204 pointsby varun_chyesterday at 3:31 PM84 commentsview on HN

Comments

simonwyesterday at 4:40 PM

I'm glad this article includes the only credible fix for the HTTP leak problems: CSP.

A useful thing I learned recently is that, while CSP headers are usually set using HTTP headers, you can also reliably set them directly in HTML - for example for HTML generated directly on a page where HTTP headers don't come into play:

  <iframe sandbox="allow-scripts" srcdoc="
    <meta http-equiv='Content-Security-Policy'
        content='default-src none; script-src unsafe-inline; style-src unsafe-inline;'>
    <!-- untrusted content here -->
  "></iframe>
It feels like this shouldn't work, because JavaScript in the untrusted content could use the DOM to delete or alter that meta tag... but it turns out all modern browsers specifically lock that down, treating those CSP rules as permanent as soon as that meta tag has loaded before any malicious code has the chance to subvert them.

I had Claude Code run some experiments to help demonstrate this a few weeks ago: https://github.com/simonw/research/tree/main/test-csp-iframe...

show 4 replies
andybakyesterday at 3:55 PM

My first thought is "support a tiny subset of svg that probably still covers 90% of real-world use cases".

I do feel that's there's two distinct types of svg - "bunch of paths with fills" and "clever dangerous stuff" where most real SVGs are of the former type.

Fully expect this to be shot down by someone that's thought about this problem for longer than the 120 seconds I just spent. :)

show 14 replies
nmiloyesterday at 10:41 PM

I'm sorry because I love the scratch project but this has to be said: they found XSS in SVGs in a surface with attacker-controlled access to Node and their fix was sanitizing it using regex??? And this was discovered by a user on scratch?

Even worse, OP's latest post "Every version of Scratch is vulnerable to arbitrary code execution" just tells you how exactly to exploit something similar today in the current version with no mention of responsible disclosure except a plug to say, "hey, check out my project, this one doesn't have RCE!" This is so irresponsible it borders on malicious.

show 1 reply
evilpieyesterday at 5:22 PM

The HTML Sanitizer API has a subset of SVG that is allowed by the default configuration. It won't help you with sanitizing CSS at all however, style is simply not allowed by default.

https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...

https://developer.mozilla.org/en-US/docs/Web/API/HTML_Saniti...

show 1 reply
codedokodetoday at 4:25 AM

I don't like that SVG uses things like CSS and JS and requires pulling in the whole browser to display. Instead of being a simple vector image format, it became just an extension of HTML. Maybe we need a new format, and if someone decides to do it, please add ability to embed fonts, wrap text and decent animations.

philo23yesterday at 4:33 PM

It'd be nice if there was a sandbox attribute you could add to inline <svg> tags, like the <iframe sandbox> attribute that'd let you opt out of all the potentially "dynamic" stuff inside of an SVG like scripts and event handlers, or even just literally sandbox the entire thing from accessing the "parent" HTML page's context/cookies/etc just like an iframe.

I'm sure it'd just open up a whole other can of worms though... not to mention having to wait for browsers to actually support it.

The real solution here is definitely CSP + basic sanitisation though.

show 4 replies
ikkunyesterday at 4:27 PM

I do wish tinyVG or similar would take off, but I don't see that ever actually happening. the only thing I think it's missing is animation support, which is pretty niche but not as niche as <script> tags.

https://tinyvg.tech/

spankaleeyesterday at 3:47 PM

This is, by the way, why Google Slides doesn't have SVG support even though there's a nearly 15 year old ticket requesting the feature.

show 2 replies
djoldmantoday at 12:06 AM

Cloudflare went down this road a bit:

https://github.com/cloudflare/svg-hush

Springtimeyesterday at 4:14 PM

It seems the reason they're inlined in the page at all is to measure things briefly like bounding boxes (not sure the full extent as it didn't cover that), before subsequent removal. I'm not familiar with Scratch and its use of user-submitted SVGs but I'd be curious to read more about what they're doing that required it be inlined specifically.

(This isn't a comment on the challenges in proper sanitization fwiw, as I've needed to do various of the same things myself)

show 1 reply
bawolffyesterday at 7:05 PM

These aren't really SVG specific issues. They are all pretty standard XSS that apply to html and are very well known vectors.

Like this post didn't even mention presentational attributes, like how cursor attribute can contain a url that gets loaded. Or any of the other tricky parts of svg sanitization, like using dtd to bypass things.

Liftyeeyesterday at 9:01 PM

I'm not familiar with the details of real software development, so I don't know why it's not possible to just "not give the SVG part of the code internet access" or "perform sanitization on post-decoding (url, hex, etc) data".

Is it because the SVG parser/renderer being used is an entire library, and it would be prohibitive to write your own SVG parser/renderer or insert your own code into the existing one?

show 1 reply
kevinmgrangeryesterday at 5:22 PM

> This was fixed by using a regular expression to remove script tags.

The infamous you can't parse (X)HTML with regex¹ meme is from 2009, yet this fix was done in 2019. I guess the SO answer never mentioned SVG.

1: https://stackoverflow.com/revisions/1732454/1

jancsikayesterday at 5:11 PM

For the "<script>" stuff: regardless of how the thing is spelled or otherwise obscured, the HTML5 parser eventually knows when it's gotten hold of a script tag. Oops, we got one in a NOSCRIPTTAG context. Let's poop out.

Tag names, attributes, attribute values, event callback default-cancelers... so many ways to declare that this node and its children shouldn't parse/evaluate scripts.

As Jay-Z said: "I've got 99 solutions, fixing a problem ain't one"

etchalonyesterday at 4:41 PM

I don't understand why it wasn't immediately understood that SVG is as dangerous as HTML.

It is not, and never was, an image format. It's a markup language.

show 2 replies
NooneAtAll3yesterday at 5:51 PM

wait... scratch is just a browser?

show 1 reply
Theodoresyesterday at 6:22 PM

Maybe we need a dumbed down version 3 of SVG where the browser knows it is not to do anything that requires fetching a URL, to make the image as harmless as a JPG.

This version 3 could have the version number changed to 2 in order to do cool SVG things, so full-fat SVG as version 2 is now. But you could just flip to 2 to a 3 on upload, so any embedded URLs are harmless.

This could be useful for the creator too, as it is helpful to have layers of source images in bitmap format to work with, and you can easily export such things accidentally.

Devastayesterday at 6:13 PM

> In 2019, a few months after the initial release of Scratch 3, Scratch discovered that SVGs can contain <script> tags that Scratch would cause to be executed when the SVG loads. This is known as an XSS.

> Example from Scratch's test suite:

  <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
    "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
  <svg version="1.1" xmlns="http://www.w3.org/2000/svg">
    <circle cx="250" cy="250" r="50" fill="red" />
    <script type="text/javascript"><![CDATA[
        alert('from the svg!')
    ]]></script>
  </svg>


Is this really an issue? This is the method that the chrome teams polyfill to replace XSLT suggests you do. https://github.com/mfreed7/xslt_polyfill/tree/main#usage
show 1 reply
esafakyesterday at 4:18 PM

Is there a browser-friendly vector alternative?

show 1 reply
marlburrowtoday at 2:06 AM

[dead]

nengilyesterday at 3:55 PM

[flagged]

shaguoeryesterday at 4:59 PM

[flagged]

SpyCoder77yesterday at 4:53 PM

I did not expect to see GarboMuffin.