ikiwiki/doc/htmlsanitization.mdwn

31 lines
1.9 KiB
Markdown

When run with the --sanitize switch, which is turned on by default (see
[[usage]]), ikiwiki sanitizes the html on pages it renders to avoid XSS
attacks and the like.
ikiwiki excludes all html tags and attributes except for those that are
whitelisted using the same lists as used by Mark Pilgrim's Universal Feed
Parser, documented at <http://feedparser.org/docs/html-sanitization.html>.
Notably it strips `style`, `link`, and the `style` attribute.
ikiwiki uses the HTML::Scrubber perl module to perform its html
sanitisation, and this perl module also deals with various entity encoding
tricks.
While I beleive that this makes ikiwiki as resistant to malicious html
content as anything else on the web, I cannot guarantee that it will
actually protect every user of every browser from every browser security
hole, badly designed feature, etc. I can provide NO WARRANTY, like it says
in ikiwiki's [GPL](GPL) license.
The web's security model is *fundamentally broken*; ikiwiki's html
sanitisation is only a patch on the underlying gaping hole that is your web
browser.
----
Some examples of embedded javascript that won't be let through.
<span style="background: url(javascript:window.location='http://example.org/')">test</span>
<span style="&#x61;&#x6e;&#x79;&#x3a;&#x20;&#x65;&#x78;&#x70;&#x72;&#x65;&#x73;&#x73;&#x69;&#x6f;&#x6e;&#x28;&#x77;&#x69;&#x6e;&#x64;&#x6f;&#x77;&#x2e;&#x6c;&#x6f;&#x63;&#x61;&#x74;&#x69;&#x6f;&#x6e;&#x3d;&#x27;&#x68;&#x74;&#x74;&#x70;&#x3a;&#x2f;&#x2f;&#x65;&#x78;&#x61;&#x6d;&#x70;&#x6c;&#x65;&#x2e;&#x6f;&#x72;&#x67;&#x2f;&#x27;&#x29;">test</span>
<span style="&#97;&#110;&#121;&#58;&#32;&#101;&#120;&#112;&#114;&#101;&#115;&#115;&#105;&#111;&#110;&#40;&#119;&#105;&#110;&#100;&#111;&#119;&#46;&#108;&#111;&#99;&#97;&#116;&#105;&#111;&#110;&#61;&#39;&#104;&#116;&#116;&#112;&#58;&#47;&#47;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#111;&#114;&#103;&#47;&#39;&#41;">test</span>