Allow colons in URLs after the first slash

A new regexp fixes this bug:
http://ikiwiki.info/bugs/No_link_for_blog_items_when_filename_contains_a_colon/

I traced this down to htmlscrubber. If disabled,
it works. If enabled, then $safe_url_regexp
determines the URL unsafe because of the colon and
hence removes the src attribute.

Digging into this, I find that RFC 3986 pretty
much discourages colons in filenames:

"""
A path segment that contains a colon character
(e.g., "this:that") cannot be used as the first
segment of a relative-path reference, as it would
be mistaken for a scheme name. Such a segment must
be preceded by a dot-segment (e.g., "./this:that")
to make a relative- path reference.
"""

on the other hand, with usedirs, any link to
another page will be prepended by ../ anyway, so
that makes them okay again.

The solution still seems not to use colons.

In any case, htmlscrubber should get a new regexp,
courtesy of dato.

I have tested and verified this.

Signed-off-by: martin f. krafft <madduck@madduck.net>
master
Adeodato Simó 2008-02-29 19:26:53 +01:00 committed by martin f. krafft
parent 12c5361c73
commit be0b4f603f
1 changed files with 1 additions and 1 deletions

View File

@ -31,7 +31,7 @@ sub import { #{{{
);
# data is a special case. Allow data:image/*, but
# disallow data:text/javascript and everything else.
$safe_url_regexp=qr/^(?:(?:$uri_schemes):|data:image\/|[^:]+$)/i;
$safe_url_regexp=qr/^(?:(?:$uri_schemes):|data:image\/|[^:]+(?:$|\/))/i;
} # }}}
sub sanitize (@) { #{{{