check for invalid utf-8, and toss it back to avoid crashes

Since ikiwiki uses open :utf8, perl assumes that files contain valid utf-8.
If it turns out to be malformed it may later crash while processing strings
read from them, with 'Malformed UTF-8 character (fatal)'.

As at least a quick fix, use utf8::valid as soon as data is read, and if
it's not valid, call encode_utf8 on the string, thus clearing the utf-8
flag. This may cause follow-on encoding problems, but will avoid this
crash, and the input file was broken anyway, so GIGO is a reasonable
response. (I looked at calling decode_utf8 afterwards, but it seemed to
cause more trouble than it was worth. BTW, using open ':encoding(utf8)'
avoids this problem, but the corrupted data later causes Storable to crash
when writing the index.)

This is a quick fix, clearly imperfect:
- It might be better to explicitly call decode_utf8 when reading files,
  rather than using the IO layer.
- Data read other than by readfile() can still sneak in bad utf-8. While
  ikiwiki does very little file input not using it, stdin for the CGI
  would be one way.
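
A rough sketch of the first alternative above (decoding explicitly instead
of relying on the IO layer) might look like the following. The helper name
read_utf8_lenient is hypothetical and not part of ikiwiki. It assumes that
replacing malformed sequences with U+FFFD is acceptable, which is what
Encode's decode does with its default CHECK value.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not ikiwiki code: read raw bytes, then decode
# explicitly rather than trusting the :utf8 IO layer.
sub read_utf8_lenient {
	my $file=shift;

	open(my $in, '<:raw', $file) || die "failed to read $file: $!";
	local $/=undef; # slurp the whole file
	my $bytes=<$in>;
	close($in) || die "failed to read $file: $!";

	# With CHECK set to FB_DEFAULT, malformed sequences are replaced
	# by U+FFFD instead of raising an error.
	return Encode::decode('UTF-8', $bytes, Encode::FB_DEFAULT);
}
```

The returned string is always well-formed, at the cost of silently altering
broken input, so whether this beats the GIGO approach is a judgment call.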
Joey Hess 2008-11-12 17:19:41 -05:00
parent 2c858c9c95
commit 716560b7f1
3 changed files with 19 additions and 0 deletions


@@ -721,6 +721,10 @@ sub readfile ($;$$) { #{{{
	binmode($in) if ($binary);
	return \*$in if $wantfd;
	my $ret=<$in>;
	# check for invalid utf-8, and toss it back to avoid crashes
	if (! utf8::valid($ret)) {
		$ret=encode_utf8($ret);
	}
	close $in || error("failed to read $file: $!");
	return $ret;
} #}}}

debian/changelog vendored

@@ -1,3 +1,9 @@
ikiwiki (2.70) UNRELEASED; urgency=low
* Avoid crash on malformed utf-8 discovered by intrigeri.
-- Joey Hess <joeyh@debian.org> Wed, 12 Nov 2008 17:30:33 -0500
ikiwiki (2.69) unstable; urgency=low
* Avoid multiple ikiwiki cgi processes piling up, eating all memory,


@@ -407,3 +407,12 @@ discovered on 30 May 2008 and fixed the same day. ([[!cve CVE-2008-0169]])
I recommend upgrading to 2.48 immediately if your wiki allows both password
and openid logins.
## Malformed UTF-8 DOS
Feeding ikiwiki page sources containing certain forms of malformed UTF-8
can cause it to crash. This can potentially be used for a denial of service
attack.
intrigeri discovered this problem on 12 Nov 2008 and a patch was put in
place later that day.