6 Commits (82649be475f0d3459ea09d4fe6d65dfe212f0a80)

Author SHA1 Message Date
chrysn 714de9f101 response 2008-09-28 05:47:20 -04:00
Joey Hess 83429adf04 decode utf-8 in recentchanges_link parameter 2008-09-26 15:40:01 -04:00
chrysn 1d12dfc57c update: possible solution 2008-09-26 12:44:09 -04:00
chrysn 947a50fe5d some problem remains 2008-09-26 09:05:01 -04:00
Joey Hess 965f7310fe git: Fix handling of utf-8 filenames in recentchanges.
Seems that the problem is that once the \nnn coming from git is converted
to a single character, decode_utf8 decides that this is a standalone
character, and not part of a multibyte utf-8 sequence, and so does nothing.
I tried playing with the utf-8 flag, but that didn't work. Instead, use
decode("utf8"), which doesn't have the same qualms, and successfully
decodes the octets into a utf-8 character.

Rant:

Think for a minute about the fact that any and every program that parses git-log,
or git-show, etc. output to figure out what files were in a commit needs to
contain this snippet of code, to convert from git-log's wacky output to a
regular character set:

if ($file =~ m/^"(.*)"$/) {
	($file=$1) =~ s/\\([0-7]{1,3})/chr(oct($1))/eg;
}

(And it's only that "simple" if you don't care about filenames with
embedded \n or \t or other control characters.)
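
For comparison, here is a rough sketch of what the fuller version might look like once control characters are taken into account. It is written in Python rather than Perl, and it is not the code from this commit. The helper name unquote_git_path, the escape table, and the choice of surrogateescape for undecodable bytes are all assumptions made for illustration. It assumes the path field arrives as raw bytes, that octal escapes never exceed \377, and that the single-letter escapes are the usual C ones.

```python
import re

# Single-letter C-style escapes assumed to appear inside git's quoted paths.
SIMPLE_ESCAPES = {
    b"a": b"\a", b"b": b"\b", b"t": b"\t", b"n": b"\n",
    b"v": b"\v", b"f": b"\f", b"r": b"\r",
    b'"': b'"', b"\\": b"\\",
}

# A backslash followed by an octal byte value (at most \377) or any one byte.
ESCAPE_RE = re.compile(rb'\\([0-3][0-7]{2}|[0-7]{1,2}|.)', re.DOTALL)


def unquote_git_path(raw: bytes) -> str:
    """Turn one path field from git output into text (hypothetical helper)."""
    if len(raw) >= 2 and raw.startswith(b'"') and raw.endswith(b'"'):
        def replace(match):
            token = match.group(1)
            if token[:1] in b"01234567":
                # Octal escape: emit the single octet it stands for.
                return bytes([int(token, 8)])
            # Known letter escape, or leave an unknown escape untouched.
            return SIMPLE_ESCAPES.get(token, match.group(0))
        raw = ESCAPE_RE.sub(replace, raw[1:-1])
    # Decode only after all octets are back in place, so multibyte utf-8
    # sequences are seen whole; invalid bytes survive as surrogate escapes.
    return raw.decode("utf-8", errors="surrogateescape")
```

With this sketch, unquote_git_path(b'"caf\\303\\251.mdwn"') returns 'café.mdwn', and an unquoted field such as b"plain.txt" is decoded unchanged. The ordering matters: all escapes are turned back into octets first, and the utf-8 decode happens once at the end on the complete byte string.
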

Does that strike anyone else as putting the parsing and conversion in the
wrong place (i.e., in gitweb, ikiwiki, etc.)? Doesn't anyone who actually
uses git with utf-8 filenames get a bit pissed off at seeing \xxx\xxx
instead of the utf-8 in git-commit and other output?
2008-09-25 18:26:42 -04:00
chrysn 88beb46def bug report on funny characters in the name 2008-09-23 04:58:35 -04:00