The new git-notes feature in git 1.6.6 changes git log output in a way that
broke ikiwiki's parser if notes are added to commits.
I decided to deal with this by disabling notes when ikiwiki uses git,
by setting GIT_NOTES_REF="". AFAICS, looking up notes when dumping logs
will only waste time, since it does not currently seem to make sense for
ikiwiki to do anything with the notes.
git log --follow seems to sometimes show merges from before the file was
ever created. So, skip them, a file shouldn't be first created during a
merge anyway.
This will be a bit more expensive, but --getctime does not need to be fast.
And getting the real creation time a very useful when untangling blog
histories that involve renames.
Loading and use of IkiWiki::Receive can all be pushed into the git plugin,
rather than scattered around.
I had at first wanted to make a receive plugin and move it there,
but a plugin was not a good fit; you don't want users to have to manually
load it, and making the git plugin load the receive plugin at the right
times would need more, and ugly code.
I guess what's happening here is that since the name
is passed to git via an environment variable, perl's normal
utf-8 IO layer stuff doesn't work. So we have to explicitly
decode the string from perl's internal representation into
utf-8.
Seems that the problem is that once the \nnn coming from git is converted
to a single character, decode_utf8 decides that this is a standalone
character, and not part of a multibyte utf-8 sequence, and so does nothing.
I tried playing with the utf-8 flag, but that didn't work. Instead, use
decode("utf8"), which doesn't have the same qualms, and successfully
decodes the octets into a utf-8 character.
Rant:
Think for a minute about fact that any and every program that parses git-log,
or git-show, etc output to figure out what files were in a commit needs to
contain this snippet of code, to convert from git-log's wacky output to a
regular character set:
if ($file =~ m/^"(.*)"$/) {
($file=$1) =~ s/\\([0-7]{1,3})/chr(oct($1))/eg;
}
(And it's only that "simple" if you don't care about filenames with
embedded \n or \t or other control characters.)
Does that strike anyone else as putting the parsing and conversion in the
wrong place (ie, in gitweb, ikiwiki, etc, etc)? Doesn't anyone who actually
uses git with utf-8 filenames get a bit pissed off at seeing \xxx\xxx
instead of the utf-8 in git-commit and other output?
Move rcs plugin load to loadplugins; move duplicate rcs detection logic out
of individual plugins and into loadplugins. Avoids checkconfig failing when
run twice.