- Migrate the set of deletions to the {autofile} set, since it has
more or less the same effect. This affects the "deleted" case in the
test.
- If a page has just been deleted, add it as an autofile anyway: by
the time gen_autofile is called, it'll be in the list of deleted files,
so it'll just be added to {autofile}. This affects the "gone" case
in the test.
- Behaviour change: we don't forget that a page with no reason to be
re-created was deleted. This affects the 'expunged' and 'reinstated'
cases in the test.
This does cause a minor regression: index pages are now committed
individually rather than being a single commit per rebuild.
This also means the autoindex regression test needs to trigger the
autofile generation pass.
Using named parameters for these is overdue. Passing the session in a
parameter instead of passing username and IP separately will later allow
storing other session info, like username or part of the email.
Note that these functions are not part of the exported API,
and the prototype change will catch (most) skew, so I am not changing
API versions. Any third-party plugins that call them will need updated
though.
A short story:
Once there was a unicode string, let's call him Srcdir.
Along came a crufy old File::Find, who went through a tree and pasted each
of the leaves in turn onto Srcdir. But this 90's relic didn't decode the
leaves -- despite some of them using unicode! Poor Srcdir, with these
leaves stuck on him, tainted them with his nice unicode-ness. They didn't
look like leaves at all, but instead garbage.
(In other words, perl's unicode support sucks mightily, and drives
us all to drink and bad storytelling. But we knew that..)
So, srcdir is not normally flagged as unicode, because typically it's pure
ascii. And in that case, things work ok; File::Find finds filenames, which
are not yet decoded to unicode, and appends them to the srcdir, and then
decode_utf8 happily converts the whole thing.
But, if the srcdir does contain utf8 characters, that breaks. Or, if a Yaml
setup file is used, Yaml::Syck's implicitunicode sets the unicode flag of
*all* strings, even those containing only ascii. In either case, srcdir
has the unicode flag set; a non-decoded filename is appended, and the flag
remains set; and decode_utf8 sees the flag and does *nothing*. The result
is that the filename is not decoded, so looks valid and gets skipped.
File::Find only sticks the directory and filenames together in no_chdir
mode .. but we need that mode for security. In order to retain the
security, and avoid the problem, I made it not pass srcdir to File::Find.
Instead, chdir to the srcdir, and pass ".". Since "." is ascii, the problem
is avoided.
Note that chdir srcdir is safe because we check for symlinks in the srcdir
path.
Note that it takes care to chdir back to the starting location. Because
the user may have specified relative paths and so staying in the srcdir
might break. A relative path could even be specifed for an underlay dir, so
it chdirs back after each.
I considered not checking them in, or making the checkin configurable.
However, then they would remain not checked in if edited by a user, which is
probably not desired.
Note that passing undef as the username/ip to rcs_commit_staged may not
result in ideal behavior; the commit may seem to come from "anonymous" with
some revision control systems. Most of them handle it a bit better and just
have it come from whatever user is running the build.