When building ikiwiki from a tarball, the mtime (conceptually, the
last modification date of the file) is preserved by tar, but the inode
change time (creation/metadata-change date of *this copy* of the file)
is not. This seems to lead to unstable sort ordering and
unreproducible builds.
The page can't possibly have been modified before it was created, so
we can assume that the modification date is an upper bound for the
creation date.
This doesn't prevent memory from being used to track what we have
and haven't scanned, but it does make it temporary.
This only applies to rebuilds, as a way to avoid breaking the
templatebody plugin, unlike the earlier version of this optimization.
This reverts commit c04a26f3e7, which
turns out to break the templatebody directive: readtemplate() relies
on scan() populating %templates, but if scan() is a no-op after
leaving the scan phase, we can't rely on that.
The assumption made by skipping scan() after the end of the render phase
is that everything that comes from a scan is already in the index.
However, we don't really want to put template bodies in the index:
that would force us to load and save them on every refresh, and
redundantly persist them to disk.
Test-case:
% make clean
% ./Makefile.PL
% make
% grep -E '<div class="notebox">|Use this template to' html/sandbox.html
% touch doc/sandbox/New_blog_entry.mdwn # sandbox inlines this
% make
% grep -E '<div class="notebox">|Use this template to' html/sandbox.html
Good result: html/sandbox.html contains <div class="notebox"> both times
Bad result: html/sandbox.html contains "Use this template to..." the
second time
Mobile browsers typically assume that arbitrary web pages are
designed for a "desktop-sized" browser window (around 1000px)
and display that layout, zoomed out, in order to avoid breaking
naive designs that assume nobody will ever look at a website on
a phone or something. People who are actually doing "responsive
design" need to opt-in to mobile browsers rendering it at a
more normal size.
srcfile_stat got called on a file from the underlay that no longer existed.
I am not 100% sure of the circumstances of that; I was able to reproduce
the bug but neglected to snapshot the tree, and then accidentially
got it to stop crashing. I know that a transient tag page got deleted using
the web interface to trigger the crash.
It seems that process_changed_files must have returned the file, despite it
being deleted. And since the file was not checked into git, it seems it
must have not been included in @IkiWiki::underlayfiles, which would have
caused process_changed_files to not return it.
I do not know why a transient tag page would not be in
@IkiWiki::underlayfiles. There is a bug here that I don't understand.
This is just a workaround -- run srcfile_stat such that it won't crash,
and if it is unable to stat a file, find_changed knows it's not changed,
so it's ok to skip it.
Also made find_new_files run srcfile_stat such that it won't crash, just
because I was there.
This doesn't prevent memory from being used to track what we have
and haven't scanned, but it does make it temporary. The existing
%rendered hash, which is filled afterwards, will be larger than %scanned
in practice anyway: %scanned will contain an entry for each page
that changed, plus an entry for each template used by templatebody,
whereas %rendered will contain an entry for each page that changed
plus an entry for each page rendered due to links or dependencies.
In the scan phase, it's too early to match pagespecs or sort pages;
in the render phase, both of those are OK.
It would be possible to add phases later, renumbering them if necessary
to maintain numerical order.
If it does nothing when a page has already been scanned, we can use it
at any time to force a page to be scanned. In particular, the
templatebody plugin is going to need this.
I saw this happen with calendar, when it wanted to update a page, that
had a calendar on it, but the page had just been deleted. This caused
srcfile_stat to crash.
Previously, prune("wiki/srcdir/sandbox/test.mdwn") could delete srcdir
or even wiki, if they happened to be empty. This is rarely what you
want: there's usually some base directory (destdir, srcdir, transientdir
or another subdirectory of wikistatedir) beyond which you do not want to
delete.
This is needed for the po plugin vs. e.g. meta titles.
In order to get rid of the ugly "rebuilding all pages to fix meta titles" thing,
Joey suggested to make "po, at scan time, re-run the scan hooks, passing them
modified content (either converted from po to mdwn or with the escaped stuff
cheaply de-escaped)". This would unfortunately not work, as the meta plugin
gathers its data using the preprocess hook in scan mode: it would overwrite with
buggy data the correct data we would have forced it to gather in po's scan hook.
We then need a hook that runs *after* the preprocess hook has been run in scan
mode, but *before* any page rendering is started. Hence this one.
There was some confusion about whether the filename was
relative to srcdir or not. Some test cases, and the bzr
plugin assumed it was relative to the srcdir. Most everything else
assumed it was absolute.
Changed it to relative, for consistency with the rest
of the rcs_ functions.
A short story:
Once there was a unicode string, let's call him Srcdir.
Along came a crufy old File::Find, who went through a tree and pasted each
of the leaves in turn onto Srcdir. But this 90's relic didn't decode the
leaves -- despite some of them using unicode! Poor Srcdir, with these
leaves stuck on him, tainted them with his nice unicode-ness. They didn't
look like leaves at all, but instead garbage.
(In other words, perl's unicode support sucks mightily, and drives
us all to drink and bad storytelling. But we knew that..)
So, srcdir is not normally flagged as unicode, because typically it's pure
ascii. And in that case, things work ok; File::Find finds filenames, which
are not yet decoded to unicode, and appends them to the srcdir, and then
decode_utf8 happily converts the whole thing.
But, if the srcdir does contain utf8 characters, that breaks. Or, if a Yaml
setup file is used, Yaml::Syck's implicitunicode sets the unicode flag of
*all* strings, even those containing only ascii. In either case, srcdir
has the unicode flag set; a non-decoded filename is appended, and the flag
remains set; and decode_utf8 sees the flag and does *nothing*. The result
is that the filename is not decoded, so looks valid and gets skipped.
File::Find only sticks the directory and filenames together in no_chdir
mode .. but we need that mode for security. In order to retain the
security, and avoid the problem, I made it not pass srcdir to File::Find.
Instead, chdir to the srcdir, and pass ".". Since "." is ascii, the problem
is avoided.
Note that chdir srcdir is safe because we check for symlinks in the srcdir
path.
Note that it takes care to chdir back to the starting location. Because
the user may have specified relative paths and so staying in the srcdir
might break. A relative path could even be specifed for an underlay dir, so
it chdirs back after each.
A short story:
Once there was a unicode string, let's call him Srcdir.
Along came a crufy old File::Find, who went through a tree and pasted each
of the leaves in turn onto Srcdir. But this 90's relic didn't decode the
leaves -- despite some of them using unicode! Poor Srcdir, with these
leaves stuck on him, tainted them with his nice unicode-ness. They didn't
look like leaves at all, but instead garbage.
In other words, perl's unicode support sucks mightily, and drives
us all to drink and bad storytelling. But we knew that..
So, srcdir is not normally flagged as unicode, because typically it's pure
ascii. And in that case, things work ok; File::Find finds filenames, which
are not yet decoded to unicode, and appends them to the srcdir, and then
decode_utf8 happily converts the whole thing.
But, if the srcdir does contain utf8 characters, that breaks. Or, if a Yaml
setup file is used, Yaml::Syck's implicitunicode sets the unicode flag of
*all* strings, even those containing only ascii. In either case, srcdir
has the unicode flag set; a non-decoded filename is appended, and
decode_utf8 sees the flag and does *nothing*. The result is that the
filename is not decoded, so looks valid and gets skipped.
File::Find only sticks the directory and filenames together in no_chdir
mode .. but we need that mode for security. In order to retain the
security, and avoid the problem, I made it not pass srcdir to File::Find.
Instead, chdir to the srcdir, and pass ".". Since "." is ascii, the problem
is avoided.
Note that it takes care to chdir back to the starting location. Because
the user may have specified relative paths and so staying in the srcdir
might break. A relative path could even be specifed for an underlay dir, so
it chdirs back after each.
Problem is that by the time rendering calls render_dependent, %pagesources
has had deleted files removed from it. So match_comment's lookup of
files in there to see if they had the _comment extension failed.
I had to introduce a hash that temporarily holds filenames of deleted pages
to fix this.
Note that unlike comment(), internal() had avoided this pitfall by being
defined to match both internal and non-internal pages.
* Ikiwiki can be configured to generate html5 instead of the default xhtml
1.0. The html5 output mode is experimental, not yet fully standards
compliant, and will be subject to rapid change.
Needed to handle the move of the .js files into ikiwiki/, but also this is
a longstanding bug.
Old pagemtime is not remembered in rebuild mode, and changing that would
need a lot of changes. So instead, loop on pagectime, which is remembered.
Change to remembering old pagesources info in rebuild mode. This seems safe
enough.