This can be a lot faster, since huge numbers of pages are not sorted
only to mostly be thrown away. It sped up a build of my blog by at least
5 minutes.
Both markdown and tidy add paragraph tags around text, that needs to be
stripped when the text is a short, one line fragment that is being inserted
into a larger page. tidy also adds several newlines to the end, and this
broke removal of the paragraph tags.
pagespec_translate may set $@ if it fails to parse a pagespec, but
due to memoization, this is not reliable. If a memoized call is repeated,
and $@ is already set for some other reason previously, it will remain
set through the call to pagespec_translate.
Instead, just check if pagespec_translate returns undef.
Finally removed the last hardcoding of IkiWiki::Setup::Standard.
Take the first "IkiWiki::Setup::*" in the setup file to define the
setuptype, and remember that type to use in dumping later. (But it can be
overridden using --set, etc.)
Also, support setup file types that are not evaled.
As I was adding ngettext support, I realized I could optimize the gettext
functions by memoizing the creation of the gettext object. Note that
the object creation is still deferred until a gettext function is called,
to avoid unnecessary startup penalties on code paths that do not need
gettext.
A side benefit is that separate stub functions are no longer needed to
handle the C language case.
bestlink was looking at whether %links existed for a page in order to tell
if the page exists, but just-deleted pages still have entries in there (for
reasons it may be best not to explore). So bestlink would return
just-deleted pages. Instead, make bestlink use %pagesources.
Also, when finding a deleted page, %pagecase was not cleared of that page.
This, again, made bestlink return just-deleted pages. Now that is cleared.
Fixing bestlink exposed another issue though. The backlink calculation code
uses bestlink. So when a page was deleted, no backlinks to it are found,
and pages that really did backlink to it were not updated, and had broken
links.
To fix that, the code that actually removes deleted pages had to be split
out from find_del_files, so it can run a bit later. It is run just after
backlinks are calculated. This way, backlink calculation still sees the
deleted pages, but everything afterwards does not.
However, it does not address the original bug report that started this
whole thing, [[bugs/bestlink_returns_deleted_pages]]. Because there
bestlink is run in the needsbuild hook. And that happens before backlink
calculation, and so bestlink still returns deleted pages then. Also in the
scan hook.
If bestlink needs to work consistently during those hooks, a more involved
fix will be needed.
This avoids unnecessary influences being recorded from pagespecs
such as "link(done) and bugs/*", when a page cannot ever possibly
match.
A pagespec term that returns a value without influence is an influence
blocker. If such a blocker has a false value (possibly due to being
negated) and is ANDed with another term, it blocks that term's influence
from propigating out.
If the term is ORed, or has a true value, it does not block influence.
(Consider "link(done) or bugs/*" and "link(done) and !nosuchpage")
In the implementation in merge_influence, I had to be careful to never
negate $this or $other when testing if they are an influence blocker,
since negation mutates the object. Thus the slightly weird if statement.
I made match_* functions whose influences can vary depending on the page
matched set a special "" influence to indicate this.
Then add_depends can try just one page, and if static influences are found,
stop there.
Thought of a cleaner way to accumulate all influences in
pagespec_match_list, using the pagespec_match result object as an
accumulator.
(This also accumulates all influences from failed matches, rather than just
one failed match. I'm not sure if the old method was correct.)
Benchmarking refresh of a a wiki with 25 thousand pages showed
file_pruned() using most of the time. But, when refreshing, ikiwiki already
knows about nearly all the files. So we can skip calling file_pruned() for
those it knows about. While tricky to do, this sped up a refresh (that
otherwise does no work) by 10-50%.