Benchmarking refresh of a a wiki with 25 thousand pages showed
file_pruned() using most of the time. But, when refreshing, ikiwiki already
knows about nearly all the files. So we can skip calling file_pruned() for
those it knows about. While tricky to do, this sped up a refresh (that
otherwise does no work) by 10-50%.
This makes it more efficient.
It also fixes the same bug that I fixed in orphans recently,
that only changes to the set of displayed pages were considered (or amoung),
which missed changes to links on other pages to those.
Probably this bug was never noticed because pagestats is most often put
on a blog type page, which gets updated anyway when posts change,
and thus the tag cloud was updated.
This makes it more efficient.
It also fixes a longstanding bug, where if only a small set of pages were
considered by orphans, changes to links on other pages failed to cause an
update.
Here I was bitten by perl's aliasing of foreach variables
to the loop array contents, and match_link accidentially changed
the contents of %links.
In Jon's testcase, a tag added an absolute link, which was
made relative by the above bug, and then the link was added
again in preprocess, and turned into a duplicate.
I had assumed that an image shown full size did not need add_depends, since
a change would not need a change to the displaying page.
But this is not true if the image is modified and its size changes. Then
the page needs to update its img tag to reflect the current size.
If an image was resized smaller, with width and height specified to values
that did not fit its aspect ratio, the image tag with/height were not
adjusted to the actual size imagemagick chooses.
This was broken by 03449610d6.
To fix it right, it unfortunatly needs to always read the src image now,
in order to determine if the image is being displayed larger, or resized
smaller. When resized smaller, it then always uses the size of the
thumbnail, while for larger it calculates the size.
(Only way to get rid of this sometimes extra image read would be to change
it to not allow displaying images larger.)
I weakended the regexp, so this matches ipv6 addresses too. It does not
ensure that the address is valid, but that should not matter here.
Note that addresses ending in "::" are not matched, so eg, the unspecified
address will not match -- but should never appear here anyway.
TeX has configuration options that prevent unsafe things like shell
escapes and insecure file reads/writes. Turn all of them on.
teximg's regex-based blacklist does not suffice. For instance:
[[!teximg code="""
\catcode`\%=0
%input{/etc/passwd}
"""]]
Remove the blacklist, since the TeX configuration options seal off the
underlying mechanisms more safely, and the blacklist blocks other TeX
commands that can prove useful.
Although imagemagick handles even really large sizes sanely, using a page
file, doing so would just waste time and disk space, since the browser
can be told to resize it larger.
During backlink calulation, all links are examined and broken links can
be detected for free, so store a list of broken links and have brokenlinks
use it.
Exposing the %brokenlinks structure is a bit ugly, but the speedup seems
worth it: Around 1 second for wikis the size of the doc wiki that use
brokenlinks.
By adding this setting, we get both more configurability, and a minor
optimisation too, since gettext does not need to be called continually
to get the Discussion value.
This was impressively broken. add_depends was being called with params
backwards, and on parameter was set to the name of the generated
file, which isn't in the source.
Now updates to images will update the page that contains them, thus
updating them. This is unncessary for fullsize images, so skipped.
openiduser previously used a constructor that no longer works in 2.x.
However, all we actually want is the (undocumented) DisplayOfURL function
that is invoked by the display method, so try to use that.
(cherry picked from commit c3dd0ff5c7c10743107f203a5b456fdcd1b171df)
Besides being wrong to do, this could lead to the wrong item
being expired, as follows: If B is added and at the same time
A is changed, then A's ctime may be set to the current time,
while B's is set to its creation time. Thus the new item, A,
is incorrectly removed as older.
(This interacted especially badly with the bug fixed by
90b4d079605b72bb50d1da41402d994960e10937.)
The aggregate state merge code neglected to merge changes to the md5
field of an item. Therefore, if an item's md5 changed after initial
aggregation, it would be updated, and rewritten, each time thereafter.
This was wasteful and indirectly led to some expire problems.
The test suite was emitting a lot of ugly gettext warnings;
setting LC_ALL didn't solve the problem for all locale setups
(since ikiwiki remaps it to LANG, and ikiwiki didn't know about
the C locale).
People also seem generally annoyed by the messages when
Locale::Gettext is not installed, and I suspect will be
generally happier if it just silently doesn't localize.
The optimisation came about when I noticed that the gettext
sub was doing rather a lot of work each call just to see
if localisation is needed. We can avoid that work by caching,
and the best thing to cache is a version of the gettext sub
that does exactly the right thing.
This was slightly complicated by the locale setting,
which might need to override the original locale (or lack
thereof) after gettext has been called. So it needs to invalidate
the cache in that case. It used to do it via a global variable,
which I am happy to have also gotten rid of.
Do not allow an unterminated """ string to be treated as a series of bare
words. Fixes runaway regexp recursion/backtracking in strange situations.
(See 1d57a21c98 for test case.)