It's not "exact" since case munging has to be done, and I think
"simple" captures the optimisation better.</pedant>
With apologies to smcv, who probably has to rebuild his wiki now.
Let E be the number of dependencies per page of the form "A depends on B and
nothing else", let D be the number of other dependencies per page,
let P be the total number of pages, and let C be the number of changed
pages in a refresh.
This patch should speed up a refresh from O(E*C*P + D*C*P) to
O(C + E*P + D*C*P), assuming that hash lookups are O(1).
In practice, plugins like inline and map produce a lot of these very simple
dependencies, and my album plugin's combination of inline with a large
number of pages causes it to suffer particularly badly.
In testing on a wiki with about 7000 objects (3500 full pages, 3500
images), a full rebuild continued to take about 5:30, and a refresh
after touching about 350 pages and 350 images reduced from 5:30 to 1:30.
As with my previous optimizations, this change will result in downgrades not
working correctly until the wiki is rebuilt.
Now that dependencies are a list of pagespecs with an implicit "or"
operation, there's no need to try to merge pagespecs under normal use.
ikiwiki-transition contains the only use of the function, so move
it there rather than deleting it entirely (it's used to concatenate all
admins' lists of locked pages).
On a large wiki you can spend a lot of time reading through large lists
of dependencies to see whether files need to be rebuilt (album, with its
one-page-per-photo arrangement, suffers particularly badly from this).
The dependency list is currently a single pagespec, but it's not used like
a normal pagespec - in practice, it's a list of pagespecs joined with the
"or" operator.
Accordingly, change it to be stored as a list of pagespecs. On a wiki
with many tagged photo albums, this reduces the time to refresh after
`touch tags/*.mdwn` from about 31 to 25 seconds.
Getting the benefit of this change on an existing wiki requires a rebuild.
By adding this setting, we get both more configurability, and a minor
optimisation too, since gettext does not need to be called continually
to get the Discussion value.
On various sites I have two IkiWiki instances running from the same
repository: one accessible via http and only accepting openid logins,
and one accessible via authenticated https and only accepting httpauth.
The https version should still pretty-print OpenIDs seen in git history,
even though it does not itself accept OpenID logins.
The test suite was emitting a lot of ugly gettext warnings;
setting LC_ALL didn't solve the problem for all locale setups
(since ikiwiki remaps it to LANG, and ikiwiki didn't know about
the C locale).
People also seem generally annoyed by the messages when
Locale::Gettext is not installed, and I suspect will be
generally happier if it just silently doesn't localize.
The optimisation came about when I noticed that the gettext
sub was doing rather a lot of work each call just to see
if localisation is needed. We can avoid that work by caching,
and the best thing to cache is a version of the gettext sub
that does exactly the right thing.
This was slightly complicated by the locale setting,
which might need to override the original locale (or lack
thereof) after gettext has been called. So it needs to invalidate
the cache in that case. It used to do it via a global variable,
which I am happy to have also gotten rid of.
Do not allow an unterminated """ string to be treated as a series of bare
words. Fixes runaway regexp recursion/backtracking in strange situations.
(See 1d57a21c98 for test case.)
And avoid a whole class of potential security problems (though
none that I know of actually existing..), by avoiding
performing any string interpolation on user-supplied data when translating
pagespecs.
This is sorta an optimisation, and sorta a bug fix. In one
test case I have available, it can speed a page build up from 3
minutes to 3 seconds.
The root of the problem is that $links{$page} contains arrays of
links, rather than hashes of links. And when a link is found,
it is just pushed onto the array, without checking for dups.
Now, the array is emptied before scanning a page, so there
should not be a lot of opportunity for lots of duplicate links
to pile up in it. But, in some cases, they can, and if there
are hundreds of duplicate links in the array, then scanning it
for matching links, as match_link and some other code does,
becomes much more expensive than it needs to be.
Perhaps the real right fix would be to change the data structure
to a hash. But, the list of links is never accessed like that,
you always want to iterate through it.
I also looked at deduping the list in saveindex, but that does
a lot of unnecessary work, and doesn't completly solve the problem.
So, finally, I decided to add an add_link function that handles deduping,
and make ikiwiki-transition remove the old dup links.
This reverts commit 2f96c49bd1.
I forgot about internal pages. We don't want * matching them!
I left the optimisation in pagecount, where it used to live.
Internal pages probably don't matter when they're just being
counted.
* pagespec_match_list: New API function, matches pages in a list
and throws an error if the pagespec is bad.
* inline, brokenlinks, calendar, linkmap, map, orphans, pagecount,
pagestate, postsparkline: Display a handy error message if the pagespec
is erronious.
* Add IkiWiki::ErrorReason objects, and modify pagespecs to return
them in cases where they fail to match due to a configuration or syntax
error.
* inline: Display a handy error message if the inline cannot display any
pages due to such an error.
This is perhaps somewhat incomplete, as other users of pagespecs do not
display the error, and will eventually need similar modifications to inline.
I should probably factor out a pagespec_match_all function and make it throw
ErrorReasons.
The problem was introduced by the recent noextension patches.
Object autovivification caused junk to get into %htmlize,
and all keys of that showed up as page types.
Because getopt::long is used in passthrough mode, if a known
option like --wikiname that needs a parameter is specified w/o
the parameter, it will not be processed, and passed on through.
So in this case the "unknown option" message is innaccurate.
Make it slightly better by noting that the problem can be a missing
parameter.
This modification was initially done in editpage, in commit
a3726968bc, but was then lost while merging
upstream/master branch.
Signed-off-by: intrigeri <intrigeri@boum.org>
It no longer makes sense to keep these functions in editpage, because
serveral plugins now exist that use them, and users may want to disable
editpage, while leaving those plugins enabled.
Most notably, comments uses both functions, and it's entirely appropriate
to disable editpage but still want to have comments enabled.
Less likely, attachments, rename, and remove all use check_canedit -- but
it would be unusual indeed to want to use these w/o editpage.
It used to replace unknown functions with "0" when translating a pagespec.
Instead, replace it with a FailReason object. This way, the pagespec will
still evaluate as before (possibly successfully if other terminals exist),
but a human-readable error will be shown if the result is displayed.
Also, an empty pagespec used to be replaced with "0", to avoid a eval
error. Also use a FailReason here.
It seems to be a failing of i18n in unix that the translation stops at the
commands and the parameters to them, and ikiwiki is no exception with its
currently untranslated directives. So the little bit that's translated sticks
out like a sore thumb. It also breaks building of wikis if a different locale
happens to be set.
I suppose the best thing to do is either give up on the localisation of this
part completly, or make it recognise English in addition to the locale. I've
tenatively chosen the latter.
(Also accept 1 and 0 as input.)
inline has a format hook that is an optimisation hack. Until this hook
runs, the inlined content is not present on the page. This can prevent
other format hooks, that process that content, from acting on inlined
content. In bug ##509710, we discovered this happened commonly for the
embed plugin, but it could in theory happen for many other plugins (color,
cutpaste, etc) that use format to fill in special html after sanitization.
The ordering was essentially random (hash key order). That's kinda a good
thing, because hooks should be independent of other hooks and able to run
in any order. But for things like inline, that just doesn't work.
To fix the immediate problem, let's make hooks able to be registered as
running "first". There was already the ability to make them run "last".
Now, this simple first/middle/last ordering is obviously not going to work
if a lot of things need to run first, or last, since then we'll be back to
being unable to specify ordering inside those sets. But before worrying about
that too much, and considering dependency ordering, etc, observe how few
plugins use last ordering: Exactly one needs it. And, so far, exactly one
needs first ordering. So for now, KISS.
Another implementation note: I could have sorted the plugins with
first/last/middle as the primary key, and plugin name secondary, to get a
guaranteed stable order. Instead, I chose to preserve hash order. Two
opposing things pulled me toward that decision:
1. Since has order is randomish, it will ensure that no accidental
ordering assumptions are made.
2. Assume for a minute that ordering matters a lot more than expected.
Drastically changing the order a particular configuration uses could
result in a lot of subtle bugs cropping up. (I hope this assumption is
false, partly due to #1, but can't rule it out.)