TeX has configuration options that prevent unsafe things like shell
escapes and insecure file reads/writes. Turn all of them on.
teximg's regex-based blacklist does not suffice. For instance:
[[!teximg code="""
\catcode`\%=0
%input{/etc/passwd}
"""]]
Remove the blacklist, since the TeX configuration options seal off the
underlying mechanisms more safely, and the blacklist blocks other TeX
commands that can prove useful.
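For illustration, a minimal sketch of sealing things off at that level when running latex (openin_any, openout_any and -no-shell-escape are real TeX/kpathsea knobs; the values and the invocation itself, including $texfile, are assumptions, not the plugin's actual code):

    # restrict kpathsea file access and disable \write18 shell escapes
    local $ENV{openin_any}  = 'p';   # 'p' = paranoid: no absolute paths or dotfiles
    local $ENV{openout_any} = 'p';   # same restriction for writes
    system('latex', '-no-shell-escape', '-interaction=batchmode', $texfile);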
Although imagemagick handles even really large sizes sanely (using a page
file), doing so would just waste time and disk space, since the browser
can be told to resize it larger.
checkconfig can run more than once in a single ikiwiki run if setup is
building wrappers. That clobbered the origsub value for bestlink, leading
to infinite recursion.
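A minimal sketch of the guard (the %origsubs hash and mybestlink follow the po plugin's naming, but the exact code is an assumption; inject is IkiWiki's real injection helper):

    sub checkconfig () {
        # only capture the original bestlink the first time checkconfig runs;
        # a second run would otherwise capture our own wrapper and recurse forever
        if (! defined $origsubs{'bestlink'}) {
            $origsubs{'bestlink'} = \&IkiWiki::bestlink;
            inject(name => "IkiWiki::bestlink", call => \&mybestlink);
        }
    }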
It's not "exact" since case munging has to be done, and I think
"simple" captures the optimisation better.
With apologies to smcv, who probably has to rebuild his wiki now.
Let E be the number of dependencies per page of the form "A depends on B and
nothing else", let D be the number of other dependencies per page,
let P be the total number of pages, and let C be the number of changed
pages in a refresh.
This patch should speed up a refresh from O(E*C*P + D*C*P) to
O(C + E*P + D*C*P), assuming that hash lookups are O(1).
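Roughly, the simple dependencies move from a per-pagespec scan over every changed page to a hash lookup against the set of changed pages, along these lines (a sketch; the names are illustrative, not the actual patch):

    my %changed = map { $_ => 1 } @changed_pages;          # O(C)
    foreach my $page (keys %simple_deps) {                  # O(P)
        foreach my $dep (keys %{$simple_deps{$page}}) {     # O(E) per page
            if ($changed{$dep}) {
                push @needsbuild, $page;
                last;
            }
        }
    }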
In practice, plugins like inline and map produce a lot of these very simple
dependencies, and my album plugin's combination of inline with a large
number of pages causes it to suffer particularly badly.
In testing on a wiki with about 7000 objects (3500 full pages, 3500
images), a full rebuild continued to take about 5:30, and a refresh
after touching about 350 pages and 350 images was reduced from 5:30 to 1:30.
As with my previous optimizations, this change will result in downgrades not
working correctly until the wiki is rebuilt.
This is unnecessary and just slows us down (by a factor of 2, in the
pessimal case where every page has an inline with pagenames); it's also
not possible to optimize it into add_depends_exact calls.
Set rootpage to the non-l10n'd rootpage parameter if it is set,
else to the masterpage of the linking page.
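Roughly (a sketch; masterpage is the po plugin's helper that maps a translation back to its master page, and the variable names are illustrative):

    my $rootpage = defined $params{rootpage}
        ? $params{rootpage}
        : masterpage($params{page});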
Signed-off-by: intrigeri <intrigeri@boum.org>
The po plugin's injected bestlink must do something special when called by this
exact part of inline's code.
Signed-off-by: intrigeri <intrigeri@boum.org>
... else, the recentchanges page shows a link such as "sandbox.es". But,
clicking on it goes to the English (or negotiated language) version of the page.
It is better in this one case if the link goes directly to the translated version
of the page.
(cherry picked from commit 496e8523c6)
... else, the recentchanges page shows a link such as "sandbox.es". But,
clicking on it goes to the English (or negotiated language) version of the page.
It is better in this one case if the link goes directly to the translated version
of the page.
This should be more efficient than pagespec_match_list since it short-circuits
after the first match is found.
The other problem with using pagespec_match_list here is that it may throw an
error if a bad or failing pagespec somehow got into the dependencies.
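A sketch of the short-circuit (pagespec_match is the real low-level matcher; the surrounding loop and names are illustrative):

    foreach my $p (keys %pagesources) {
        if (pagespec_match($p, $spec, location => $page)) {
            return 1;   # one match is enough, stop scanning
        }
    }
    return 0;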
The new dependency handling works better (eliminates more duplicates) if
dependencies are split up. On the same wiki mentioned in the previous
commit, this saves about a second (i.e. 4%) on the same test.
On a large wiki you can spend a lot of time reading through large lists
of dependencies to see whether files need to be rebuilt (album, with its
one-page-per-photo arrangement, suffers particularly badly from this).
The dependency list is currently a single pagespec, but it's not used like
a normal pagespec - in practice, it's a list of pagespecs joined with the
"or" operator.
Accordingly, change it to be stored as a list of pagespecs. On a wiki
with many tagged photo albums, this reduces the time to refresh after
`touch tags/*.mdwn` from about 31 to 25 seconds.
Getting the benefit of this change on an existing wiki requires a rebuild.
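Conceptually the storage change is just this (a sketch, not the actual diff):

    # old: one big pagespec string, joined with "or"
    # $depends{$page} .= " or $spec";
    # new: keep each pagespec separate, skipping exact duplicates
    push @{$depends{$page}}, $spec
        unless grep { $_ eq $spec } @{$depends{$page}};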
can evaluate them, check them in the wrapper right off the bat.
This doesn't prevent the deadlock in web commits that need to cvs
add directories, but I'm committing so Joey can take a look if he
wants.
This is both faster, and propagates any error in processing the feedpages
pagespec out to display on the page. Which may have been why I didn't use
it before, but currently it seems like a good thing to do, since it explains
why your feeds are empty.
If a page is taken from the underlay, and one of the specified languages
does not have po files in the underlay, it would create a broken link
to the translated version of the page for that language.
With this change, there's no broken link.
I think the N/A was not intended to be visible, but it can show up as the
percent translated to a language. This happens if the page is located in an
underlay, and not translated to the language in any other underlay.
Previously, [[!meta redir="foo"]] on bar, where bar/foo exists, would
depend on "foo" (which matches nothing, probably) rather than "bar/foo".
(cherry picked from commit f27ec09b72f886415e63fe394e18d9c3cb3913bf)
Previously, [[!img bar.jpg]] on foo, where foo/bar.jpg exists, would
get a dependency equivalent to "glob(bar.jpg)" (which might not match
anything), rather than the correct "glob(foo/bar.jpg)".
(cherry picked from commit 85b2ec49ecd12dd23e5c432933457a72744ce7cb)
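In both cases the idea is to resolve the link relative to the page that uses it before recording the dependency, roughly (a sketch using the real bestlink/add_depends helpers; the actual code differs, e.g. img records a glob):

    my $target = bestlink($params{page}, $link);
    add_depends($params{page}, $target) if length $target;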
During backlink calculation, all links are examined and broken links can
be detected for free, so store a list of broken links and have brokenlinks
use it.
Exposing the %brokenlinks structure is a bit ugly, but the speedup seems
worth it: Around 1 second for wikis the size of the doc wiki that use
brokenlinks.
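A sketch of how the list falls out of the existing walk over %links (%links and bestlink are real; the rest is illustrative):

    %brokenlinks = ();
    foreach my $page (keys %links) {
        foreach my $link (@{$links{$page}}) {
            my $best = bestlink($page, $link);
            if (! length $best) {
                # nothing satisfies this link; remember which page wanted it
                push @{$brokenlinks{$link}}, $page;
            }
        }
    }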
By adding this setting, we get both more configurability and a minor
optimisation, since gettext does not need to be called continually
to get the Discussion value.
This was impressively broken. add_depends was being called with params
backwards, and one parameter was set to the name of the generated
file, which isn't in the source.
Now updates to images will update the page that contains them, thus
updating the resized images. This is unnecessary for full-size images, so it is skipped.
Serving up images etc. as text/plain; charset=utf-8 is unlikely to work
very well, and there's no point in having this CGI action for attachments
(since they're copied into the output as-is anyway).
In order to support translated basewiki and other underlays, we need
support for mo files in underlays.
The code did not allow this before, because if a mo file was in an
underlay, the po plugin might try to update it, and its pot, and write to the
underlay, which is guaranteed either to fail due to permissions, or to be
undesirable.
To fix, my approach is to just detect if a mo or pot file that is about to
be updated is in an underlay, and skip updating it. This seems to work
well:
- If the mo is out of date in the underlay, it won't get updated, but this
would probably be due to a problem in the underlay, or more likely,
the wiki is being rebuilt and so it *thinks* the mo is out of date,
but it's really not (and it would be a waste of time to rebuild it
anyway).
- If a page from the basewiki is edited, it is saved to the srcdir,
which causes generation of an updated mo and pot also in the srcdir;
the underlay stops being used for that page, and everything seems
to work.
Note that I am not including an underlay search directory for pot files.
They *seem* to be unnecessary for the underlay, since the mo files
in there never need to be updated.
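The detection can be as simple as checking whether the file resolves to somewhere outside srcdir, e.g. (a sketch; the helper name is made up, and the nonfatal second argument to srcfile is an assumption):

    sub file_in_underlay ($) {
        my $file = shift;
        my $path = srcfile($file, 1);   # nonfatal: undef if the file is missing
        return defined $path && $path !~ m{^\Q$config{srcdir}\E/};
    }

    # before regenerating a mo or pot file:
    # return if file_in_underlay($file);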
These are for use by wikis where the primary language is not English.
On such a wiki, it makes sense to use an underlay as the source for pages
in the native language.
On various sites I have two IkiWiki instances running from the same
repository: one accessible via http and only accepting openid logins,
and one accessible via authenticated https and only accepting httpauth.
The https version should still pretty-print OpenIDs seen in git history,
even though it does not itself accept OpenID logins.
openiduser previously used a constructor that no longer works in 2.x.
However, all we actually want is the (undocumented) DisplayOfURL function
that is invoked by the display method, so try to use that.
(cherry picked from commit c3dd0ff5c7c10743107f203a5b456fdcd1b171df)
If given instead of pages, this is interpreted as a space-separated
list of links to pages (with the same LinkingRules as in a WikiLink),
and they are inlined in exactly the order given. The sort and pages
parameters cannot be used in conjunction with this one.
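For example (illustrative page names, and assuming the parameter in question is called pagenames):

    [[!inline pagenames="fruit/apple fruit/pear" archive="yes"]]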
Besides being wrong to do, this could lead to the wrong item
being expired, as follows: If B is added and at the same time
A is changed, then A's ctime may be set to the current time,
while B's is set to its creation time. Thus the new item, B,
is incorrectly removed as older.
(This interacted especially badly with the bug fixed by
90b4d079605b72bb50d1da41402d994960e10937.)
The aggregate state merge code neglected to merge changes to the md5
field of an item. Therefore, if an item's md5 changed after initial
aggregation, it would be updated, and rewritten, each time thereafter.
This was wasteful and indirectly led to some expire problems.
This reverts commit 2ed033a4aa.
It has been more properly fixed in upstream's master, which will be merged
in immediately.
Signed-off-by: intrigeri <intrigeri@boum.org>
Rationale: Comments need to be user-editable so that they can be posted
via git commit etc.
The _comment directive is still supported, for back-compat.
Setting up a new highlighter object is slightly expensive since it
reads and parses the langfile each time. So cache them.
This also speeds up ext2langfile by avoiding it needing to check for the
existence of a language file in some cases.
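A sketch of the cache (make_highlighter stands in for the real binding setup that reads and parses the langfile):

    my %highlighters;
    sub highlighter ($) {
        my $langfile = shift;
        # reuse an already-built highlighter for this langfile if we have one
        return $highlighters{$langfile} ||= make_highlighter($langfile);
    }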
format: Provide a htmlizefallback hook that other plugins can use to
handle formats that are not suitable for general-purpose htmlize hooks.
highlight: Use the hook to allow formatting of any language/extension,
without it needing to be enabled for standalone source files.
highlight: If the highlight perl binding is not available, fall back
safely to a passthrough mode.
We build an array of [ plugin name, long name ] pairs, where long name
is an optional argument to hook(). So, a syntax plugin could define a
long "friendly" name, such as "Markdown" instead of mdwn, and we would
then pass this array to formbuilder to populate the drop-down on the
edit page.
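So a syntax plugin's registration might look like this, with the drop-down built from the registered htmlize hooks (a sketch; hook() and the %hooks structure are real, the longname key and the rest are how I'd expect it to look):

    hook(type => "htmlize", id => "mdwn", call => \&htmlize,
        longname => "Markdown");

    # later, build [ id, friendly name ] pairs for formbuilder
    my @page_types = map {
        [ $_, $hooks{htmlize}{$_}{longname} || $_ ]
    } keys %{$hooks{htmlize}};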
This is sorta an optimisation, and sorta a bug fix. In one
test case I have available, it can speed a page build up from 3
minutes to 3 seconds.
The root of the problem is that $links{$page} contains arrays of
links, rather than hashes of links. And when a link is found,
it is just pushed onto the array, without checking for dups.
Now, the array is emptied before scanning a page, so there
should not be a lot of opportunity for lots of duplicate links
to pile up in it. But, in some cases, they can, and if there
are hundreds of duplicate links in the array, then scanning it
for matching links, as match_link and some other code does,
becomes much more expensive than it needs to be.
Perhaps the real right fix would be to change the data structure
to a hash. But the list of links is never accessed like that;
you always want to iterate through it.
I also looked at deduping the list in saveindex, but that does
a lot of unnecessary work, and doesn't completely solve the problem.
So, finally, I decided to add an add_link function that handles deduping,
and make ikiwiki-transition remove the old dup links.
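A sketch of what such an add_link looks like (the real one may differ slightly):

    sub add_link ($$) {
        my ($page, $link) = @_;
        # dedup at insertion time so $links{$page} never accumulates copies
        push @{$links{$page}}, $link
            unless grep { $_ eq $link } @{$links{$page}};
    }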
When finding the pageurl, it was calling bestlink unnecessarily.
Since at this point $page contains the full name of the page that
is being inlined, there is no need to do bestlink's scan
for it.
This is only a minor optimisation, since bestlink is only called
once per displayed, inlined page.
This reverts commit 2f96c49bd1.
I forgot about internal pages. We don't want * matching them!
I left the optimisation in pagecount, where it used to live.
Internal pages probably don't matter when they're just being
counted.
I forgot to check if it was called from preprocess, and it is
not; it's called by a format hook. If an error is thrown from
a format hook, the wiki build fails, so we don't want that.
* pagespec_match_list: New API function, matches pages in a list
and throws an error if the pagespec is bad.
* inline, brokenlinks, calendar, linkmap, map, orphans, pagecount,
pagestate, postsparkline: Display a handy error message if the pagespec
is erroneous.
* Add IkiWiki::ErrorReason objects, and modify pagespecs to return
them in cases where they fail to match due to a configuration or syntax
error.
* inline: Display a handy error message if the inline cannot display any
pages due to such an error.
This is perhaps somewhat incomplete, as other users of pagespecs do not
display the error, and will eventually need modifications similar to inline's.
I should probably factor out a pagespec_match_all function and make it throw
ErrorReasons.
The munged ids were looking pretty nasty, and were not completely guaranteed
to be unique. So a md5sum seems like a better approach. (Would have used
sha1, but md5 is in perl core.)
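Something along these lines (a sketch; the prefix and the exact input being hashed are illustrative):

    use Digest::MD5 qw(md5_hex);
    # hashing the full, possibly ugly identifier gives a short id that is
    # stable and effectively unique, using only a module from perl core
    my $id = 'comment-' . md5_hex($raw_id);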
Well, that was a PITA.
Luckily, this doesn't break guids to comments in rss feeds,
though it does change the links.
I haven't put in a warning about needing to rebuild to get
this fix. It's probably good enough for new comments to get the
fix, without a lot of mass rebuilding.
It would be better to use urlto() here, but will_render
has not yet been called on the feed files at this point, so
it won't work. (And reorganizing so it can be is tricky.)
I guess what's happening here is that since the name
is passed to git via an environment variable, perl's normal
utf-8 IO layer stuff doesn't work. So we have to explicitly
decode the string from perl's internal representation into
utf-8.
This change was introduced in 85f865b5d9 and
c3af3840a2; it may be necessary for the meta-po
integration, but the po branch alone is supposed to work without it.
Signed-off-by: intrigeri <intrigeri@boum.org>
This makes wikis such as zack's much faster in the scan pass.
In that pass, when a template contains an inline, there is no reason to
process the entire inline and all its pages. I'd forgotten to pass
along the flag to let preprocess() know it was in scan mode, leading to
much unnecessary churning.
- In 3.05, ikiwiki began expanding templates in scan mode,
for annoying, expensive, but ultimately necessary reasons
of correctness.
- Smiley processing has a bug: It inserts a span for the smiley,
and then continues searching forward in the content for more,
starting at $end_of_smiley+1. Which means it searches for smilies
in the span too! And if it somehow finds one, we get an infinite loop
here.
- This bug can, probably, only be tickled if a htmllink to
show the smiley fails, because the smiley file doesn't exist,
or because ikiwiki doesn't know about it. In that case,
a link will be inserted to _create_ the missing page,
and that link will include the smiley inside the <a></a>.
- When a template is expanded in scan mode, and it contains
an inline, the sanitize hook is run during scan mode,
which never happened before. That causes the smiley processor
to run, before ikiwiki is, necessarily, aware that all
the smiley files exist (depending on scan order). So
it inserts creation links for them, and triggers the bug.
I've put in the simple fix of jumping forward past the inserted
span, and it does fix the problem. I will need to look in a bit
more detail into why an inline nested inside a template is
fully expanded during the scan pass -- that really shouldn't
be necessary, and it makes things much slower than they need
to be.
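The fix amounts to resuming the scan just past the markup we inserted, something like this (a sketch of the technique, not the actual smiley code; render_smiley and $smiley_regexp are stand-ins):

    while ($content =~ m/$smiley_regexp/g) {
        my $start = $-[0];
        my $len   = $+[0] - $-[0];
        my $html  = render_smiley($&);   # may be a create link if the file is missing
        substr($content, $start, $len) = $html;
        # resume matching *after* the inserted markup, never inside it,
        # so a smiley-like string in the replacement cannot loop forever
        pos($content) = $start + length $html;
    }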