In order to support translated basewiki and other underlays, we need
support for mo files in underlays.
The code did not allow this before, because if a mo file was in an
underlay, then it might try to update it, and its pot, and write to the
underlay, which is guaranteed to either fail due to permissions, or be
undesirable.
To fix, my approach is to just detect if a mo or pot file that is about to
be updated is in an underlay, and skip updating it. This seems to work
well:
- If the mo is out of date in the underlay, it won't get updated, but this
would probably be due to a problem in the underlay, or more likely,
the wiki is being rebuilt and so it *thinks* the mo is out of date,
but it's really not (and it would be a waste of time to rebuild it
anyway).
- If a page from the basewiki is edited, it is saved to the srcdir,
which causes generation of an updated mo and pot also in the srcdir;
the underlay stops being used for that page, and everything seems
to work.
Note that I am not including an underlay search directory for pot files.
They *seem* to be unnecessary for the underlay, since the mo files
in there never need to be updated.
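The skip-if-in-underlay check described above could look roughly like this. This is only a Python sketch of the idea, not the plugin's Perl code; `in_underlay`, `maybe_update` and the example paths are made-up names.

```python
import os

def in_underlay(path, underlay_dirs):
    """Return True if path lies inside any of the underlay directories."""
    real = os.path.realpath(path)
    for d in underlay_dirs:
        root = os.path.realpath(d)
        # commonpath equals the root only when real is root or below it
        if os.path.commonpath([real, root]) == root:
            return True
    return False

def maybe_update(path, underlay_dirs, update):
    """Call update(path) unless the file belongs to an underlay."""
    if in_underlay(path, underlay_dirs):
        return False  # skipped: never write into an underlay
    update(path)
    return True
```

A file in the srcdir goes through `update` as usual; a file under any underlay directory is silently left alone.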
These are for use by wikis where the primary language is not English.
On such a wiki, it makes sense to use an underlay as the source for pages
in the native language.
On various sites I have two IkiWiki instances running from the same
repository: one accessible via http and only accepting openid logins,
and one accessible via authenticated https and only accepting httpauth.
The https version should still pretty-print OpenIDs seen in git history,
even though it does not itself accept OpenID logins.
openiduser previously used a constructor that no longer works in 2.x.
However, all we actually want is the (undocumented) DisplayOfURL function
that is invoked by the display method, so try to use that.
(cherry picked from commit c3dd0ff5c7c10743107f203a5b456fdcd1b171df)
If given instead of pages, this is interpreted as a space-separated
list of links to pages (with the same LinkingRules as in a WikiLink),
and they are inlined in exactly the order given. The sort and pages
parameters cannot be used in conjunction with this one.
Besides being wrong to do, this could lead to the wrong item
being expired, as follows: If B is added and at the same time
A is changed, then A's ctime may be set to the current time,
while B's is set to its creation time. Thus the new item, B,
is incorrectly removed as older.
(This interacted especially badly with the bug fixed by
90b4d079605b72bb50d1da41402d994960e10937.)
The aggregate state merge code neglected to merge changes to the md5
field of an item. Therefore, if an item's md5 changed after initial
aggregation, it would be updated, and rewritten, each time thereafter.
This was wasteful and indirectly led to some expire problems.
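To illustrate why the unmerged md5 caused a rewrite on every run, here is a small Python sketch. The field name `md5` comes from the message above; the functions and the dict layout are assumptions for illustration, not the aggregate plugin's real data structures.

```python
import hashlib

def md5_of(content):
    return hashlib.md5(content.encode("utf-8")).hexdigest()

def needs_rewrite(item, content):
    """An item is rewritten when its stored md5 does not match the content."""
    return item.get("md5") != md5_of(content)

def merge_item(saved, fresh, fields=("md5",)):
    """Copy the listed fields from the freshly aggregated item into saved state."""
    for field in fields:
        if field in fresh:
            saved[field] = fresh[field]
```

If `md5` is left out of the merged fields, the saved state keeps the stale checksum and `needs_rewrite` stays true on every later run.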
This reverts commit 2ed033a4aa.
It has been more properly fixed in upstream's master, which will be merged in
immediately.
Signed-off-by: intrigeri <intrigeri@boum.org>
Rationale: Comments need to be user-editable so that they can be posted
via git commit etc.
The _comment directive is still supported, for back-compat.
Setting up a new highlighter object is slightly expensive since it
reads and parses the langfile each time. So cache them.
This also speeds up ext2langfile by avoiding it needing to check for the
existence of a language file in some cases.
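The caching amounts to something like the following Python sketch. The `Highlighter` class here is a stand-in for the real (Perl) highlighter object, and `get_highlighter` is a hypothetical name.

```python
class Highlighter:
    """Stand-in for an object that is expensive to construct."""
    constructed = 0

    def __init__(self, langfile):
        # The real object would read and parse langfile here.
        Highlighter.constructed += 1
        self.langfile = langfile

_cache = {}

def get_highlighter(langfile):
    """Return a cached highlighter, constructing it only on first use."""
    if langfile not in _cache:
        _cache[langfile] = Highlighter(langfile)
    return _cache[langfile]
```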
format: Provide a htmlizefallback hook that other plugins can use to
handle formats that are not suitable for general-purpose htmlize hooks.
highlight: Use the hook to allow formatting of any language/extension,
without it needing to be enabled for standalone source files.
highlight: If the highlight perl binding is not available, fall back
safely to a passthrough mode.
We build an array of [ plugin name, long name ] pairs, where long name
is an optional argument to hook(). So, a syntax plugin could define a
long "friendly" name, such as "Markdown" instead of mdwn, and we would
then pass this array to formbuilder to populate the drop-down on the
edit page.
This is sorta an optimisation, and sorta a bug fix. In one
test case I have available, it can speed a page build up from 3
minutes to 3 seconds.
The root of the problem is that $links{$page} contains arrays of
links, rather than hashes of links. And when a link is found,
it is just pushed onto the array, without checking for dups.
Now, the array is emptied before scanning a page, so there
should not be a lot of opportunity for lots of duplicate links
to pile up in it. But, in some cases, they can, and if there
are hundreds of duplicate links in the array, then scanning it
for matching links, as match_link and some other code does,
becomes much more expensive than it needs to be.
Perhaps the real right fix would be to change the data structure
to a hash. But, the list of links is never accessed like that,
you always want to iterate through it.
I also looked at deduping the list in saveindex, but that does
a lot of unnecessary work, and doesn't completely solve the problem.
So, finally, I decided to add an add_link function that handles deduping,
and make ikiwiki-transition remove the old dup links.
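In Python terms, the deduping add_link is roughly the following. This is a sketch only; the real function is Perl and operates on %links.

```python
links = {}  # page name -> list of link targets, in order of discovery

def add_link(page, link):
    """Record a link from page, keeping the list free of duplicates."""
    page_links = links.setdefault(page, [])
    if link not in page_links:
        page_links.append(link)
```

The list stays a list, so code that iterates over the links is unaffected, but it can no longer grow with repeats.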
When finding the pageurl, it was calling bestlink unnecessarily.
Since at this point $page contains the full name of the page that
is being inlined, there is no need to do bestlink's scan
for it.
This is only a minor optimisation, since bestlink is only called
once per displayed, inlined page.
This reverts commit 2f96c49bd1.
I forgot about internal pages. We don't want * matching them!
I left the optimisation in pagecount, where it used to live.
Internal pages probably don't matter when they're just being
counted.
I forgot to check if it was called from preprocess, and it is
not; it's called by a format hook. If an error is thrown from
a format hook, wiki build fails, so we don't want that.
* pagespec_match_list: New API function, matches pages in a list
and throws an error if the pagespec is bad.
* inline, brokenlinks, calendar, linkmap, map, orphans, pagecount,
pagestate, postsparkline: Display a handy error message if the pagespec
is erroneous.
* Add IkiWiki::ErrorReason objects, and modify pagespecs to return
them in cases where they fail to match due to a configuration or syntax
error.
* inline: Display a handy error message if the inline cannot display any
pages due to such an error.
This is perhaps somewhat incomplete, as other users of pagespecs do not
display the error, and will eventually need similar modifications to inline.
I should probably factor out a pagespec_match_all function and make it throw
ErrorReasons.
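The distinction between a plain failure to match and an error can be sketched in Python like this. The class names mirror the ones mentioned above, but `match_link_to` and `matching_pages` are hypothetical helpers, not ikiwiki API.

```python
class FailReason:
    """A falsy match result that carries an explanation."""
    def __init__(self, reason):
        self.reason = reason
    def __bool__(self):
        return False
    def __str__(self):
        return self.reason

class ErrorReason(FailReason):
    """Failure caused by a bad pagespec rather than a plain non-match."""

def match_link_to(page, target, links, known_pages):
    if target not in known_pages:
        return ErrorReason(f"{target} does not exist")
    if target in links.get(page, []):
        return True
    return FailReason(f"{page} does not link to {target}")

def matching_pages(pages, matcher):
    """Return matching pages, raising if the matcher reports an error."""
    result = []
    for page in pages:
        r = matcher(page)
        if isinstance(r, ErrorReason):
            raise ValueError(str(r))
        if r:
            result.append(page)
    return result
```

A caller such as inline can then catch the exception and show the message instead of silently displaying nothing.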
The munged ids were looking pretty nasty, and were not completely guaranteed
to be unique. So a md5sum seems like a better approach. (Would have used
sha1, but md5 is in perl core.)
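For illustration, deriving an id from an md5sum looks like this in Python. The `id-` prefix and the function name are made up; only the use of md5 comes from the message above.

```python
import hashlib

def hashed_id(raw):
    """Turn an arbitrary string into a fixed-length, HTML-safe id."""
    return "id-" + hashlib.md5(raw.encode("utf-8")).hexdigest()
```

The result is always the same length and contains only hex digits, whatever characters the input had.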
Well, that was a PITA.
Luckily, this doesn't break guids to comments in rss feeds,
though it does change the links.
I haven't put in a warning about needing to rebuild to get
this fix. It's probably good enough for new comments to get the
fix, without a lot of mass rebuilding.
It would be better to use urlto() here, but will_render
has not yet been called on the feed files at this point, so
it won't work. (And reorganizing so it can be is tricky.)
I guess what's happening here is that since the name
is passed to git via an environment variable, perl's normal
utf-8 IO layer stuff doesn't work. So we have to explicitly
decode the string from perl's internal representation into
utf-8.
This change was introduced in 85f865b5d9 and
c3af3840a2; it may be necessary for the meta-po
integration, but the po branch alone is supposed to work without it.
Signed-off-by: intrigeri <intrigeri@boum.org>
This makes wikis such as zack's much faster in the scan pass.
In that pass, when a template contains an inline, there is no reason to
process the entire inline and all its pages. I'd forgotten to pass
along the flag to let preprocess() know it was in scan mode, leading to
much unnecessary churning.
- In 3.05, ikiwiki began expanding templates in scan mode,
for annoying, expensive, but ultimately necessary reasons
of correctness.
- Smiley processing has a bug: It inserts a span for the smiley,
and then continues searching forward in the content for more,
starting at $end_of_smiley+1. Which means it searches for smilies
in the span too! And if it somehow finds one, we get an infinite loop
here.
- This bug can, probably, only be tickled if a htmllink to
show the smiley fails, because the smiley file doesn't exist,
or because ikiwiki doesn't know about it. In that case,
a link will be inserted to _create_ the missing page,
and that link will include the smiley inside the <a></a>.
- When a template is expanded in scan mode, and it contains
an inline, the sanitize hook is run during scan mode,
which never happened before. That causes the smiley processor
to run, before ikiwiki is, necessarily, aware that all
the smiley files exist (depending on scan order). So
it inserts creation links for them, and triggers the bug.
I've put in the simple fix of jumping forward past the inserted
span, and it does fix the problem. I will need to look in a bit
more detail into why an inline nested inside a template is
fully expanded during the scan pass -- that really shouldn't
be necessary, and it makes things much slower than they need
to be.
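The fix of jumping forward past the inserted span can be sketched in Python as follows. The smiley table and the markup are invented for the example; the point is only where the search resumes.

```python
import re

SMILEYS = {":-)": "smile.png"}  # hypothetical smiley table
PATTERN = re.compile("|".join(re.escape(s) for s in SMILEYS))

def replace_smileys(content):
    pos = 0
    while True:
        m = PATTERN.search(content, pos)
        if m is None:
            return content
        text = m.group(0)
        # The replacement itself contains the smiley text, as a
        # create link for a missing smiley page would.
        span = '<span class="smiley">%s</span>' % text
        content = content[:m.start()] + span + content[m.end():]
        # Resume after the whole inserted span, not after the original
        # match, so the smiley inside the span is never rescanned.
        pos = m.start() + len(span)
```

Resuming at the end of the original match instead would find the smiley inside the new span and loop forever.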
... as Joey suggested on todo/need_global_renamepage_hook
This hook is applied recursively to returned additional rename
hashes, so that it handles the case where two plugins use the hook:
plugin A would see when plugin B adds a new file to be renamed.
The full set of rename hashes can no longer be changed by hook functions, which
are only allowed to return any additional rename hashes they want to add.
Rationale: the correct behavior of the recursion would be hard, if not
impossible, to define, if already considered pages were changing on the run.
Signed-off-by: intrigeri <intrigeri@boum.org>
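The recursive application of the rename hook described above can be sketched in Python like this. The dict keys `src` and `dest` and the duplicate guard are assumptions made for the sketch, not a description of the actual Perl implementation.

```python
def collect_renames(initial, hooks):
    """Apply each hook to every rename, including ones added by hooks.

    Each hook takes one rename dict and returns a list of additional
    rename dicts; it cannot alter the renames already collected.
    """
    done = []
    seen = set()
    queue = list(initial)
    while queue:
        rename = queue.pop(0)
        key = (rename["src"], rename["dest"])
        if key in seen:
            continue  # guard against hooks re-adding the same rename
        seen.add(key)
        done.append(rename)
        for hook in hooks:
            # Hand the hook a copy so it cannot modify collected state.
            queue.extend(hook(dict(rename)))
    return done
```

Because additions are queued and processed like any other rename, plugin A gets to see the files plugin B added.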
This means that the underlay needs to have a wmd/wmd/wmd.js,
which is a trifle weird, but it isolates all the wmd stuff in a
single wmd subdirectory of the built wiki. The wmd/images creating
a toplevel images directory was particularly bad.
This is likely a misconfiguration and can cause login to fail as the
browser refuses to send the session cookie back over http.
Not entirely happy with putting the check where I did, since users have to
try to log in, and fail, to see the misconfiguration explained. But I could
not find a better place to put the check.
This is potentially expensive, but is necessary so that meta and tag
directives, and other links on templates affect the page using the template
reliably.
It no longer makes sense to keep these functions in editpage, because
several plugins now exist that use them, and users may want to disable
editpage, while leaving those plugins enabled.
Most notably, comments uses both functions, and it's entirely appropriate
to disable editpage but still want to have comments enabled.
Less likely, attachments, rename, and remove all use check_canedit -- but
it would be unusual indeed to want to use these w/o editpage.
Falls back to looking for shortcuts.mdwn for backwards compatibility; there
probably exist wikis that have changed the pageext but still use
shortcuts.mdwn.
See [[bugs/Aggregated_Atom_feeds_are_double-encoded]]. By default,
XML::Atom outputs strings of UTF-8 bytes with the Perl UTF8 flag stripped
off, which IkiWiki assumes to be Latin-1 and re-encodes as UTF-8 on
output. XML::Feed does not currently (0.41-1) set the magic variable to
change this behaviour (I've filed a bug on CPAN), but IkiWiki can
usefully set the same variable as a workaround.
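The double encoding itself is easy to reproduce. This Python sketch shows the effect only and says nothing about how XML::Atom or IkiWiki handle strings internally.

```python
def double_encode(text):
    """Encode to UTF-8, misread the bytes as Latin-1, encode again."""
    utf8_bytes = text.encode("utf-8")
    misread = utf8_bytes.decode("latin-1")  # each byte becomes one character
    return misread.encode("utf-8")
```

Every non-ASCII character ends up as two or more mangled characters in the output, which is the symptom seen in the aggregated feeds.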
This may already work with other web servers that have copied apache's
interface, and it should be easy to add support to it for web servers that
use some other interface. So, make the name more general.
This redirects to the given page (or if none is given, the page parameter
given to the CGI), or displays an error with a create link if the page
doesn't exist.
... as my meta branch probably won't be merged before the po plugin is, contrary
to what I was originally supposing.
This implies removing the po_translation_status_in_links and
po_strictly_refresh_backlinks options.
Added a note to the TODO section to think of bringing these features back later,
as they really enhance user experience on a translatable wiki.
Signed-off-by: intrigeri <intrigeri@boum.org>
This is intended to solve Joey's concerns expressed on
http://ikiwiki.info/todo/need_global_renamepage_hook/, i.e. the need to make it
possible to use this hook from external plugins.
A plugin using this hook still can add/modify/remove elements of the
@torename array.
Signed-off-by: intrigeri <intrigeri@boum.org>
... that was removed in 68869d664b
Without this scalar, a two-cells array is passed to $template->param, which
builds a hash with an odd number of elements.
Signed-off-by: intrigeri <intrigeri@boum.org>
After some thinking about it, I can't find why the type of a page being created
in the CGI could be restricted to po. So the previous case seems enough.
Signed-off-by: intrigeri <intrigeri@boum.org>
check_canremove/canrename is called only for its side effect (of failing if
removal is not allowed), its return value is never used and returning
something makes that unclear
use is file-scoped so warnings and strict are already enabled
inside the second package, and IkiWiki is already loaded
(though not imported into this context)
It was calling format hooks for each comment on the page.
When relativedate is enabled, that made it insert <script> tags
for each comment. And the browser loaded the same script over and over,
which was slow on its own. But that was nothing compared to running
the onload event over and over, especially since the hook system
added a new call to the hook each time it loaded.
For a page with 10 comments, that caused the relativedate DOM parsing
code to run 1000 times, I think. Anyway, it was sloow. Now it runs once.
if suitable alternate text is unknown, then it should not be given.
empty alt text is suitable mainly for purely decorative images.
(cherry picked from commit 3cd7f67f0cf894f4fd5ba16f68e82e4f7bdbfdc5)
Always pass the full (modified) content in `content` named parameter. When the
user edits an existing wiki page, also pass a `diff` named parameter, which
includes only the lines that they added to the page, or modified.
Signed-off-by: intrigeri <intrigeri@boum.org>
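One way to compute a `diff` value like the one described above is sketched here in Python with difflib. This is a swapped-in illustration; the actual code may derive the added lines differently.

```python
import difflib

def added_lines(old, new):
    """Return only the lines that are new or modified in new."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # ndiff marks lines present only in the second input with "+ ".
    return "\n".join(line[2:] for line in diff if line.startswith("+ "))
```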
Some aggregators, like Planet, sort by mtime rather than ctime. This
means that posts with modified content come to the top (which seems odd
to me, but is presumably what the aggregator's author or operator
wants), but it also means that posts with insignificant edits (like
adding tags) come to the top too. Atom defines <updated> to be the date
of the last *significant* change, so it's fine that ikiwiki defaults to
using the mtime, but it would be good to have a way for the author to
say "that edit was insignificant, don't use that mtime".
That resulted in double encoded display when using perl's stub
readline module. Apparently that module unconditionally upgrades
text to utf8, in a quite braindead way.
(Term::ReadLine::Gnu::Perl worked ok.)
Use mtn for monotone and hg for mercurial. The long names cause ugly
formatting in recentchanges, which has CSS that only allows a few
characters for the commit type column.
All meta titles are first extracted at scan time, i.e. before we turn
PO files back into translated markdown; escaping of double-quotes in
PO files breaks the meta plugin's parsing enough to save ugly titles
to %pagestate at this time.
Then, at render time, every page passes in turn through the Great
Rendering Chain (filter->preprocess->linkify->htmlize), and the meta
plugin's preprocess hook is this time in a position to correctly
extract the titles from slave pages.
This is, unfortunately, too late: if the page A, linking to the page B,
is rendered before B, it will display the wrongly-extracted meta title
as the link text to B.
On the one hand, such a corner case only happens on rebuild: on
refresh, every rendered page is fixed to contain correct meta titles.
On the other hand, it can take some time to get every page fixed.
We therefore re-render every rendered page after a rebuild to fix them
at once. As this more or less doubles the time needed to rebuild the
wiki, we do so only when really needed.
Signed-off-by: intrigeri <intrigeri@boum.org>
It will set up an ikiwiki instance tuned for use in blogging.
As part of this change, move the example sites into /usr/share/ikiwiki so
they are available even if docs are not installed.
Asking for only the head worked in my tests, but I've found a site where it
didn't -- apparently ikiwiki didn't get a chance to do or finish the
refresh when HEADed. Getting the whole url, waiting for ikiwiki to finish,
avoided the update problem.
* repolist: New plugin to support the rel=vcs-* microformat.
* goodstuff: Include repolist by default. (But it does nothing until
configured with the repository locations.)
Form validation works, but after trying to save invalid PO content, the user is
brought back to the page he/she was editing, without a single clue to explain
why it was not saved. The dedicated cansave hook is thus necessary.
Signed-off-by: intrigeri <intrigeri@boum.org>
This has to be done after the rename/remove plugins have added
their buttons, so we set this hook to be run last.
The canrename/canremove hooks already ensure this is forbidden
at the backend level, so this is only UI sugar.
Signed-off-by: intrigeri <intrigeri@boum.org>
The main reason to do so is to bypass the "favor the type of linking page on
page creation" logic, which is unsuitable when a broken link is clicked on
a slave (PO) page.
Signed-off-by: intrigeri <intrigeri@boum.org>
This is not needed for my own use of it, but seems more consistent to me.
Future users of this hook may need this data to make up their mind.
Signed-off-by: intrigeri <intrigeri@boum.org>
... so that nicepagetitle hook's effects, such as translation status displayed
in links, are updated when the linked page changes.
The replacement of 'my %backlinks' with 'our %backlinks' in Render.pm made this
work: previously, every postscan hook was called with an almost empty
%backlinks, which defeated all my attempts to implement this feature.
This feature hits performance a bit. Its cost was quite small in my real-world
use-cases (a few percent longer refresh time), but could be bigger in worst
cases. Time will tell.
NB: this hack could also be used by my meta branch. It may even be an ikiwiki
optional feature.
Signed-off-by: intrigeri <intrigeri@boum.org>
Thanks to the new rename hook behaviour, the whole renaming work is now done
by the rename plugin, and we don't need to remember which pages were renamed.
inline has a format hook that is an optimisation hack. Until this hook
runs, the inlined content is not present on the page. This can prevent
other format hooks, that process that content, from acting on inlined
content. In bug #509710, we discovered this happened commonly for the
embed plugin, but it could in theory happen for many other plugins (color,
cutpaste, etc) that use format to fill in special html after sanitization.
The ordering was essentially random (hash key order). That's kinda a good
thing, because hooks should be independent of other hooks and able to run
in any order. But for things like inline, that just doesn't work.
To fix the immediate problem, let's make hooks able to be registered as
running "first". There was already the ability to make them run "last".
Now, this simple first/middle/last ordering is obviously not going to work
if a lot of things need to run first, or last, since then we'll be back to
being unable to specify ordering inside those sets. But before worrying about
that too much, and considering dependency ordering, etc, observe how few
plugins use last ordering: Exactly one needs it. And, so far, exactly one
needs first ordering. So for now, KISS.
Another implementation note: I could have sorted the plugins with
first/last/middle as the primary key, and plugin name secondary, to get a
guaranteed stable order. Instead, I chose to preserve hash order. Two
opposing things pulled me toward that decision:
1. Since hash order is randomish, it will ensure that no accidental
ordering assumptions are made.
2. Assume for a minute that ordering matters a lot more than expected.
Drastically changing the order a particular configuration uses could
result in a lot of subtle bugs cropping up. (I hope this assumption is
false, partly due to #1, but can't rule it out.)
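The first/middle/last ordering described above amounts to something like this Python sketch. The option names `first` and `last` follow the message; the function and the dict layout are hypothetical.

```python
def run_order(hooks):
    """Order hook names: "first" ones, then the rest, then "last" ones.

    hooks maps a plugin name to its options. Within each group the
    existing dict order is preserved rather than sorted by name.
    """
    first = [name for name, opts in hooks.items() if opts.get("first")]
    last = [name for name, opts in hooks.items()
            if opts.get("last") and not opts.get("first")]
    middle = [name for name, opts in hooks.items()
              if not opts.get("first") and not opts.get("last")]
    return first + middle + last
```

Note that a Python dict keeps insertion order, so this sketch does not reproduce the randomish order of a Perl hash within each group.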
People seem to expect to be able to enter www.foo.com and get away with it.
The resulting my.wiki/www.foo.com link was not ideal.
To fix it, use URI::Heuristic to expand such things into a real url. It
even looks up hostnames in the DNS if necessary.
A new ikiwiki-transition moveprefs subcommand can pull the old data out of
the userdb and inject it into the setup file.
Note that it leaves the old values behind in the userdb too. I did this
because I didn't want to lose data if it fails writing the setup file for
some reason, and the old data in the userdb will only use a small amount of
space. Running the command multiple times will mostly not change anything.
This leads to better display for OpenIDs like smcv.pseudorandom.co.uk
and thm.id.fedoraproject.org (to take a couple of examples from the
IkiWiki commit history).
None of the comment state needs to be stored through to a later run of
ikiwiki, so move it all from pagestate to a more transient storage.
This is assuming that we'll never want to add pagespecs to search against
the comment state. Pagespecs like author() are why the meta plugin does
store its meta data in pagestate -- the data can be needed later to match
against.
The thinking here is that having both a Discussion page and comments for
the same page is redundant, and certainly not what you want if you enable
comments for a page. At first I considered making configurable via pagespec
what pages got discussion links. But that would mean testing a new pagespec
for every page, and a redundant config setting to keep in sync. So instead,
take a lead from my previous change to make inlined pages have a comments
link, and change the discussion link at the top of regular pages to link to
their comments.
(Implementation is a bit optimised to avoid redundant pagespec checking.)
Jumping to the just posted comment was the impetus, but I killed a number
of birds here.
Added a INLINEPAGE template variable, which can be used to add anchors to
any inline template.
To keep that sufficiently general, it is the full page name, so the
comment anchors and links changed form.
Got rid of the FIXMEd hardcoded html anchor div.
More importantly, the anchor is now to the very top of the comment, not the
text below. So you can see the title, and how it attributes you.
Avoid changing the permalink of pages that are not really comments, but
happen to contain the _comment directive. I think that behavior was a bug,
though not a likely one to occur since _comment should only really be used
on comment pages.
I think it is clearer to have one pagespec that controls all pages with
comments, and a separate pagespec that can be used to close new comments on
a subset of those pages.
Not compacting whitespace is the most important one: now that we run
sanitize hooks on individual posted comments in the comments plugin,
whitespace that is significant to Markdown (but not HTML) is lost.
(cherry picked from commit cb5aaa3cee)
The [[!_comment]] directive is a serialization format, not something for
presentation to users, so we should use the least ambiguous possible
representation.
This delays all comment formatting until the last possible time, allows
us to set metadata without worrying that commenters may be able to evade
it, and means that changes to how a comment is saved can be handled
gracefully. It also gives us somewhere to put the commenter's username
or IP address for later reference.
This should ensure that users can't "break out" from the enclosing
<div>, making it impossible to forge comments (assuming htmlscrubber
is enabled, and so is either htmlbalance or htmltidy).
wikilinks are harmless, so we might as well allow them.
Access control for this plugin is a bit odd, since we specifically
don't want to allow comments to be edited - so the check is whether the
user is allowed to edit a deliberately invalid page name,
page/commented/on[smcvpostcomment]. You can put smcvpostcomment(*)
or smcvpostcomment(some/subdir/*) in $config{anonok_pagespec}
or the opposite in $config{locked_pages} to allow "editing" (really
just posting) comments.
I wanted this nearer to the top, but decided to put it after the
add_depends. Reasoning: It's possible with a combination of feedpages and
show options to make @list and @feedlist contain completely differing sets
of pages. We want to add_depends all pages in both sets. We could combine
the two lists and add_depends that, but it's slightly more efficient to
defer reducing @feedlist, and add_depends whichever list is longer.
This is a skeleton that does nothing yet.
See the comments in the code for an overview of the issue that arises, due to
the renamepage hook never being called globally.
Signed-off-by: intrigeri <intrigeri@boum.org>
Not implemented yet, 'cos the renamepage hook has to come first.
Else translations would be deleted on rename, what a shame.
Signed-off-by: intrigeri <intrigeri@boum.org>
... instead of already existing ones.
This fixes the "missing otherlanguages links on master pages just created via
the CGI" bug.
Signed-off-by: intrigeri <intrigeri@boum.org>
And enjoy a 10% rebuild time enhancement on a complex wiki full of maps and
other pseudo-dynamic content, with some other costly plugins enabled. So it
could well mean 20% on a more usual wiki.
Signed-off-by: intrigeri <intrigeri@boum.org>
This way, the po plugin will not appropriate PO files it is not responsible for,
and PO files existing before this plugin was enabled can coexist peacefully with
our own ones.
Signed-off-by: intrigeri <intrigeri@boum.org>
(I just removed in istranslation and _istranslation the dependency on
istranslatable... which broke things in a subtle way, hard to see at the first
glance.)
Signed-off-by: intrigeri <intrigeri@boum.org>
This is necessary so that things that fork to the background,
like pinger, and inline ping, don't block other cgis from running.
Note that websetup also calls unlockwiki, before refreshing / rebuilding
the wiki. It makes perfect sense for that not to block other cgis.
Fixed by making the cgi wrapper wait on a cgilock.
If you had to set apache's MaxClients low to avoid ikiwiki thrashing
your server, you can now turn it up to a high value.
The downside to this is that a cgi call that doesn't need to call lockwiki
will be serialised by this so only one can run at a time. (For example,
do=search.) There are few such calls, and all of them call loadindex,
so each still eats gobs of memory, so serialising them still seems ok.
It has grown up incrementally and new helper functions were added right in the
middle of the hooks, most often near the place they were used, which is
practical when doing initial development, but quite ugly afterwards, when helper
functions are useful to separate logic and implementation details.
Today's refactoring commits have brought the code to a much more maintainable
state, IMHO.
Signed-off-by: intrigeri <intrigeri@boum.org>
This is not needed now that tagpage returns a page name starting with a
slash.
(Also fixes a minor bug that the edit links started with double slashes due
to the hack.)
It is now more elegant IMHO, and the output is now sorted according to the
language name (instead of code).
Signed-off-by: intrigeri <intrigeri@boum.org>
The very same code was repeated at dozens of places.
NB: the real work is now done in _istranslation(), which is memoized,
so the additional function calls overhead should be compensated.
Signed-off-by: intrigeri <intrigeri@boum.org>
... to prevent the use of Encode::Guess::guess_encoding() in
Locale::Po4a::Transtractor (just a minor security measure, dependent on po4a
internals, but we have no reason to think Encode::Guess is not safe).
Signed-off-by: intrigeri <intrigeri@boum.org>