pagespec_translate may set $@ if it fails to parse a pagespec, but
due to memoization, this is not reliable: if a memoized call is repeated
and $@ was already set for some other reason, it will remain set through
the call to pagespec_translate.
Instead, just check if pagespec_translate returns undef.
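A rough sketch of the calling pattern this implies (the surrounding code is
illustrative, not the actual ikiwiki source):

    # Trust the return value, not $@: a memoized call can leave a
    # stale $@ set by something unrelated.
    my $sub = IkiWiki::pagespec_translate($spec);
    if (! defined $sub) {
        return;    # parse failure
    }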
To review, tcc does not really use environ, so you have to use clearenv
there. But POSIX, in their wisdom, didn't standardise clearenv yet,
so on FreeBSD, one still needs to manipulate environ on their own.
(If you use tcc on FreeBSD, this may leave you unsatisfied.)
Finally removed the last hardcoding of IkiWiki::Setup::Standard.
Take the first "IkiWiki::Setup::*" in the setup file to define the
setuptype, and remember that type to use in dumping later. (But it can be
overridden using --set, etc.)
Also, support setup file types that are not evaled.
The info is stored in the session database, not the user database.
There should be no reason to need it when a user is not logged in.
Also, hide the email field in the preferences page for openid users.
Note that the email and username are not yet actually used for anything.
The email will be useful for gravatar, while the username might be used
for a prettier display of the openid.
* moderatedcomments: Added moderate_pagespec that can be used
to control which users or comment locations are moderated.
This can be used, just for example, to moderate http://myopenid.com/*
if you're getting a lot of spammers from one particular openid
provider (who should perhaps answer your emails about them),
while not moderating other users.
* moderatedcomments: The moderate_users setting is deprecated. Instead,
set moderate_pagespec to "!admin()" or "user(*)"; see the sketch below.
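A minimal setup-file sketch (the exact pagespec is only an example, and
matching an openid provider with user() is an assumption here):

    # Hold comments from one openid provider for moderation; comments
    # from other users are posted without moderation.
    moderate_pagespec => 'user(http://myopenid.com/*)',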
This prevented comments containing some utf-8, including the euro sign, from
being submitted. Since md5_hex is a C implementation, the string has to be
converted from perl's internal encoding to utf-8 when it is called. Some
utf-8 happened to work before, apparently by accident.
Note that this will change the checksums returned.
unique_comment_location is only used when posting comments, so the checksum
does not need to be stable there.
I only changed page_to_id for completeness; it is passed a comment page
name, and they can currently never contain utf-8.
In teximg, the bug could perhaps be triggered if the tex source contained
utf-8. If that happens, the checksum will change, and some extra work might
be performed on upgrade to rebuild the image.
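A minimal sketch of the encoding fix described above (the variable names are
illustrative):

    use Digest::MD5 qw(md5_hex);
    use Encode;

    # md5_hex() is C code, so hand it utf-8 bytes rather than perl's
    # internal string form, which can contain wide characters.
    my $digest = md5_hex(Encode::encode_utf8($content));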
This was not doable before, but when I added transitive dependency handling
in the big dependency rewrite, it became possible to include a comment
count when inlining.
This also improves the action link when a page has no comments. It will
link directly to the cgi to allow posting the first comment. And if the page
is locked to prevent posting new comments, the link is no longer shown.
When creating a page, multiple locations are tested to see if they can be
edited. If all fail, one of the failure subs is called, to log the user in
to allow them to proceed with the edit. So far so good.
But, what if some pages fail for one reason, and some for another? This
occurs when httpauth_pagespec is used in conjunction with signinedit (and
openid or something). When the user is not signed in at all,
the former will fail to edit a page because the user was not httpauthed,
while the latter will fail to edit a different page because the user was
not signed in. One of their failure methods gets to run first.
The page creation code always ran the failure method corresponding to the
topmost page location. So, when editing a foo/Discussion page, and with
httpauth_pagespec => "*!/Discussion", it ran the httpauth failure method,
which was exactly the wrong thing to do.
I fixed this by making it instead run the failure method for the *best*
page location. In the above example, that's foo/Discussion, so signinedit
runs, as desired, and we get the signin page.
This seems like it will be the right choice, or at least an acceptable
choice. If a user wants to use httpauth they can always choose it on the
signin page.
My logic was right before. Cleaned up some code.
(Page creation is still a problem.)
Also, I removed the Edit url munging, because that is not
necessary with the canedit hook, since canedit will handle
redirection through cgiauthurl if necessary.
Now that openiduser is in IkiWiki core, it's ok to have passwordauth check
for it, and avoid displaying useless password fields when showing
preferences for an openid.
Also improved the styling of the display of the openid in the preferences
page.
Use `_` to avoid superfluous stat.
Check for `defined $file`, instead of just `$file`.
Add spaces after commas.
Change return values of `verify_src_file()` to not return the tainted filename.
Rename `$f` to `$file_untainted` in `verify_src_file()`.
`$f` changes to `$file` in `find_src_files()`.
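For example, the file tests end up looking roughly like this (names and the
surrounding code are illustrative, not the actual find_src_files() code):

    # "_" reuses the stat buffer filled by -f, so -r does not stat the
    # file a second time; "defined" keeps a file named "0" from being
    # treated as missing.
    if (defined $file && -f $file && -r _) {
        push @found, $file;
    }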
This attempts to fix commit f3abeac919.
For discussion see
<http://ikiwiki.info/todo/auto-create_tag_pages_according_to_a_template/>
To make automatically added files render, they have to be added to the $files,
$pages, $new, and $changed variables.
After that scan() is called on them.
if "tag_autocreate=1" is set in the configuration. The pages will be created in
tagbase, if and only if they do not exist in the srcdir yet. Tag pages will be create from
"autotag.tmpl".
At this stage a second refresh is needed for the tag pages to be rendered.
Add autotag.tmpl template.
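A setup-file sketch of enabling this (the tagbase value is just an example):

    tag_autocreate => 1,
    tagbase => "tags",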
Consider a template like:
[[!template type=note text="""
[[!inline pages="*foo*"]]
"""]]
The text parameter is htmlized before being passed into the template (in
case the template wraps it in a <span> that prevents markdown from
htmlizing it later).
But, when markdown sees "*foo*", it turns that into <em>foo</em>.
Later, when preprocessing the inline directive, that leads to surprising
results.
To fix this, I made template parameters be preprocessed (and filtered)
before being htmlized.
Note that I left in the preprocessing (and filtering) of the template
output at the end. That's still relevant when the template itself contains
preprocessor directives.
Note that there is an associated po4a warning when a page is empty:
Use of uninitialized value $file in substitution (s///) at /usr/share/perl5/Locale/Po4a/Text.pm line 205.
I've filed a bug with po4a about that, but the important thing is fixing
the crash here.
The new git-notes feature in git 1.6.6 changes git log output in a way that
broke ikiwiki's parser if notes are added to commits.
I decided to deal with this by disabling notes when ikiwiki uses git,
by setting GIT_NOTES_REF="". AFAICS, looking up notes when dumping logs
will only waste time, since it does not currently seem to make sense for
ikiwiki to do anything with the notes.
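In the git plugin, this amounts to something like the following (a sketch of
the idea, not necessarily the exact code):

    # An empty notes ref keeps git from looking up notes when ikiwiki
    # runs `git log`.
    $ENV{GIT_NOTES_REF} = "";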
I noticed that chromium was not hyperlinking the areas in the object-based
linkmap, while img works ok. Dunno why, but img based is nicer anyway since
it is allowed right through the htmlscrubber with no workarounds.
This way users can use all the other alignment values when not including a
caption. Also, it will work without the standard style, and I don't have to
worry about regressions this way.
This is achieved by preparing CSS definitions that emulate the behavior
of the align attribute, and passing it to the outermost IMG wrapper
(A or TABLE) instead of passing the align value to IMG directly.
On second thought, you might want a wide-open wiki with some locked
pages that cannot be edited online.
So, the right thing for lockedit to do when there are no auth plugins is
to just say the page cannot be edited.
Problem here was that no charset http header was being sent.
I fixed this globally by making cgi_custom_failure send the header.
Required changing its parameters.
The crux of the problem is that the cgi object has raw values not converted
to utf-8, and rename was using its fields. Also fixed a missed place where
the form object did not get its fields utf-8 encoded.
bestlink was looking at whether %links existed for a page in order to tell
if the page exists, but just-deleted pages still have entries in there (for
reasons it may be best not to explore). So bestlink would return
just-deleted pages. Instead, make bestlink use %pagesources.
Also, when finding a deleted page, %pagecase was not cleared of that page.
This, again, made bestlink return just-deleted pages. Now that is cleared.
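The existence test in bestlink then becomes something along these lines
(schematic; $candidate is illustrative):

    # A page exists only if it still has a source file; %links can
    # retain entries for just-deleted pages.
    if (exists $pagesources{$candidate}) {
        return $candidate;
    }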
Fixing bestlink exposed another issue though. The backlink calculation code
uses bestlink. So when a page was deleted, no backlinks to it are found,
and pages that really did backlink to it were not updated, and had broken
links.
To fix that, the code that actually removes deleted pages had to be split
out from find_del_files, so it can run a bit later. It is run just after
backlinks are calculated. This way, backlink calculation still sees the
deleted pages, but everything afterwards does not.
However, it does not address the original bug report that started this
whole thing, [[bugs/bestlink_returns_deleted_pages]]. Because there
bestlink is run in the needsbuild hook. And that happens before backlink
calculation, and so bestlink still returns deleted pages then. Also in the
scan hook.
If bestlink needs to work consistently during those hooks, a more involved
fix will be needed.
Speedup of about 25% for small inlines; could be much larger for inlines of
many pages, or of complex pages.
Not bloating memory with excessive memoization data was the key to this.
The method chosen does not squeeze out every erg of speed possible when
inlines are nested, but that's rare. It uses less memory than other
optimisation hacks (I'm looking at you,
f937c1fb80!) already used in inline.pm.
Unlike generic meta foo tags, meta description is known to be safe, so can
be special cased to be allowed despite the html scrubber. This makes meta
description much more useful, since it is otherwise limited to being used
by other plugins like map.
My experience is that when inlines are nested, the old behavior of
generating feeds for the nested inlines was never really desired. Since the
feeds were numbered sequentially, the numbers could easily change, and it did
not make sense to subscribe to or use those feeds. And generating those nested
feeds often meant a lot of unnecessary calculation, and data being written.
So, I dropped them.
Looking back, nested feeds originally were a free side effect of properly
handing multiple feeds on one page. Of course, that is still supported.
I chose not to have it override style.css, because style.css is not really
intended to be edited; the one from the underlay is intended to be used as
a base that local.css overrides.
I chose to use a plugin rather than changing the default behavior, both
because I didn't want to have to worry about possibly breaking backwards
compatibility (though this seems unlikely), and because it seemed cleaner
to not include style template parameters in the main page template code.
I suppose someone might want a way to not override the toplevel
local.css, but instead include it as well as foo/local.css. Probably the
best way to do that would be to have foo/local.css @import ../local.css
(modulo browser compatibility issues). Alternatively, edit page.tmpl
to always include the toplevel local.css, or swap out this plugin for
another one.
When redirecting to a page, ie, after editing, ensure that the url is
uri-encoded. Most browsers other than MSIE don't care, but it's the right
thing to do.
The known failure case involved editing a page that had utf-8 in the name
using MSIE.
Before, the htmllink would display the link to the template as if it were a
wikilink, but what was stored was not, which could lead to confusing
situations.
git log --follow seems to sometimes show merges from before the file was
ever created. So, skip them, a file shouldn't be first created during a
merge anyway.
This will be a bit more expensive, but --getctime does not need to be fast.
And getting the real creation time is very useful when untangling blog
histories that involve renames.
I made match_* functions whose influences can vary depending on the page
matched set a special "" influence to indicate this.
Then add_depends can try just one page, and if static influences are found,
stop there.
This was tricky because of the caching, and because use_pagespec always
adds a dependency. That would have made year calendars depend on the whole
pagespec, which is overly broad. So I removed the caching from format_month,
and in format_year just look at %pagesources to see if month pages are
available.
In format_month, I made it always call use_pagespec, so each month calendar
gets the right dependency and any influences added. This means a bit more
work, but the added work is fairly minimal, and presence dependencies
remove a *lot* of work it used to do.
(100% untested!)
This dependency was missing before switching to use_pagespec.
It is correct to add it, but it needs to be combined with the regular
"pages" dependency to ensure that it does not match extra pages.
(Also fixed its dependency type.)
Benchmarking refresh of a wiki with 25 thousand pages showed
file_pruned() using most of the time. But, when refreshing, ikiwiki already
knows about nearly all the files. So we can skip calling file_pruned() for
those it knows about. While tricky to do, this sped up a refresh (that
otherwise does no work) by 10-50%.
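Schematically, the scan gains a shortcut like this (%known, holding the
files seen on the previous run, is illustrative, and the single-argument
file_pruned() call is an assumption):

    # Files ikiwiki already knew about passed file_pruned() on an
    # earlier run; only genuinely new files need the check.
    next if ! $known{$file} && file_pruned($file);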
If a pagespec fails to match, I had been throwing the influences away, but
that is not right. Consider `backlink(foo)`, where foo does not exist.
It still needs to be added as an influence, because if it is created, it
will influence the pagespec to match.
But with that fix, `link(bar)` had as influences all pages, whether they
link to bar or not. Which is not necessary, because modifying a page to
add a link to bar will directly cause the pagespec to match.
So, in match_link (and all the match_* functions for page metadata),
only return an influence if the match succeeds.
match_backlink had been implemented as the inverse of match_link, but that
is no longer completely true. While match_link does not return an influence
on failure, match_backlink does.
match_created_before/after also return the influence on failure, this way
if created_after(foo) currently fails because foo does not exist, it will
still update the page with the pagespec if foo is created.
This is very common, and the code has to test each type differently, since
the list of candidates to test, as well as the test, will vary per type.
Much happier with this code now.
This new method for determining when links on pages
have changed should be more efficient, since it avoids
double calculation of the bestlinks.
It also allows collecting data about the old links, before
the scan pass, so the data is accurate when pages move around
and bestlinks change.
Also got some code back to a saner indent level.
This makes it more efficient.
It also fixes the same bug that I fixed in orphans recently:
only changes to the set of displayed pages were considered,
which missed changes to links on other pages that point to them.
Probably this bug was never noticed because pagestats is most often put
on a blog type page, which gets updated anyway when posts change,
and thus the tag cloud was updated.
This makes it more efficient.
It also fixes a longstanding bug, where if only a small set of pages were
considered by orphans, changes to links on other pages failed to cause an
update.
Involved some code refactoring so that same code that detects
link changes for backlinks updating can be used for link dependency
checking. The nice thing is that link dep checking is thus
completely free!
Preliminary support, anyway.
If a dependency only includes DEPEND_EXISTS, then only changes that
involved adding or deleting a page can trigger it.
This is complicated by internal pages, since the code did not previously
differentiate between add, delete, and change of internal pages.
Now it tracks change separately from add+delete, so DEPEND_EXISTS pagespecs
that actually match internal pages (which will probably be quite rare in
practice) should work.
As soon as a change happens, we know we will need to rescan all
dependencies from the start, so bail out of the current scan partway to
avoid doing redundant work.
Only problem with this is that ikiwiki sometimes ends up printing out
dependencies that, while correct, are not obvious. Before:
building B, which depends on A
building C, which depends on A
building D, which depends on A
After:
building B, which depends on A
building C, which depends on B
building D, which depends on C
I had assumed that an image shown full size did not need add_depends, since
a change to the image would not require a change to the displaying page.
But this is not true if the image is modified and its size changes. Then
the page needs to update its img tag to reflect the current size.
If an image was resized smaller, with width and height specified to values
that did not fit its aspect ratio, the image tag width/height were not
adjusted to the actual size imagemagick chooses.
This was broken by 03449610d6.
To fix it right, it unfortunately needs to always read the src image now,
in order to determine if the image is being displayed larger, or resized
smaller. When resized smaller, it then always uses the size of the
thumbnail, while for larger it calculates the size.
(The only way to get rid of this occasional extra image read would be to
change it to not allow displaying images larger.)
Through a complex chain of circumstances, that filtering was causing
dumpsetup to trigger warnings about undefined values from the po plugin. But
anyway, munging the otl in htmlize is less error-prone and less expensive,
a win all around.
Loading and use of IkiWiki::Receive can all be pushed into the git plugin,
rather than scattered around.
I had at first wanted to make a receive plugin and move it there,
but a plugin was not a good fit; you don't want users to have to manually
load it, and making the git plugin load the receive plugin at the right
times would need more, and ugly code.
calls are warranted. They shouldn't modify the caller's working directory,
though. Use File::chdir to keep the scope of the changes subroutine-local.
The tests now pass without resetting the working directory.
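A sketch of the File::chdir pattern (the wrapper function here is
hypothetical):

    use File::chdir;

    sub in_dir {
        my ($dir, $code) = @_;
        # Assigning to the local'ised $CWD chdirs for this block only;
        # the caller's working directory is restored on scope exit.
        local $CWD = $dir;
        return $code->();
    }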
* In Wrapper.pm, add a new hook "wrapperargcheck" to examine argc/argv
and return success or failure. In the failure case, the wrapper
terminates.
* In cvs.pm, implement the new hook to return failure if a directory is
being cvs added.
having to quote, and the possible use of the shell) sucks. Stop
passing args to cvs_runcvs() as an arrayref, since that also sucks
(and was a sop to IPC::Cmd). Instead, use Joey's construction for
temporarily redirecting stderr to /dev/null. Much much simpler and
better. Works on my laptop with bozohttpd, now to test on the NetBSD
wiki.
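The stderr-silencing idiom referred to above looks roughly like this
(@args and the error handling are illustrative):

    # Save STDERR, point it at /dev/null around the cvs run, restore it.
    open(my $olderr, ">&", \*STDERR) || die $!;
    open(STDERR, ">", "/dev/null") || die $!;
    my $ret = system("cvs", @args);
    open(STDERR, ">&", $olderr) || die $!;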
TeX has configuration options that prevent unsafe things like shell
escapes and insecure file reads/writes. Turn all of them on.
teximg's regex-based blacklist does not suffice. For instance:
[[!teximg code="""
\catcode`\%=0
%input{/etc/passwd}
"""]]
Remove the blacklist, since the TeX configuration options seal off the
underlying mechanisms more safely, and the blacklist blocks other TeX
commands that can prove useful.
Although imagemagick handles even really large sizes sanely, using a page
file, doing so would just waste time and disk space, since the browser
can be told to resize it larger.
checkconfig can run more than once in a single ikiwiki run if setup is
building wrappers. That clobbered the origsub value for bestlink, leading
to infinite recursion.
It's not "exact" since case munging has to be done, and I think
"simple" captures the optimisation better.</pedant>
With apologies to smcv, who probably has to rebuild his wiki now.
Let E be the number of dependencies per page of the form "A depends on B and
nothing else", let D be the number of other dependencies per page,
let P be the total number of pages, and let C be the number of changed
pages in a refresh.
This patch should speed up a refresh from O(E*C*P + D*C*P) to
O(C + E*P + D*C*P), assuming that hash lookups are O(1).
In practice, plugins like inline and map produce a lot of these very simple
dependencies, and my album plugin's combination of inline with a large
number of pages causes it to suffer particularly badly.
In testing on a wiki with about 7000 objects (3500 full pages, 3500
images), a full rebuild continued to take about 5:30, and a refresh
after touching about 350 pages and 350 images reduced from 5:30 to 1:30.
As with my previous optimizations, this change will result in downgrades not
working correctly until the wiki is rebuilt.
This is unnecessary and just slows us down (by a factor of 2, in the
pessimal case where every page has an inline with pagenames); it's also
not possible to optimize it into add_depends_exact calls.
Set rootpage to the non-l10n'd rootpage parameter if it is set,
else to the masterpage of the linking page.
Signed-off-by: intrigeri <intrigeri@boum.org>
The po plugin's injected bestlink must do something special when called by this
exact part of inline's code.
Signed-off-by: intrigeri <intrigeri@boum.org>
... else, the recentchanges page shows a link such as "sandbox.es". But,
clicking on it goes to the English (or negotiated language) version of the page.
It is better in this one case if the link goes direct to the translated version
of the page.
(cherry picked from commit 496e8523c6)
This should be more efficient than pagespec_match_list since it short-circuits
after the first match is found.
The other problem with using pagespec_match_list here is it may throw an
error if a bad or failing pagespec somehow got into the dependencies.
The new dependency handling works better (eliminates more duplicates) if
dependencies are split up. On the same wiki mentioned in the previous
commit, this saves about a second (i.e. 4%) on the same test.
On a large wiki you can spend a lot of time reading through large lists
of dependencies to see whether files need to be rebuilt (album, with its
one-page-per-photo arrangement, suffers particularly badly from this).
The dependency list is currently a single pagespec, but it's not used like
a normal pagespec - in practice, it's a list of pagespecs joined with the
"or" operator.
Accordingly, change it to be stored as a list of pagespecs. On a wiki
with many tagged photo albums, this reduces the time to refresh after
`touch tags/*.mdwn` from about 31 to 25 seconds.
Getting the benefit of this change on an existing wiki requires a rebuild.
can evaluate them, check them in the wrapper right off the bat.
This doesn't prevent the deadlock in web commits that need to cvs
add directories, but I'm committing so Joey can take a look if he
wants.
This is both faster, and propagates any error in processing the feedpages
pagespec out to display on the page. Which may have been why I didn't use
it before, but currently seems like a good thing to do, since it explains
why your feeds are empty.
If a page is taken from the underlay, and one of the specified languages
does not have po files in the underlay, it would create a broken link
to the translated version of the page for that language.
With this change, there's no broken link.
I think the N/A was not intended to be visible, but it can show up as the
percent translated to a language. This happens if the page is located in an
underlay, and not translated to the language in any other underlay.
Previously, [[!meta redir="foo"]] on bar, where bar/foo exists, would
depend on "foo" (which matches nothing, probably) rather than "bar/foo".
(cherry picked from commit f27ec09b72f886415e63fe394e18d9c3cb3913bf)
Previously, [[!img bar.jpg]] on foo, where foo/bar.jpg exists, would
get a dependency equivalent to "glob(bar.jpg)" (which might not match
anything), rather than the correct "glob(foo/bar.jpg)".
(cherry picked from commit 85b2ec49ecd12dd23e5c432933457a72744ce7cb)
During backlink calculation, all links are examined and broken links can
be detected for free, so store a list of broken links and have brokenlinks
use it.
Exposing the %brokenlinks structure is a bit ugly, but the speedup seems
worth it: Around 1 second for wikis the size of the doc wiki that use
brokenlinks.
By adding this setting, we get both more configurability, and a minor
optimisation too, since gettext does not need to be called continually
to get the Discussion value.