I also tried setting RPC::XML::ENCODING but that did not prevent the crash,
and it seems that blogspam.net doesn't like getting xml encoded in unicode,
since it mis-flagged comments as spammy that way that are normally allowed
through.
The rss spec says that unless the attribute is set, guid elements *are*
permalinks. The problem with that is that if [[meta permalink=]] is used,
as is done with aggregated posts, that goes into the link element, and
apparently some rss readers prefer the not-really-permalink in the guid
element when linking to the post.
Without meta permalink, the link and guid elements have the same content,
so it should be ok in that case too for the guid to not be a permalink.
(Checked and this does not flood aggregators.)
The idea here is that <meta name="foo" description="bar">
can be written like [[!meta name="foo" description="bar"]].
Of course, [[!meta foo=bar]] is still supported; this new feature
provides some DWIM when trying to directly convert a meta tag into
a meta directive.
template_depends was adding a dependency on the source filename,
instead of on the page name when a template is a page. Such a
dependency doesn't work.
Since misctemplate is called with a page context, the comments plugin
thinks it should add that, as well as the comment link in the actionbar.
I kept the comment link because a quick link back to the comments to a page
is sorta useful.
Cleanly fixed case where destdir file failed to be written because there
was a directory with the same name. This can be detected with no extra
system calls, and dealt with by finding all pages that wrote files
inside the directory, and removing them and the directory.
The other, inverse case would be expensive to detect in will_render,
since it would need to check each parent directory of the file to see
if the directory is really a conflicting file. But prep_writefile
already does a similar scan for symlinks in the path, so I added code
there to remove the conflicting file. This fix assumes that the file
is written using writefile, and not some other means (but using other means
would be a security hole too, so hopefully nothing does).
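A rough Python sketch of the parent-directory scan described above, not the
actual Perl in prep_writefile. The function name and return value are
hypothetical; it only shows the idea of checking each parent of the file
being written and removing a plain file that sits where a directory is needed.

```python
import os

def remove_conflicting_parents(destdir, relpath):
    """Check each parent directory of relpath under destdir; if one of them
    exists as a plain file, remove it so the directory can be created."""
    removed = []
    parents = relpath.split("/")[:-1]  # directory components only
    current = destdir
    for part in parents:
        current = os.path.join(current, part)
        # A symlink in the path is a separate (security) concern; skip it here.
        if os.path.isfile(current) and not os.path.islink(current):
            os.remove(current)
            removed.append(current)
    return removed
```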
Renamed usershort => nickname.
Note that this means existing user login sessions will not have the nickname
recorded, and so it won't be used for those.
There was some confusion about whether the filename was
relative to srcdir or not. Some test cases, and the bzr
plugin, assumed it was relative to the srcdir. Most everything else
assumed it was absolute.
Changed it to relative, for consistency with the rest
of the rcs_ functions.
Using named parameters for these is overdue. Passing the session in a
parameter instead of passing username and IP separately will later allow
storing other session info, like username or part of the email.
Note that these functions are not part of the exported API,
and the prototype change will catch (most) skew, so I am not changing
API versions. Any third-party plugins that call them will need to be
updated, though.
Now the git plugin supports commits with author fields that look like:
Author: http://my.openid/ <me@web>
Then in recentchanges, the short username will be displayed, linking
to the openid.
Particularly useful for the horrible google openids, of course.
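To illustrate the author format above, here is a small Python sketch that
splits such a line into its parts. It is not the git plugin's parser; the
regex, function name, and returned keys are assumptions, and deriving the
short username from the openid is not shown.

```python
import re

# Matches lines like "Author: http://my.openid/ <me@web>".
AUTHOR_RE = re.compile(r"^Author:\s*(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*$")

def parse_author(line):
    """Split an Author line into name and email, noting whether the name
    looks like an openid url."""
    m = AUTHOR_RE.match(line)
    if not m:
        return None
    name = m.group("name")
    is_openid = name.startswith(("http://", "https://"))
    return {
        "name": name,
        "email": m.group("email"),
        "openid": name if is_openid else None,
    }
```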
A short story:
Once there was a unicode string, let's call him Srcdir.
Along came a crufty old File::Find, who went through a tree and pasted each
of the leaves in turn onto Srcdir. But this 90's relic didn't decode the
leaves -- despite some of them using unicode! Poor Srcdir, with these
leaves stuck on him, tainted them with his nice unicode-ness. They didn't
look like leaves at all, but instead garbage.
In other words, perl's unicode support sucks mightily, and drives
us all to drink and bad storytelling. But we knew that..
So, srcdir is not normally flagged as unicode, because typically it's pure
ascii. And in that case, things work ok; File::Find finds filenames, which
are not yet decoded to unicode, and appends them to the srcdir, and then
decode_utf8 happily converts the whole thing.
But, if the srcdir does contain utf8 characters, that breaks. Or, if a Yaml
setup file is used, Yaml::Syck's implicitunicode sets the unicode flag of
*all* strings, even those containing only ascii. In either case, srcdir
has the unicode flag set; a non-decoded filename is appended, and
decode_utf8 sees the flag and does *nothing*. The result is that the
filename is not decoded, so looks invalid and gets skipped.
File::Find only sticks the directory and filenames together in no_chdir
mode .. but we need that mode for security. In order to retain the
security, and avoid the problem, I made it not pass srcdir to File::Find.
Instead, chdir to the srcdir, and pass ".". Since "." is ascii, the problem
is avoided.
Note that it takes care to chdir back to the starting location, because
the user may have specified relative paths, and so staying in the srcdir
might break. A relative path could even be specified for an underlay dir, so
it chdirs back after each.
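The chdir-and-restore pattern, sketched in Python. This is only an
illustration of the control flow: Python's os.walk does not have the
File::Find decoding problem, and find_files is a hypothetical name, not an
ikiwiki function.

```python
import os

def find_files(srcdir):
    """Return file paths relative to srcdir by walking "." from inside it."""
    origdir = os.getcwd()
    os.chdir(srcdir)
    try:
        found = []
        for dirpath, dirnames, filenames in os.walk("."):
            for name in filenames:
                # normpath strips the leading "./" from each path
                found.append(os.path.normpath(os.path.join(dirpath, name)))
        return sorted(found)
    finally:
        # Always return to the starting directory, so that later
        # relative paths (e.g. an underlay dir) still resolve.
        os.chdir(origdir)
```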
The bug here was that disabling a plugin included thru goodstuff, like
htmlscrubber, caused it to be added to disable_plugins, and those plugins
were never loaded, so could not be re-enabled. Fix by allowing them to be
force loaded when appropriate. (Also that allows disabled plugins to still
record their setup options when dumping a setup file.)
* calendar: Shorten day names, and improve styling of month calendar.
* style.css: Reduced sidebar width back to 20ex from 30; the month calendar
will now fit in the smaller width, and 30 was feeling too large.
The key is using width: auto; overflow: auto; -- this allows the div(s) to the
left of the floating sidebar to be resized to fit next to it, and prevents
any clear: both from pushing the div down below the end of the sidebar.
Many thanks to the Hurd wiki's developers for originally figuring this out.
The edit page recently developed the same problem with its textarea, now
that a sidebar can appear on that page too. In editpage.tmpl I needed to
add a new div around the editcontent textarea, as the above styles cannot
be applied directly to textareas. The textarea's own width is reduced to
98% because at least in chromium this avoids it getting unnecessary
horizontal scrollbars when a sidebar is displayed next to it.
http://bzed.de/posts/2010/05/new_css_for_bzed.de/
smcv: [10:59:01] is the logical thing you want a <div> whose meaning is "the bits the sidebar is allowed to accompany"?
bzed: [10:59:14] yeah
bzed: [10:59:58] then you could just ensure that this part is as high as the sidebar
smcv: [11:02:44] wrapping a <div> around the sidebar, content and comments seems like the way forward, then
The linktype check was being done on the relativised link target,
but %typedlinks uses the same link targets as %links, so that didn't work.
I think the bug only appeared when tagbase was not set.
This bugfix also let me factor out the common typedlink checking code.
To match calendars, which use local time. Particularly important at
the end of the month.
I checked the history, and there seemed no good rationale for the
pagespecs to use gmtime.
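A small Python illustration of why the choice matters at the end of a month.
The timestamp and the five-hours-behind-UTC zone are made up for the example;
month_of is a hypothetical helper, not part of ikiwiki.

```python
from datetime import datetime, timezone, timedelta

def month_of(timestamp, tz):
    """Return the year-month a timestamp falls in, as seen from tz."""
    return datetime.fromtimestamp(timestamp, tz).strftime("%Y-%m")

# Assumed example zone: five hours behind UTC.
local = timezone(timedelta(hours=-5))

# 22:00 local time on the last day of May is already June 1st in UTC.
ts = datetime(2010, 5, 31, 22, 0, tzinfo=local).timestamp()

local_month = month_of(ts, local)        # what a local-time calendar shows
utc_month = month_of(ts, timezone.utc)   # what a gmtime-based match would see
```

With gmtime, a page created late on May 31st local time would be matched as
a June page, while the calendar files it under May.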
Problem is that by the time rendering calls render_dependent, %pagesources
has had deleted files removed from it. So match_comment's lookup of
files in there to see if they had the _comment extension failed.
I had to introduce a hash that temporarily holds filenames of deleted pages
to fix this.
Note that unlike comment(), internal() had avoided this pitfall by being
defined to match both internal and non-internal pages.
* openid: Incorporated a fancy openid-selector signin form.
(http://code.google.com/p/openid-selector/)
* openid: Use "openid_identifier" as the form field, as required
by OpenID Authentication v2.0 spec.
test isinternal first, because match_glob with internal => 1 also returns
non-internal pages that match. This order should also be faster.
Remove test to see if pagesources is set. isinternal will not succeed if it
is not.
* comments: Comments pending moderation are now stored in the srcdir
alongside accepted comments, but with a `._comment_pending` extension.
* This allows easier by-hand moderation, as the "_pending" need
only be stripped off and the comment be committed to version control.
* The `comment_pending()` pagespec can be used to match such unmoderated
comments, which makes it easy to add a feed of them, or a counter
indicating how many there are.
* Belatedly added a `comment()` pagespec.
Turns out that users with a modified page.tmpl need to modify it on
upgrade, at least to add the FORCEBASEURL (so edit preview works),
so there is no point in trying to retain compatibility.
* Removed misc.tmpl. Now to theme ikiwiki, you only need to customise
a single template, page.tmpl.
* misc.tmpl will, however, still be read if a locally modified version
exists. This is to avoid forcing users to update page.tmpl right now.
This is a first pass; it avoids needing to change style.css
except where it refers to tag types.
This goes a bit off the rails at the pageheader with its nested header.
Semantically, there should be an article around the whole page
header, content, and footer. Just as there will be an article around a
whole comment or inlined page header, content, and footer.
But that will mean changing the css that currently refers to pageheader to
refer to the enclosing article instead.
* Ikiwiki can be configured to generate html5 instead of the default xhtml
1.0. The html5 output mode is experimental, not yet fully standards
compliant, and will be subject to rapid change.
Needed to handle the move of the .js files into ikiwiki/, but also this is
a longstanding bug.
Old pagemtime is not remembered in rebuild mode, and changing that would
need a lot of changes. So instead, loop on pagectime, which is remembered.
Change to remembering old pagesources info in rebuild mode. This seems safe
enough.
This is a slow implementation; it runs svn log once per file
still, rather than running svn log once on the whole srcdir.
I did it this way because in my experience, svn log, run on a directory,
does not always list every change to files inside that directory.
I don't know why, and I use svn as little as possible these days.
* Automatically run --gettime the first time ikiwiki is run on
a given srcdir.
* Optimise --gettime for git, so it's appropriately screamingly
fast. (This could be done for other backends too.)
* However, --gettime for git no longer follows renames.
* Use above to fix up timestamps on docwiki, as well as ensure that
timestamps on basewiki files shipped in the deb are sane.
* Rename --getctime to --gettime. (The old name still works for
backwards compatibility.)
* --gettime now also looks up last modification time.
* Add rcs_getmtime to plugin API; currently only implemented
for git.
* pagestats: Class parameter can be used to override default class for
custom styling.
* pagestats: Use style=list to get a list of tags, scaled by use like
in a tag cloud. This is useful to put in a sidebar.
* Rework example blog front page.
The meta title data set by comments needs to be encoded the same way that
meta encodes it. (NB The security implications of the missing encoding
are small.)
Note that meta's encoding of title, description, and guid data, and not
other data, is probably a special case that should be removed. Instead,
these values should be encoded when used. I have avoided doing so here
because that would mean forcing a wiki rebuild on upgrade to have the data
consistently encoded.
For a while, I was avoiding capitalizing ikiwiki at the beginning of a
sentence. I now think that's a bad idea (unless explicitly referring to
the `ikiwiki` command). Still, I don't go all the way and always cap it,
as a proper noun. That would make the logo look bad. ;)
I also tend to avoid capping it as IkiWiki, except when referring to the
perl internals, which do use that capitalization. (Too late to change
that.) However, it's also reasonable to do so in a WikiLink, as a nod to
historical camelcase wikis.
Colons are not allowed at the start of urls, because the text before a colon
can be interpreted as a protocol, and allowing arbitrary protocols can be unsafe
(CVE-2008-0809). However, this check was too restrictive, not allowing
use of eg, "video.ogv?t=0:03:00/0:04:00" to seek to a given place in a
video, or "somecgi?foo=bar:baz" to pass parameters with colons.
It's still not allowed to have a filename with a colon in it (ie
"foo:bar.png") -- to link to such a file, a fully qualified url must be
used.
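One way to express the relaxed rule, as a Python sketch. This is an
assumption about the shape of the check, not ikiwiki's actual regex: a colon
is rejected only if it appears before the first "/", "?" or "#". It covers
relative links only; allowing known protocols in fully qualified urls is a
separate check not shown here.

```python
def colon_is_safe(url):
    """Return False only when a colon appears before the first '/', '?'
    or '#', where it could be read as a protocol separator."""
    for ch in url:
        if ch in "/?#":
            return True   # any later colon is inside a path, query, or fragment
        if ch == ":":
            return False  # looks like "protocol:..."
    return True
```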
Since version 3, Firefox has done aggressive caching of visited pages, and
does not, by default, check if the cached content is still valid when
reloading or revisiting a page. By default, Firefox seems to not re-contact
the web server at all. Compare with eg, Epiphany and Chromium, which appear
to always check, and get back a 304 when the page is unchanged.
This header makes Firefox do the right thing, at least for html files. It
still over-caches if css, javascript, images, etc, are changed.
The info is stored in the session database, not the user database.
There should be no reason to need it when a user is not logged in.
Also, hide the email field in the preferences page for openid users.
Note that the email and username are not yet actually used for anything.
The email will be useful for gravatar, while the username might be used
for a more pretty display of the openid.
* moderatedcomments: Added moderate_pagespec that can be used
to control which users or comment locations are moderated.
This can be used, just for example, to moderate http://myopenid.com/*
if you're getting a lot of spammers from one particular openid
provider (who should perhaps answer your emails about them),
while not moderating other users.
* moderatedcomments: The moderate_users setting is deprecated. Instead,
set moderate_pagespec to "!admin()" or "user(*)".
This prevented comments containing some utf-8, including euro sign, from
being submitted. Since md5_hex is a C implementation, the string has to be
converted from perl's internal encoding to utf-8 when it is called. Some
utf-8 happened to work before, apparently by accident.
Note that this will change the checksums returned.
unique_comment_location is only used when posting comments, so the checksum
does not need to be stable there.
I only changed page_to_id for completeness; it is passed a comment page
name, and they can currently never contain utf-8.
In teximg, the bug could perhaps be triggered if the tex source contained
utf-8. If that happens, the checksum will change, and some extra work might
be performed on upgrade to rebuild the image.
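The same encode-before-hashing step, shown in Python for illustration (the
actual fix is in Perl, and md5_hex_text is a hypothetical name). Python's
hashlib refuses unencoded text with a TypeError, so the conversion to utf-8
bytes has to be explicit.

```python
import hashlib

def md5_hex_text(text):
    """Hash a text string by first converting it to utf-8 bytes."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

The checksum depends on the encoded bytes, which is why checksums of
strings containing non-ascii characters change once the encoding is fixed.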
This was not doable before, but when I added transitive dependency handling
in the big dependency rewrite, it became possible to include a comment
count when inlining.
This also improves the action link when a page has no comments. It will
link direct to the cgi to allow posting the first comment. And if the page
is locked to prevent posting new comments, the link is no longer shown.
Now that openiduser is in IkiWiki core, it's ok to have passwordauth check
for it, and avoid displaying useless password fields when showing
preferences for an openid.
Also improved the styling of the display of the openid in the preferences
page.
Consider a template like:
[[!template type=note text="""
[[!inline pages="*foo*"]]
"""]]
The text parameter is htmlized before being passed into the template (in
case the template wraps it in a <span> that prevents markdown from
htmlizing it later).
But, when markdown sees "*foo*", it turns that into <em>foo</em>.
Later, when preprocessing the inline directive, that leads to surprising
results.
To fix this, I made template parameters be preprocessed (and filtered)
before being htmlized.
Note that I left in the preprocessing (and filtering) of the template
output at the end. That's still relevant when the template itself contains
preprocessor directives.
Note that there is an associated po4a warning when a page is empty:
Use of uninitialized value $file in substitution (s///) at /usr/share/perl5/Locale/Po4a/Text.pm line 205.
I've filed a bug with po4a about that, but the important thing is fixing
the crash here.
The new git-notes feature in git 1.6.6 changes git log output in a way that
broke ikiwiki's parser if notes are added to commits.
I decided to deal with this by disabling notes when ikiwiki uses git,
by setting GIT_NOTES_REF="". AFAICS, looking up notes when dumping logs
will only waste time, since it does not currently seem to make sense for
ikiwiki to do anything with the notes.
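A Python sketch of running git with notes disabled via the environment, as
described above. The helper name git_env is hypothetical; only the
GIT_NOTES_REF variable comes from the text.

```python
import os

def git_env(base=None):
    """Return a copy of the environment with git notes lookup disabled."""
    env = dict(os.environ if base is None else base)
    env["GIT_NOTES_REF"] = ""  # empty value: no notes are looked up or shown
    return env

# Possible usage (not run here):
#   subprocess.run(["git", "log"], env=git_env(), capture_output=True)
```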
I noticed that chromium was not hyperlinking the areas in the object-based
linkmap, while img works ok. Dunno why, but img based is nicer anyway since
it is allowed right through the htmlscrubber with no workarounds.
On second thought, you might want a wide-open wiki with some locked
pages that cannot be edited online.
So, the right thing for lockedit to do when there are no auth plugins is
to just say the page cannot be edited.
Problem here was that no charset http header was being sent.
I fixed this globally by making cgi_custom_failure send the header.
Required changing its parameters.
The crux of the problem is that the cgi object has raw values not converted
to utf-8, and rename was using its fields. Also fixed a missed place where
the form object did not get its fields utf-8 encoded.
I noticed the onload hook running twice sometimes when using chromium.
Changing from arguments.callee.done to an onload_done variable fixed it.
I guess that the callee differed in chromium.
Probably the cause of the problem is that chrome supports both
window.onload and document.addEventListener.
Speedup of about 25% for small inlines; could be much larger for inlines of
many, or complex, pages.
Not bloating memory with excessive memoization data was the key to this.
The method chosen does not squeeze out every erg of speed possible when
inlines are nested, but that's rare. It uses less memory than other
optimisation hacks (I'm looking at you,
f937c1fb80 !) already used in inline.pm.
Unlike generic meta foo tags, meta description is known to be safe, so can
be special cased to be allowed despite the html scrubber. This makes meta
description much more useful, since it is otherwise limited to being used
by other plugins like map.
My experience is that when inlines are nested, the old behavior of
generating feeds for the nested inlines was never really desired. Since the
feeds were numbered sequentially, the numbers could easily change, and it did
not make sense to subscribe to or use those feeds. And generating those nested
feeds often meant a lot of unnecessary calculation, and data being written.
So, I dropped them.
Looking back, nested feeds originally were a free side effect of properly
handling multiple feeds on one page. Of course, that is still supported.
I chose not to have it override style.css, because style.css is not really
intended to be edited; the one from the underlay is intended to be used as
a base that local.css overrides.
I chose to use a plugin rather than changing the default behavior, both
because I didn't want to have to worry about possibly breaking backwards
compatibility (though this seems unlikely), and because it seemed cleaner
to not include style template parameters in the main page template code.
I suppose someone might want a way to not override the toplevel
local.css, but instead include it as well as foo/local.css. Probably the
best way to do that would be to have foo/local.css @import ../local.css
(modulo browser compatibility issues). Alternatively, edit page.tmpl
to always include the toplevel local.css, or swap out this plugin for
another one.
When redirecting to a page, ie, after editing, ensure that the url is
uri-encoded. Most browsers other than MSIE don't care, but it's the right
thing to do.
The known failure case involved editing a page that had utf-8 in the name
using MSIE.
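For illustration, uri-encoding a redirect path that contains utf-8, sketched
in Python with the standard library. The function name and example path are
made up; ikiwiki's own code is Perl and is not shown.

```python
from urllib.parse import quote

def encode_redirect_path(path):
    """Percent-encode a url path as utf-8, leaving '/' separators alone."""
    return quote(path, safe="/")

encoded = encode_redirect_path("/wiki/caf\u00e9/")
```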
Before, the htmllink would display the link to the template as if it were a
wikilink, but what was stored was not, which could lead to confusing
situations.
This will be a bit more expensive, but --getctime does not need to be fast.
And getting the real creation time is very useful when untangling blog
histories that involve renames.