The meta title data set by comments needs to be encoded the same way that
meta encodes it. (NB The security implications of the missing encoding
are small.)
Note that meta's encoding of title, description, and guid data, and not
other data, is probably a special case that should be removed. Instead,
these values should be encoded when used. I have avoided doing so here
because that would mean forcing a wiki rebuild on upgrade to have the data
consitently encoded.
This prevented comments containing some utf-8, including euro sign, from
being submitted. Since md5_hex is a C implementation, the string has to be
converted from perl's internal encoding to utf-8 when it is called. Some
utf-8 happened to work before, apparently by accident.
Note that this will change the checksums returned.
unique_comment_location is only used when posting comments, so the checksum
does not need to be stable there.
I only changed page_to_id for completeness; it is passed a comment page
name, and they can currently never contain utf-8.
In teximg, the bug could perhaps be triggered if the tex source contained
utf-8. If that happens, the checksum will change, and some extra work might
be performed on upgrade to rebuild the image.
This was not doable before, but when I added transitive dependency handling
in the big dependency rewrite, it became possible to include a comment
count when inlining.
This also improves the action link when a page has no comments. It will
link direct to the cgi to allow posting the first comment. And if the page
is locked to prevent posting new comments, the link is no longer shown.
Rationalle: Comments need to be user-editable so that they can be posted
via git commit etc.
The _comment directive is still supported, for back-compat.
The munged ids were looking pretty nasty, and were not completly guaranteed
to be unique. So a md5sum seems like a better approach. (Would have used
sha1, but md5 is in perl core.)
Well, that was a PITA.
Luckily, this doesn't break guids to comments in rss feeds,
though it does change the links.
I haven't put in a warning about needing to rebuild to get
this fix. It's probably good enough for new comments to get the
fix, without a lot of mass rebuilding.
It was calling format hooks for each comment on the page.
When relativedate is enabled, that made it insert <script> tags
for each comment. And the browser loaded the same script over and over,
which was slow on its own. But that was nothing compared to running
the onload even over and over.. especially since the hook system
added a new call to the hook each time it loaded.
For a page with 10 comments, that caused the relativedate DOM parsing
code to run 1000 times, I think. Anyway, it was sloow. Now it runs once.
People seem to be able to expect to enter www.foo.com and get away with it.
The resulting my.wiki/www.foo.com link was not ideal.
To fix it, use URI::Heuristic to expand such things into a real url. It
even looks up hostnames in the DNS if necessary.
None of the comment state needs to be stored through the a later run of
ikiwiki, so move it all from pagestate to a more transient storage.
This is assuming that we'll never want to add pagespecs to search against
the comment state. Pagespecs like author() are why the meta plugin does
store its meta data in pagestate -- the data can be needed later to match
against.