The simple implementation of this, which I'd prefer to use, would be:
if we can import LWPx::ParanoidAgent, use it; otherwise, use
LWP::UserAgent.
However, aggregate has historically worked with proxies, and
LWPx::ParanoidAgent quite reasonably refuses to work with proxies
(because it can't know whether those proxies are going to do the same
filtering that LWPx::ParanoidAgent would).
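A minimal sketch of that preferred fallback (illustrative only; as noted
above, the real code also has to decide what to do when proxies are
configured):

    my $ua;
    eval {
        # Use the paranoid agent if it is available...
        require LWPx::ParanoidAgent;
        $ua = LWPx::ParanoidAgent->new;
    };
    if (! defined $ua) {
        # ...otherwise fall back to the ordinary user agent.
        require LWP::UserAgent;
        $ua = LWP::UserAgent->new;
    }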
Signed-off-by: Simon McVittie <smcv@debian.org>
This prevents the aggregate plugin from being used to read the contents
of local files via file:/// URLs.
Signed-off-by: Simon McVittie <smcv@debian.org>
I'm not sure what happened here, but it seems to have been the wrong
thing. Whitelist what we want to *not* encode instead.
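Roughly the shape of the whitelist approach, as a sketch (the character
class shown is illustrative, not necessarily the one the code ends up
using):

    use URI::Escape qw(uri_escape_utf8);

    # Escape everything *except* an explicit whitelist of safe characters.
    my $escaped = uri_escape_utf8($text, '^-A-Za-z0-9_.~');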
Signed-off-by: Simon McVittie <smcv@debian.org>
This is a minimal version of what we should in principle do here,
which is to escape the error message in whatever way is correct for
embedding plain text in the surrounding wiki markup language.
This implementation approximates that by assuming that HTML entities,
alphanumerics and common punctuation characters pass through the
markup language unaltered; the punctuation characters it passes
through might still be misinterpreted by some markup languages,
though.
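A sketch of that approximation (the whitelist here is illustrative, not
the exact one used): pass alphanumerics and common punctuation through
untouched, and turn everything else into an HTML numeric entity so it
cannot be taken as markup.

    sub escape_for_markup {
        my $msg = shift;
        # Whitelist (illustrative): alphanumerics, space and common
        # punctuation, including & # ; so existing entities survive.
        $msg =~ s{([^-A-Za-z0-9 _.,:;!?()'"&#/])}{sprintf('&#%d;', ord($1))}eg;
        return $msg;
    }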
Signed-off-by: Simon McVittie <smcv@debian.org>
* emailauth: Fix cookie problem when user is on https and the cgiurl
uses http, by making the emailed login link use https.
* passwordauth: Use https for emailed password reset link when user
is on https.
Not entirely happy with this approach, but I don't currently see a
better one.
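Roughly the shape of the approach, as a sketch (variable and environment
details are illustrative): if the request came in over https but the
configured cgiurl is http, upgrade the scheme of the link that goes into
the email.

    my $cgiurl = $config{cgiurl};
    if ($ENV{HTTPS} || ($ENV{SERVER_PORT} // '') eq '443') {
        # The user reached us over https, so send an https link
        # even though cgiurl says http.
        $cgiurl =~ s/^http:/https:/i;
    }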
I have not verified that the passwordauth change fixes any problem,
other than the user getting an http link when they were using https.
The emailauth problem is verified fixed by this commit.
This commit was sponsored by Michael Magin.
The instance in cgierror() is a potential cross-site scripting attack,
because an attacker could conceivably cause some module to raise an
exception that includes attacker-supplied HTML in its message, for
example via a crafted filename. (OVE-20160505-0012)
The instances in preprocess() are just a correctness fix. They are not
a cross-site scripting attack, because an attacker could equally well
write the desired HTML themselves; the sanitize hook is what
protects us from cross-site scripting here.
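The cgierror() fix amounts to HTML-escaping the message before it is
interpolated into the error page, along these lines (a sketch, not the
exact code):

    use HTML::Entities qw(encode_entities);

    # Escape the possibly attacker-influenced message before embedding
    # it in the HTML error page.
    my $safe = encode_entities($error, '<>&"');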
This avoids nasty surprises on upgrade if a site is using httpauth,
or passwordauth with an account_creation_password, and relying on
only a select group of users being able to edit the site. We can revisit
this for ikiwiki 4.
IkiWiki::cgiurl() currently produces non-deterministic output, because
the keys of the params hash can be iterated in a different order on
each run.
Sorting the keys of params before building the query string makes the
output deterministic.
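A sketch of the idea (not the exact cgiurl() code): iterate the keys in
sorted order when building the query string, so the same %params always
produce the same URL.

    my $query = join '&',
        map { "$_=$params{$_}" } sort keys %params;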
This was needed due to emailauth, but I've also wrapped all IP address
exposure in cloak(), although the function doesn't yet cloak IP addresses.
(One IP address I didn't cloak is the one that appears on the password
reset email template. That is expected to be the user's own IP address,
so ok to show it to them.)
Thanks to smcv for the pointer to
http://xmlns.com/foaf/spec/#term_mbox_sha1sum
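If cloak() does eventually learn to cloak IP addresses, the mbox_sha1sum
idea suggests publishing only a hash of the address rather than the
address itself. A hypothetical sketch of what that could look like for
IP addresses, which (as noted above) cloak() does not yet do:

    use Digest::SHA qw(sha1_hex);

    sub cloak {
        my $ip = shift;
        return $ip unless defined $ip;
        # Hypothetical: expose a hash of the IP, by analogy with
        # foaf:mbox_sha1sum.
        return sha1_hex('ip:' . $ip);
    }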
Mobile browsers typically assume that arbitrary web pages are
designed for a "desktop-sized" browser window (around 1000px)
and display that layout, zoomed out, in order to avoid breaking
naive designs that assume nobody will ever look at a website on
a phone or something. People who are actually doing "responsive
design" need to opt in (via the viewport meta element) to mobile
browsers rendering the page at a more normal size.
According to caniuse.com, a significant fraction of Web users are
still using Internet Explorer versions that do not support HTML5
sectioning elements. However, claiming we're XHTML 1.0 Strict
means we can't use features invented in the last 12 years, even if
they degrade gracefully in older browsers (like the role and placeholder
attributes).
This means our output is no longer valid according to any particular
DTD. Real browsers and other non-validator user-agents have never
cared about DTD compliance anyway, so I don't think this is a real loss.
It appears that both the open-source and proprietary rulesets for
ModSecurity default to blacklisting requests that say they are
from libwww-perl, presumably because some script kiddies use libwww-perl
and are too inept to set a User-Agent that is "too big to blacklist",
like Chrome or the iPhone browser or something. This seems doomed to
failure but whatever.
pagespec_match_list() makes the current page depend on the pagespec
being matched, so if you use [[!trailoptions sort="..."]] to force
a sort order, the trail ends up depending on internal(*) and is
rebuilt whenever anything changes. Add a new sort_pages() and use that
instead.
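The underlying point: sorting a list of pages we already have does not
need to record any dependency, whereas matching a pagespec does. An
illustration of the difference, not the real sort_pages() code (the
list and the comparison are made up for the example):

    # Plain sorting of an already-known list records no dependency,
    # unlike pagespec_match_list().
    my @sorted = sort {
        ($pagemtime{$b} // 0) <=> ($pagemtime{$a} // 0)
    } @trail_pages;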
In the scan phase, it's too early to match pagespecs or sort pages;
in the render phase, both of those are OK.
It would be possible to add phases later, renumbering them if necessary
to maintain numerical order.
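Roughly the shape of it, as a sketch (names follow the description
above; the real definitions may differ in detail):

    use constant PHASE_SCAN   => 0;
    use constant PHASE_RENDER => 1;

    our $phase = PHASE_SCAN;

    # Anything that is only safe once rendering has started can then check:
    # error("tried to match a pagespec during the scan phase")
    #     if $phase < PHASE_RENDER;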
Whenever I look at dependency calculation, it takes me a while to get my
head round the concept of influences. If what I've written here is
accurate, maybe the next person to look at this (or my future self)
will need less of a run-up.
Package: ikiwiki
Version: 3.20140125
Severity: wishlist
By default, LWP::UserAgent used by IkiWiki to perform outbound HTTP
requests sends the string "libwww-perl/<version number>" as User-Agent
header in HTTP requests. Some blogging platforms have blacklisted the
user agent and won't serve any content for clients using this user agent
string. With the IkiWiki configuration option "useragent" it is now
possible to define a custom string to be used as the value of the
User-Agent header.
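A sketch of how the option would be consumed when constructing the user
agent (assuming the setup file has already been loaded into %config):

    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    # Override the default "libwww-perl/<version>" identifier if the
    # wiki's setup defines a useragent string.
    $ua->agent($config{useragent}) if defined $config{useragent};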
I have benchmarked the pagename() call that this avoids as taking up
to 2 seconds during loadindex in a large wiki. The total loadindex for
that wiki was 6.46s, so this is a significant improvement.
Even on a smaller site, this reduces the refresh time from 1.69 to 1.52
seconds.
The only breakage risk here is that pagename() can change the page name
it calculates due to setup changes. But in the case of a setup change, the
whole site is rebuilt. So the cached page name is not used in that
case.
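One way to picture the optimisation (an illustrative sketch, not the
exact mechanism used in the index): remember the page name computed for
each source file and reuse it, instead of calling pagename() again on
every loadindex.

    my %pagename_of;

    sub cached_pagename {
        my $file = shift;
        # Only call pagename() the first time we see this file.
        return $pagename_of{$file} //= pagename($file);
    }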
I *think* this is ok, at least it results in close to the same index being
saved as before. The difference is that plugins that have a pagestate of {}
have that recorded this way, while with the tight loop, the key for the
plugin is not copied in that case. I cannot see how this could matter.