ikiwiki/doc/todo/darcs.mdwn

552 lines
18 KiB
Markdown

Here's Thomas Schwinge unfinished darcs support for ikiwiki.
(Finishing this has been suggested as a [[soc]] project.)
> I haven't been working on this for months and also won't in the near
> future. Feel free to use what I have done so
> far and bring it into an usable state! Also, feel free to contact me
> if there are questions.
-- [Thomas Schwinge](mailto:tschwinge@gnu.org)
[[!toggle text="show"]]
[[!toggleable text="""
# Support for the darcs rcs, <URL:http://darcs.net/>.
# Copyright (C) 2006 Thomas Schwinge <tschwinge@gnu.org>
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
# We're guaranteed to be the only instance of ikiwiki running at a given
# time. It is essential that only ikiwiki is working on a particular
# repository. That means one instance of ikiwiki and it also means that
# you must not `darcs push' into this repository, as this might create
# race conditions, as I understand it.
use warnings;
use strict;
use IkiWiki;
package IkiWiki;
# Which darcs executable to use.
my $darcs = ($ENV{DARCS} or 'darcs');
# Internal functions.
sub darcs_info ($$$) {
my $field = shift;
my $repodir = shift;
my $file = shift; # Relative to the repodir.
my $child = open(DARCS_CHANGES, "-|");
if (! $child) {
exec($darcs, 'changes', '--repo=' . $repodir, '--xml-output', $file) or
error('failed to run `darcs changes\'');
}
# Brute force for now. :-/
while (<DARCS_CHANGES>) {
last if /^<\/created_as>$/;
}
($_) = <DARCS_CHANGES> =~ /$field=\'([^\']+)/;
$field eq 'hash' and s/\.gz//; # Strip away the `.gz' from `hash'es.
close(DARCS_CHANGES) or error('`darcs changes\' exited ' . $?);
return $_;
}
# Exported functions.
sub rcs_update () {
# Not needed.
}
sub rcs_prepedit ($) {
# Prepares to edit a file under revision control. Returns a token that
# must be passed to rcs_commit() when the file is to be commited. For us,
# this token the hash value of the latest patch that modifies the file,
# i.e. something like its current revision. If the file is not yet added
# to the repository, we return TODO: the empty string.
my $file = shift; # Relative to the repodir.
my $hash = darcs_info('hash', $config{srcdir}, $file);
return defined $hash ? $hash : "";
}
sub rcs_commit ($$$) {
# Commit the page. Returns `undef' on success and a version of the page
# with conflict markers on failure.
my $file = shift; # Relative to the repodir.
my $message = shift;
my $rcstoken = shift;
# Compute if the ``revision'' of $file changed.
my $changed = darcs_info('hash', $config{srcdir}, $file) ne $rcstoken;
# Yes, the following is a bit convoluted.
if ($changed) {
# TODO. Invent a better, non-conflicting name.
rename("$config{srcdir}/$file", "$config{srcdir}/$file.save") or
error("failed to rename $file to $file.save: $!");
# Roll the repository back to $rcstoken.
# TODO. Can we be sure that no changes are lost? I think that
# we can, if we make sure that the `darcs push' below will always
# succeed.
# We need to revert everything as `darcs obliterate' might choke
# otherwise.
# TODO: `yes | ...' needed? Doesn't seem so.
system($darcs, "revert", "--repodir=" . $config{srcdir}, "--all") and
error("`darcs revert' failed");
# Remove all patches starting at $rcstoken.
# TODO. Something like `yes | darcs obliterate ...' seems to be needed.
system($darcs, "obliterate", "--quiet", "--repodir" . $config{srcdir},
"--match", "hash " . $rcstoken) and
error("`darcs obliterate' failed");
# Restore the $rcstoken one.
system($darcs, "pull", "--quiet", "--repodir=" . $config{srcdir},
"--match", "hash " . $rcstoken, "--all") and
error("`darcs pull' failed");
# We're back at $rcstoken. Re-install the modified file.
rename("$config{srcdir}/$file.save", "$config{srcdir}/$file") or
error("failed to rename $file.save to $file: $!");
}
# Record the changes.
# TODO: What if $message is empty?
writefile("$file.log", $config{srcdir}, $message);
system($darcs, 'record', '--repodir=' . $config{srcdir}, '--all',
'--logfile=' . "$config{srcdir}/$file.log",
'--author=' . 'web commit <web-hurd@gnu.org>', $file) and
error('`darcs record\' failed');
# Update the repository by pulling from the default repository, which is
# master repository.
system($darcs, "pull", "--quiet", "--repodir=" . $config{srcdir},
"--all") and error("`darcs pull' failed\n");
# If this updating yields any conflicts, we'll record them now to resolve
# them. If nothing is recorded, there are no conflicts.
$rcstoken = darcs_info('hash', $config{srcdir}, $file);
# TODO: Use only the first line here, i.e. only the patch name?
writefile("$file.log", $config{srcdir}, 'resolve conflicts: ' . $message);
system($darcs, 'record', '--repodir=' . $config{srcdir}, '--all',
'--logfile=' . "$config{srcdir}/$file.log",
'--author=' . 'web commit <web-hurd@gnu.org>', $file) and
error('`darcs record\' failed');
my $conflicts = darcs_info('hash', $config{srcdir}, $file) ne $rcstoken;
unlink("$config{srcdir}/$file.log") or
error("failed to remove `$file.log'");
# Push the changes to the main repository.
system($darcs, 'push', '--quiet', '--repodir=' . $config{srcdir}, '--all')
and error('`darcs push\' failed');
# TODO: darcs send?
if ($conflicts) {
my $document = readfile("$config{srcdir}/$file");
# Try to leave everything in a consistent state.
# TODO: `yes | ...' needed? Doesn't seem so.
system($darcs, "revert", "--repodir=" . $config{srcdir}, "--all") and
warn("`darcs revert' failed.\n");
return $document;
} else {
return undef;
}
}
sub rcs_add ($) {
my $file = shift; # Relative to the repodir.
# Intermediate directories will be added automagically.
system($darcs, 'add', '--quiet', '--repodir=' . $config{srcdir},
'--boring', $file) and error('`darcs add\' failed');
}
sub rcs_recentchanges ($) {
warn('rcs_recentchanges() is not implemented');
return 'rcs_recentchanges() is not implemented';
}
sub rcs_notify () {
warn('rcs_notify() is not implemented');
}
sub rcs_getctime () {
warn('rcs_getctime() is not implemented');
}
1
"""]]
This is my ([bma](bma@bmalee.eu)) darcs.pm - it's messy (my Perl isn't up to much) but seems to work. It uses just one repo, like the mercurial plugin (unlike the above version, which AIUI uses two).
`rcs_commit()` uses backticks instead of `system()`, to prevent darcs' output being sent to the browser and mucking with the HTTP headers (`darcs record` has no --quiet option). And `rcs_recentchanges()` uses regexes rather than parsing darcs' XML output.
[[!toggle text="show" id="bma"]]
[[!toggleable id="bma" text="""
#!/usr/bin/perl
use warnings;
use strict;
use IkiWiki;
use Date::Parse;
use open qw{:utf8 :std};
package IkiWiki;
sub rcs_update () {
# Do nothing - there's nowhere to update *from*.
}
sub rcs_prepedit ($) {
}
sub rcs_commit ($$$;$$) {
my ($file, $message, $rcstoken, $user, $ipaddr) = @_;
# $user should probably be a name and an email address, by darcs
# convention.
if (defined $user) {
$user = possibly_foolish_untaint($user);
}
elsif (defined $ipaddr) {
$user = "Anonymous from $ipaddr";
}
else {
$user = "Anonymous";
}
$message = possibly_foolish_untaint($message);
# BUG: this outputs one line of text, and there's not a -q or --quiet
# option. Redirecting output to /dev/null works, but I still get the
# HTTP status and location headers displayed in the browser - is that
# darcs' fault or ikiwiki's?
# Doing it in backticks *works*, but I'm sure it could be done better.
my @cmdline = ("darcs", "record", "--repodir", "$config{srcdir}",
"-a", "-m", "$message", "--author", "$user", $file);
`darcs record --repodir "$config{srcdir}" -a -m "$message" --author "$user" $file`; # Return value? Output? Who needs 'em?
#if (system(@cmdline) != 0) {
# warn "'@cmdline' failed: $!";
#}
return undef; # success
sub rcs_add ($) {
my ($file) = @_;
my @cmdline = ("darcs", "add", "--repodir", "$config{srcdir}", "-a", "-q", "$file");
if (system(@cmdline) != 0) {
warn "'@cmdline' failed: $!";
}
}
sub rcs_recentchanges ($) {
# TODO: This is horrible code. It doesn't work perfectly, and uses regexes
# rather than parsing Darcs' XML output.
my $num=shift;
my @ret;
return unless -d "$config{srcdir}/_darcs";
my $changelog = `darcs changes --xml --summary --repodir "$config{srcdir}"`;
$changelog = join("", split(/\s*\n\s*/, $changelog));
my @changes = split(/<\/patch>.*?<patch/m, $changelog);
foreach my $change (@changes) {
$change =~ m/hash='(.*?)'/;
my $rev = $1;
$change =~ m/author='(.*?)'/;
my $user = $1."\n";
my $committype = "web";
if($user =~ m/&lt;/) {
# Author fields generated by darcs include an email address: look for the "<".
$committype = "darcs";
use HTML::Entities;
$user = decode_entities $user;
}
$change =~ m/local_date='(.*?)'/;
my $when = $1;
$when=time - str2time($when, 'UTC');
$change =~ m/<name>(.*?)<\/name>/g;
my @message = {line => $1};
foreach my $match ($change =~ m/<comment>(.*?)<\/comment>/gm) {
push @message, {line => $1};
}
my @pages;
foreach my $match ($change =~ m/<.*?_(file|directory)>(.*?)(<(added|removed)_lines.*\/>)*<\/.*?_(file|directory)>/g) {
# My perl-fu is weak. I'm probably going about this all wrong, anyway.
push @pages, {page => pagename($match)} if ( -f $config{srcdir}."/".$match || -d $config{srcdir}."/".$match) and not $match =~ m/^$/;
}
push @ret, { rev => $rev,
user => $user,
committype => $committype,
when => $when,
message => [@message],
pages => [@pages],
}
}
return @ret;
}
sub rcs_notify () {
# TODO
}
sub rcs_getctime ($) {
error gettext("getctime not implemented");
}
1
"""]]
---
Well, here's my version too. It only does getctime -- using a real XML parser, instead of regexp ugliness -- and maybe recentchanges, but that may be bitrotted, or maybe I never finished it, as I only need the getctime. As for actual commits, I have previously voiced my opinion, that this should be done by the plugin generating a patch bundle, and forwarding it to darcs in some way (`darcs apply` or even email to another host, possibly moderated), instead of the hacky direct modification of a working copy. It could also be faster to getctime in a batch. Just reading in all the changes the first time they're needed, might not be a big improvement in many cases, but if we got a batch request from ikiwiki, we could keep reaing the changes until all the files in this batch request have been met. --[[tuomov]]
[[!toggle text="show" id="tuomov"]]
[[!toggleable id="tuomov" text="""
<pre>
#!/usr/bin/perl
# Stubs for no revision control.
use warnings;
use strict;
use IkiWiki;
package IkiWiki;
sub rcs_update () {
}
sub rcs_prepedit ($) {
return ""
}
sub rcs_commit ($$$) {
return undef # success
}
sub rcs_add ($) {
}
sub rcs_recentchanges ($) {
my $num=shift;
my @ret;
eval q{use Date::Parse};
eval q{use XML::Simple};
my $repodir=$config{srcdir};
if (-d "$config{srcdir}/_darcs") {
my $child = open(LOG, "-|");
if (! $child) {
exec("darcs", "changes", "--xml",
"--repodir", "$repodir",
"--last", "$num")
|| error("darcs changes failed to run");
}
my $data=<LOG>;
close LOG;
my $log = XMLin($data, ForceArray => 1);
foreach my $patch ($log->{patch}) {
my $date=$patch->{local_date};
my $hash=$patch->{hash};
my $when=concise(ago(time - str2time($date)));
my @pages;
my $child = open(SUMMARY, "-|");
if (! $child) {
exec("darcs", "annotate", "-s", "--xml",
"--match", "hash: $hash",
"--repodir", "$repodir")
|| error("darcs annotate failed to run");
}
my $data=<SUMMARY>;
close SUMMARY;
my $summary = XMLin("<lame>$data</lame>", ForceArray => 1);
# TODO: find @pages
push @ret, {
#rev => $rev,
user => $patch->{author},
#committype => $committype,
when => $when,
#message => [@message],
pages => [@pages],
}; # if @pages;
return @ret if @ret >= $num;
}
}
return @ret;
}
sub rcs_notify () {
}
sub rcs_getctime ($) {
my $file=shift;
eval q{use Date::Parse};
eval q{use XML::Simple};
local $/=undef;
# Sigh... doing things the hard way again
my $repodir=$config{srcdir};
my $filer=substr($file, length($repodir));
$filer =~ s:^[/]+::;
my $child = open(LOG, "-|");
if (! $child) {
exec("darcs", "changes", "--xml", "--reverse",
"--repodir", "$repodir", "$filer")
|| error("darcs changes $filer failed to run");
}
my $data=<LOG>;
close LOG;
my $log = XMLin($data, ForceArray => 1);
my $datestr=$log->{patch}[0]->{local_date};
if (! defined $datestr) {
warn "failed to get ctime for $filer";
return 0;
}
my $date=str2time($datestr);
debug("found ctime ".localtime($date)." for $file");
return $date;
}
1
</pre>
"""]]
---
I merged the two versions above and made some fixes; it is recording my web edits in darcs and showing a recent changes page.
It is in a [darcs repository](http://joyful.com/darcsweb/darcsweb.cgi?r=ikiwiki-darcs), please send patches. --[[Simon_Michael]]
> I'd like to see at least the following fixed before I commit this: --[[Joey]]
> * Running `darcs record $filename` in backticks is not good (security)
> The thing to do is to open stdout to /dev/null before execing darcs.
> * Get `rcs_recentchanges_xml` working, parsing xml with regexps does
> not seem like a maintenance win.
> * `rcs_notify` should be removed, it's no longer used.
> * Some form of conflict handling. Using darcs to attempt to merge
> the changes is I gusss optional (although every other rcs backend,
> including svn manages to do this), but it needs to at *least* detect
> conflicts and return a page with conflict markers for the user to fix
> the conflict.
I have addressed the recentchanges bit, you can find my hacked up darcs.pm at <http://web.mornfall.net/stuff/web-root/IkiWiki/Rcs/darcs.pm>.
It's got couple of FIXMEs, and a very site-specific filter for recentchanges. Not sure how to do that better though. I will eventually add web commits, probably of my own (and mention it here).
---
And here's yet another one, including an updated `ikiwiki-makerepo`. :)
<http://khjk.org/~pesco/ikiwiki-darcs/> (now a darcs repo)
> Note that there's a 'darcs' branch in git that I'm keeping a copy of your
> code in. Just in case. :-)
I've taken all the good stuff from the above and added the missing hooks. The code hasn't seen a lot of testing, so some bugs are likely yet to surface. Also, I'm not experienced with perl and don't know where I should have used the function `possibly_foolish_untaint`.
Regarding the repository layout: There are two darcs repositories. One is the `srcdir`, the other we'll call `master`.
* HTML is generated from `srcdir`.
* CGI edits happen in `srcdir`.
* The backend pulls updates from `master` into `srcdir`, i.e. darcs commits should happen to `master`.
* `master` calls ikiwiki (through a wrapper) in its apply posthook, i.e. `master/_darcs/prefs/defaults` should look like this:
apply posthook ikiwrap
apply run-posthook
(I'm not sure, should/could it be `ikiwrap --refresh` above?)
* The backend pushes CGI edits from `srcdir` back into `master` (triggering the apply hook).
* The working copies in `srcdir` and `master` should *not* be touched by the user, only by the CGI or darcs, respectively.
> Review of this one:
>
> * Should use tab indentation.
> * `rcs_getctime` should not need to use a ctime cache (such a cache should
> also not be named `.ikiwiki.ctimes`). `rcs_getctime` is run exactly
> once per page, ever, and the data is cached in ikiwiki's index.
> * I doubt that ENV{DARCS} will be available, since the wrapper clobbers> the entire
> environment. I'd say remove that.
> * I don't understand what `darcs_info` is doing, but it seems to be
> parsing xml with a regexp?
> * Looks like `rcs_commit` needs a few improvements, as marked TODO
> * `rcs_remove` just calls "rm"? Does darcs record notice the file was removed
> and automatically commit the removal? (And why `system("rm")` and not
> `unlink`?)
> * Is the the darcs info in [[details]] still up-to-date re this version?
> --[[Joey]]
> Update:
>
> I think I've addressed all of the above except for the XML parsing in `darcs_info`.
> The function determines the md5 hash of the last patch the given file appears in.
> That's indeed being done with regexps but my Perl isn't good enough for a quick recode
> right now.
>
> As for the darcs info in [[rcs/details]], it does not accurately describe the way
> this version works. It's similar, but the details differ slightly.
> You could copy my description above to replace it.
>
> There is still some ironing to do, for instance the current version doesn't allow for
> modifying attachments by re-uploading them via CGI ("darcs add failed"). Am I assuming
> correctly that "adding" a file that's already in the repo should just be a no-op?
> --pesco
>> It should result in the new file contents being committed by
>> `rcs_commit_staged`. For some revision control systems, which
>> automatically commit modifications, it would be a no-op. --[[Joey]]
>>> Done. --pesco
[[!tag patch]]