update docs for darcs

Deleted all the old incomplete implementations.

Moved explanation of the two-repo system currently implemented for darcs
into rcs/details, and removed discussion from there about other methods
(including one-repo).

Wrote a rcs/darcs page, which I hope is accurate.
Joey Hess 2009-04-04 18:04:20 -04:00
parent c93360a751
commit ca5704936d
3 changed files with 29 additions and 594 deletions

doc/rcs/darcs.mdwn 100644

@ -0,0 +1,15 @@
[Darcs](http://darcs.net) is a distributed revision control
system. Ikiwiki supports storing a wiki in a
Darcs repository.
An Ikiwiki wrapper is run by the `posthook` to update a wiki whenever commits
or remote pushes come in. When running as a [[cgi]] with Darcs, ikiwiki
automatically commits edited pages, and uses the Darcs history to generate the
[[RecentChanges]] page.
Example for a `_darcs/prefs/defaults` file in `$SRCDIR`:
apply posthook /path/to/repository/_darcs/ikiwrapper
apply run-posthook
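As a rough sketch (not part of ikiwiki), the two lines above could be installed with a small shell helper. The function name `install_ikiwiki_posthook` and both of its arguments are assumptions made for illustration; substitute your own `$SRCDIR` and wrapper path.

```shell
# Hypothetical helper: append the two ikiwiki posthook lines to the
# defaults file of an existing darcs repository.
#   $1: path to the darcs repository (for example "$SRCDIR")
#   $2: path to the ikiwiki wrapper that should run after each apply
install_ikiwiki_posthook() {
    repo=$1
    wrapper=$2
    defaults="$repo/_darcs/prefs/defaults"

    # Refuse to touch anything that does not look like a darcs repository.
    if [ ! -d "$repo/_darcs/prefs" ]; then
        echo "not a darcs repository: $repo" >&2
        return 1
    fi

    # Leave the file alone if an apply posthook is already configured.
    if [ -f "$defaults" ] && grep -q '^apply posthook ' "$defaults"; then
        return 0
    fi

    printf 'apply posthook %s\napply run-posthook\n' "$wrapper" >> "$defaults"
}
```

For example: `install_ikiwiki_posthook "$SRCDIR" /path/to/repository/_darcs/ikiwrapper`. The helper only appends when no `apply posthook` line exists yet, so running it twice leaves the file unchanged.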
See also [[todo/darcs|todo/darcs]]


@ -32,98 +32,20 @@ You browse and web-edit the wiki on W.
W "belongs" to ikiwiki and should not be edited directly.
## [darcs](http://darcs.net/) (not yet included) ## [[darcs]]
Support for using darcs as a backend is being worked on by [Thomas Regarding the repository layout: There are two darcs repositories. One is the `srcdir`, the other we'll call `master`.
Schwinge](mailto:tschwinge@gnu.org), although development is on hold currently.
There is a patch in [[todo/darcs]].
### How will it work internally? * HTML is generated from `srcdir`.
* CGI edits happen in `srcdir`.
* The backend pulls updates from `master` into `srcdir`, i.e. darcs commits should happen to `master`.
* `master` calls ikiwiki (through a wrapper) in its apply posthook, i.e. `master/_darcs/prefs/defaults` should look like this:
``Master'' repository R1. apply posthook ikiwrap
apply run-posthook
RCS commits from the outside are installed into R1.
HTML is generated from R1. HTML is automatically generated (by using a
``post-hook'') each time a new change is installed into R1. It follows
that rcs_update() is not needed.
There is a working copy of R1: R2.
CGI operates on R2. rcs_commit() will push from R2 to R1.
You browse the wiki on R1 and web-edit it on R2. This means for example
that R2 needs to be updated from R1 if you are going to web-edit a page,
as the user might otherwise be irritated...
How do changes get from R1 to R2? Currently only internally in
rcs\_commit(). Is rcs\_prepedit() suitable?
It follows that the HTML rendering and the CGI handling can be completely
separated parts in ikiwiki.
What repository should [[RecentChanges]] and History work on? R1?
#### Rationale for doing it differently than in the Subversion case
darcs is a distributed RCS, which means that every checkout of a
repository is equal to the repository it was checked-out from. There is
no forced hierarchy.
R1 is nevertheless called the master repository. It's used for
collecting all the changes and publishing them: on the one hand via the
rendered HTML and on the other via the standard darcs RCS interface.
R2, the repository the CGI operates on, is just a checkout of R1 and
doesn't really differ from the other checkouts that people will branch
off from R1.
(To be continued.)
#### Another possible approach
Here's what I (tuomov) think would be a “cleaner” approach:
1. Upon starting to edit, Ikiwiki gets a copy of the page, and `darcs changes --context`.
This context _and_ the present version of the page are stored as the “version” of the
page in a hidden control of the HTML.
Thus the HTML includes all that is needed to generate a patch with respect to the state of the
repository at the time the edit was started. This is of course all that darcs needs.
2. Once the user is done with editing, _Ikiwiki generates a patch bundle_ for darcs.
This should be easy with existing `Text::Diff` or somesuch modules, as the Web edits
only concern single files. The old version of the page is stored in
the HTML (possibly compressed) so that the diff can be generated.
3. Now this patch bundle is applied with `darcs apply`, or sent by email for moderation…
there are many possibilities.
This approach avoids some of the problems of concurrent edits that the previous one may have,
although there may be conflicts, which may or may not propagate to the displayed web page.
(Unfortunately there is no option to `darcs apply` to generate some sort of conflict-resolution
bundle.) Also, only one repository is needed, as it is never directly modified
by Ikiwiki.
This approach might be applicable to other distributed VCSs as well, although they're not as oriented
towards transmitting changes with standalone patch bundles (often by email) as darcs is.
> The mercurial plugin seems to just use one repo and edit it directly - is
> there some reason that's okay there but not for darcs? I agree with tuomov
> that having just the one repo would be preferable; the point of a dvcs is
> that there's no difference between one repo and another. I've got a
> darcs.pm based on mercurial.pm, that's almost usable... --bma
>> IMHO it comes down to whatever works well for a given RCS. Seems like
>> the darcs approach _could_ be done with most any distributed system, but
>> it might be overkill for some (or all?) While there is the incomplete darcs
>> plugin in [[todo/darcs]], if you submit one that's complete, I will
>> probably accept it into ikiwiki.. --[[Joey]]
>>> I'd like to help make a robust darcs (2) backend. I also think ikiwiki should use
>>> exactly one darcs repo. I think we can simplify and say conflicting web
>>> edits are not allowed, like most current wiki engines. I don't see that
>>> saving (so much) context in the html is necessary, then.
>>> bma, I would like to see your code. --[[Simon_Michael]]
>>> PS ah, there it is. Let's continue on the [[todo/darcs]] page.
* The backend pushes CGI edits from `srcdir` back into `master` (triggering the apply hook).
* The working copies in `srcdir` and `master` should *not* be touched by the user, only by the CGI or darcs, respectively.
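The flow in the list above can be outlined with plain darcs commands. This is only a sketch, not the plugin's actual code: the function name `sync_web_edit` is hypothetical, and it assumes that `srcdir` was created with `master` as its default repository, so `darcs pull` and `darcs push` need no explicit location.

```shell
# Which darcs executable to use; override by setting DARCS.
DARCS=${DARCS:-darcs}

# Hypothetical outline of syncing srcdir with master around a web edit.
#   $1: path to srcdir (its default repository is assumed to be master)
sync_web_edit() {
    srcdir=$1

    # Bring in changes that were committed to master.
    "$DARCS" pull --quiet --all --repodir="$srcdir" || return 1

    # Send recorded CGI edits back to master. On the master side darcs
    # applies the patches, which runs master's apply posthook.
    "$DARCS" push --quiet --all --repodir="$srcdir" || return 1
}
```

Setting `DARCS=echo` before calling `sync_web_edit` prints the two command lines instead of running them, which is a cheap way to check the arguments before pointing it at a real repository.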
## [[Git]]


@ -1,513 +1,9 @@
Here's Thomas Schwinge's unfinished darcs support for ikiwiki.
(Finishing this has been suggested as a [[soc]] project.)
> I haven't been working on this for months and also won't in the near
> future. Feel free to use what I have done so
> far and bring it into a usable state! Also, feel free to contact me
> if there are questions.
-- [Thomas Schwinge](mailto:tschwinge@gnu.org)
[[!toggle text="show"]]
[[!toggleable text="""
# Support for the darcs rcs, <URL:http://darcs.net/>.
# Copyright (C) 2006 Thomas Schwinge <tschwinge@gnu.org>
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
# We're guaranteed to be the only instance of ikiwiki running at a given
# time. It is essential that only ikiwiki is working on a particular
# repository. That means one instance of ikiwiki and it also means that
# you must not `darcs push' into this repository, as this might create
# race conditions, as I understand it.
use warnings;
use strict;
use IkiWiki;
package IkiWiki;
# Which darcs executable to use.
my $darcs = ($ENV{DARCS} or 'darcs');
# Internal functions.
sub darcs_info ($$$) {
my $field = shift;
my $repodir = shift;
my $file = shift; # Relative to the repodir.
my $child = open(DARCS_CHANGES, "-|");
if (! $child) {
exec($darcs, 'changes', '--repo=' . $repodir, '--xml-output', $file) or
error('failed to run `darcs changes\'');
}
# Brute force for now. :-/
while (<DARCS_CHANGES>) {
last if /^<\/created_as>$/;
}
($_) = <DARCS_CHANGES> =~ /$field=\'([^\']+)/;
$field eq 'hash' and s/\.gz//; # Strip away the `.gz' from `hash'es.
close(DARCS_CHANGES) or error('`darcs changes\' exited ' . $?);
return $_;
}
# Exported functions.
sub rcs_update () {
# Not needed.
}
sub rcs_prepedit ($) {
# Prepares to edit a file under revision control. Returns a token that
# must be passed to rcs_commit() when the file is to be committed. For us,
# this token is the hash value of the latest patch that modifies the file,
# i.e. something like its current revision. If the file is not yet added
# to the repository, we return TODO: the empty string.
my $file = shift; # Relative to the repodir.
my $hash = darcs_info('hash', $config{srcdir}, $file);
return defined $hash ? $hash : "";
}
sub rcs_commit ($$$) {
# Commit the page. Returns `undef' on success and a version of the page
# with conflict markers on failure.
my $file = shift; # Relative to the repodir.
my $message = shift;
my $rcstoken = shift;
# Compute if the ``revision'' of $file changed.
my $changed = darcs_info('hash', $config{srcdir}, $file) ne $rcstoken;
# Yes, the following is a bit convoluted.
if ($changed) {
# TODO. Invent a better, non-conflicting name.
rename("$config{srcdir}/$file", "$config{srcdir}/$file.save") or
error("failed to rename $file to $file.save: $!");
# Roll the repository back to $rcstoken.
# TODO. Can we be sure that no changes are lost? I think that
# we can, if we make sure that the `darcs push' below will always
# succeed.
# We need to revert everything as `darcs obliterate' might choke
# otherwise.
# TODO: `yes | ...' needed? Doesn't seem so.
system($darcs, "revert", "--repodir=" . $config{srcdir}, "--all") and
error("`darcs revert' failed");
# Remove all patches starting at $rcstoken.
# TODO. Something like `yes | darcs obliterate ...' seems to be needed.
system($darcs, "obliterate", "--quiet", "--repodir=" . $config{srcdir},
"--match", "hash " . $rcstoken) and
error("`darcs obliterate' failed");
# Restore the $rcstoken one.
system($darcs, "pull", "--quiet", "--repodir=" . $config{srcdir},
"--match", "hash " . $rcstoken, "--all") and
error("`darcs pull' failed");
# We're back at $rcstoken. Re-install the modified file.
rename("$config{srcdir}/$file.save", "$config{srcdir}/$file") or
error("failed to rename $file.save to $file: $!");
}
# Record the changes.
# TODO: What if $message is empty?
writefile("$file.log", $config{srcdir}, $message);
system($darcs, 'record', '--repodir=' . $config{srcdir}, '--all',
'--logfile=' . "$config{srcdir}/$file.log",
'--author=' . 'web commit <web-hurd@gnu.org>', $file) and
error('`darcs record\' failed');
# Update the repository by pulling from the default repository, which is
# the master repository.
system($darcs, "pull", "--quiet", "--repodir=" . $config{srcdir},
"--all") and error("`darcs pull' failed\n");
# If this updating yields any conflicts, we'll record them now to resolve
# them. If nothing is recorded, there are no conflicts.
$rcstoken = darcs_info('hash', $config{srcdir}, $file);
# TODO: Use only the first line here, i.e. only the patch name?
writefile("$file.log", $config{srcdir}, 'resolve conflicts: ' . $message);
system($darcs, 'record', '--repodir=' . $config{srcdir}, '--all',
'--logfile=' . "$config{srcdir}/$file.log",
'--author=' . 'web commit <web-hurd@gnu.org>', $file) and
error('`darcs record\' failed');
my $conflicts = darcs_info('hash', $config{srcdir}, $file) ne $rcstoken;
unlink("$config{srcdir}/$file.log") or
error("failed to remove `$file.log'");
# Push the changes to the main repository.
system($darcs, 'push', '--quiet', '--repodir=' . $config{srcdir}, '--all')
and error('`darcs push\' failed');
# TODO: darcs send?
if ($conflicts) {
my $document = readfile("$config{srcdir}/$file");
# Try to leave everything in a consistent state.
# TODO: `yes | ...' needed? Doesn't seem so.
system($darcs, "revert", "--repodir=" . $config{srcdir}, "--all") and
warn("`darcs revert' failed.\n");
return $document;
} else {
return undef;
}
}
sub rcs_add ($) {
my $file = shift; # Relative to the repodir.
# Intermediate directories will be added automagically.
system($darcs, 'add', '--quiet', '--repodir=' . $config{srcdir},
'--boring', $file) and error('`darcs add\' failed');
}
sub rcs_recentchanges ($) {
warn('rcs_recentchanges() is not implemented');
return 'rcs_recentchanges() is not implemented';
}
sub rcs_notify () {
warn('rcs_notify() is not implemented');
}
sub rcs_getctime () {
warn('rcs_getctime() is not implemented');
}
1
"""]]
This is my ([bma](bma@bmalee.eu)) darcs.pm - it's messy (my Perl isn't up to much) but seems to work. It uses just one repo, like the mercurial plugin (unlike the above version, which AIUI uses two).
`rcs_commit()` uses backticks instead of `system()`, to prevent darcs' output being sent to the browser and mucking with the HTTP headers (`darcs record` has no --quiet option). And `rcs_recentchanges()` uses regexes rather than parsing darcs' XML output.
[[!toggle text="show" id="bma"]]
[[!toggleable id="bma" text="""
#!/usr/bin/perl
use warnings;
use strict;
use IkiWiki;
use Date::Parse;
use open qw{:utf8 :std};
package IkiWiki;
sub rcs_update () {
# Do nothing - there's nowhere to update *from*.
}
sub rcs_prepedit ($) {
}
sub rcs_commit ($$$;$$) {
my ($file, $message, $rcstoken, $user, $ipaddr) = @_;
# $user should probably be a name and an email address, by darcs
# convention.
if (defined $user) {
$user = possibly_foolish_untaint($user);
}
elsif (defined $ipaddr) {
$user = "Anonymous from $ipaddr";
}
else {
$user = "Anonymous";
}
$message = possibly_foolish_untaint($message);
# BUG: this outputs one line of text, and there's not a -q or --quiet
# option. Redirecting output to /dev/null works, but I still get the
# HTTP status and location headers displayed in the browser - is that
# darcs' fault or ikiwiki's?
# Doing it in backticks *works*, but I'm sure it could be done better.
my @cmdline = ("darcs", "record", "--repodir", "$config{srcdir}",
"-a", "-m", "$message", "--author", "$user", $file);
`darcs record --repodir "$config{srcdir}" -a -m "$message" --author "$user" $file`; # Return value? Output? Who needs 'em?
#if (system(@cmdline) != 0) {
# warn "'@cmdline' failed: $!";
#}
return undef; # success
}
sub rcs_add ($) {
my ($file) = @_;
my @cmdline = ("darcs", "add", "--repodir", "$config{srcdir}", "-a", "-q", "$file");
if (system(@cmdline) != 0) {
warn "'@cmdline' failed: $!";
}
}
sub rcs_recentchanges ($) {
# TODO: This is horrible code. It doesn't work perfectly, and uses regexes
# rather than parsing Darcs' XML output.
my $num=shift;
my @ret;
return unless -d "$config{srcdir}/_darcs";
my $changelog = `darcs changes --xml --summary --repodir "$config{srcdir}"`;
$changelog = join("", split(/\s*\n\s*/, $changelog));
my @changes = split(/<\/patch>.*?<patch/m, $changelog);
foreach my $change (@changes) {
$change =~ m/hash='(.*?)'/;
my $rev = $1;
$change =~ m/author='(.*?)'/;
my $user = $1."\n";
my $committype = "web";
if($user =~ m/&lt;/) {
# Author fields generated by darcs include an email address: look for the "<".
$committype = "darcs";
use HTML::Entities;
$user = decode_entities $user;
}
$change =~ m/local_date='(.*?)'/;
my $when = $1;
$when=time - str2time($when, 'UTC');
$change =~ m/<name>(.*?)<\/name>/g;
my @message = {line => $1};
foreach my $match ($change =~ m/<comment>(.*?)<\/comment>/gm) {
push @message, {line => $1};
}
my @pages;
foreach my $match ($change =~ m/<.*?_(file|directory)>(.*?)(<(added|removed)_lines.*\/>)*<\/.*?_(file|directory)>/g) {
# My perl-fu is weak. I'm probably going about this all wrong, anyway.
push @pages, {page => pagename($match)} if ( -f $config{srcdir}."/".$match || -d $config{srcdir}."/".$match) and not $match =~ m/^$/;
}
push @ret, { rev => $rev,
user => $user,
committype => $committype,
when => $when,
message => [@message],
pages => [@pages],
}
}
return @ret;
}
sub rcs_notify () {
# TODO
}
sub rcs_getctime ($) {
error gettext("getctime not implemented");
}
1
"""]]
---
Well, here's my version too. It only does getctime -- using a real XML parser, instead of regexp ugliness -- and maybe recentchanges, but that may be bitrotted, or maybe I never finished it, as I only need the getctime. As for actual commits, I have previously voiced my opinion that this should be done by the plugin generating a patch bundle and forwarding it to darcs in some way (`darcs apply` or even email to another host, possibly moderated), instead of the hacky direct modification of a working copy. It could also be faster to getctime in a batch. Just reading in all the changes the first time they're needed might not be a big improvement in many cases, but if we got a batch request from ikiwiki, we could keep reading the changes until all the files in this batch request have been met. --[[tuomov]]
[[!toggle text="show" id="tuomov"]]
[[!toggleable id="tuomov" text="""
<pre>
#!/usr/bin/perl
# Stubs for no revision control.
use warnings;
use strict;
use IkiWiki;
package IkiWiki;
sub rcs_update () {
}
sub rcs_prepedit ($) {
return ""
}
sub rcs_commit ($$$) {
return undef # success
}
sub rcs_add ($) {
}
sub rcs_recentchanges ($) {
my $num=shift;
my @ret;
eval q{use Date::Parse};
eval q{use XML::Simple};
my $repodir=$config{srcdir};
if (-d "$config{srcdir}/_darcs") {
my $child = open(LOG, "-|");
if (! $child) {
exec("darcs", "changes", "--xml",
"--repodir", "$repodir",
"--last", "$num")
|| error("darcs changes failed to run");
}
my $data=<LOG>;
close LOG;
my $log = XMLin($data, ForceArray => 1);
foreach my $patch ($log->{patch}) {
my $date=$patch->{local_date};
my $hash=$patch->{hash};
my $when=concise(ago(time - str2time($date)));
my @pages;
my $child = open(SUMMARY, "-|");
if (! $child) {
exec("darcs", "annotate", "-s", "--xml",
"--match", "hash: $hash",
"--repodir", "$repodir")
|| error("darcs annotate failed to run");
}
my $data=<SUMMARY>;
close SUMMARY;
my $summary = XMLin("<lame>$data</lame>", ForceArray => 1);
# TODO: find @pages
push @ret, {
#rev => $rev,
user => $patch->{author},
#committype => $committype,
when => $when,
#message => [@message],
pages => [@pages],
}; # if @pages;
return @ret if @ret >= $num;
}
}
return @ret;
}
sub rcs_notify () {
}
sub rcs_getctime ($) {
my $file=shift;
eval q{use Date::Parse};
eval q{use XML::Simple};
local $/=undef;
# Sigh... doing things the hard way again
my $repodir=$config{srcdir};
my $filer=substr($file, length($repodir));
$filer =~ s:^[/]+::;
my $child = open(LOG, "-|");
if (! $child) {
exec("darcs", "changes", "--xml", "--reverse",
"--repodir", "$repodir", "$filer")
|| error("darcs changes $filer failed to run");
}
my $data=<LOG>;
close LOG;
my $log = XMLin($data, ForceArray => 1);
my $datestr=$log->{patch}[0]->{local_date};
if (! defined $datestr) {
warn "failed to get ctime for $filer";
return 0;
}
my $date=str2time($datestr);
debug("found ctime ".localtime($date)." for $file");
return $date;
}
1
</pre>
"""]]
---
I merged the two versions above and made some fixes; it is recording my web edits in darcs and showing a recent changes page.
It is in a [darcs repository](http://joyful.com/darcsweb/darcsweb.cgi?r=ikiwiki-darcs), please send patches. --[[Simon_Michael]]
> I'd like to see at least the following fixed before I commit this: --[[Joey]]
> * Running `darcs record $filename` in backticks is not good (security)
> The thing to do is to open stdout to /dev/null before execing darcs.
> * Get `rcs_recentchanges_xml` working, parsing xml with regexps does
> not seem like a maintenance win.
> * `rcs_notify` should be removed, it's no longer used.
> * Some form of conflict handling. Using darcs to attempt to merge
> the changes is I guess optional (although every other rcs backend,
> including svn manages to do this), but it needs to at *least* detect
> conflicts and return a page with conflict markers for the user to fix
> the conflict.
I have addressed the recentchanges bit, you can find my hacked up darcs.pm at <http://web.mornfall.net/stuff/web-root/IkiWiki/Rcs/darcs.pm>.
It's got a couple of FIXMEs, and a very site-specific filter for recentchanges. Not sure how to do that better though. I will eventually add web commits, probably of my own (and mention it here).
---
And here's yet another one, including an updated `ikiwiki-makerepo`. :)
<http://khjk.org/~pesco/ikiwiki-darcs/> (now a darcs repo)
> Note that there's a 'darcs' branch in git that I'm keeping a copy of your
> code in. Just in case. :-)
I've taken all the good stuff from the above and added the missing hooks. The code hasn't seen a lot of testing, so some bugs are likely yet to surface. Also, I'm not experienced with perl and don't know where I should have used the function `possibly_foolish_untaint`. I've taken all the good stuff from the above (now deleted --[[Joey]]) and added the missing hooks. The code hasn't seen a lot of testing, so some bugs are likely yet to surface. Also, I'm not experienced with perl and don't know where I should have used the function `possibly_foolish_untaint`.
Regarding the repository layout: There are two darcs repositories. One is the `srcdir`, the other we'll call `master`.
* HTML is generated from `srcdir`.
* CGI edits happen in `srcdir`.
* The backend pulls updates from `master` into `srcdir`, i.e. darcs commits should happen to `master`.
* `master` calls ikiwiki (through a wrapper) in its apply posthook, i.e. `master/_darcs/prefs/defaults` should look like this:
apply posthook ikiwrap
apply run-posthook
(I'm not sure, should/could it be `ikiwrap --refresh` above?)
* The backend pushes CGI edits from `srcdir` back into `master` (triggering the apply hook).
* The working copies in `srcdir` and `master` should *not* be touched by the user, only by the CGI or darcs, respectively.
> Review of this one:
>
@ -523,7 +19,7 @@ Regarding the repository layout: There are two darcs repositories. One is the `s
> * `rcs_remove` just calls "rm"? Does darcs record notice the file was removed
> and automatically commit the removal? (And why `system("rm")` and not
> `unlink`?)
> * Is the the darcs info in [[details]] still up-to-date re this version? > * Is the the darcs info in [[rcs/details]] still up-to-date re this version?
> --[[Joey]]
> Update:
@ -537,6 +33,8 @@ Regarding the repository layout: There are two darcs repositories. One is the `s
> this version works. It's similar, but the details differ slightly.
> You could copy my description above to replace it.
>
>> done --[[Joey]]
>
> There is still some ironing to do, for instance the current version doesn't allow for
> modifying attachments by re-uploading them via CGI ("darcs add failed"). Am I assuming
> correctly that "adding" a file that's already in the repo should just be a no-op?