User documentation

Please read HelpOnSynchronisation.

Open Issues

Known main issues

Longterm ToDo

Basic Requirements

Cases

One page has new revisions

Both pages have new revisions

Deleted pages

Assuming that the page exists in wiki A and was deleted in wiki B, we do a normal merge if there are no tags. If there were changes in wiki A since the last merge, there is a merge with a conflict. Otherwise (no changes past the last merge), the page is deleted in wiki A as well. This needs static information that could be transferred with the pagelist.
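
A minimal sketch of that decision logic (function and field names are illustrative, not the actual MoinMoin code):

# Illustrative sketch of the deleted-page decision; names are hypothetical.
def handle_deleted_page(page_a, last_merge_tag):
    """page_a exists in wiki A; the same page was deleted in wiki B."""
    if last_merge_tag is None:
        # Never synchronised before: treat it like a normal merge.
        return "merge"
    if page_a.current_rev > last_merge_tag.local_rev:
        # Wiki A changed the page after the last merge: conflict.
        return "merge_with_conflict"
    # No changes in wiki A since the last merge: propagate the deletion.
    return "delete_in_a"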

Implementation

Based on XMLRPC.

Every page will carry multiple tags set by different wikis. Based on those, the synchronisation code can decide how to build the differences.

Tags

They are tuples associated with a page. See Tag in MoinMoin.wikisync for an overview.
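
For illustration only - the authoritative definition is the Tag class in MoinMoin.wikisync - a tag can be pictured roughly like this (the field names here are assumptions):

from collections import namedtuple

# Illustrative only; see MoinMoin.wikisync for the real Tag class.
Tag = namedtuple("Tag", [
    "remote_wiki",   # interwiki name of the other wiki
    "remote_rev",    # revision number of the page on the remote wiki
    "current_rev",   # matching local revision at the time of the sync
])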

Prerequisites

RPC:getToken(username, password)

An RPC call which returns a token that can be used to authenticate against a remote wiki. The token may have a limited lifetime.
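
For illustration, calling this from Python's standard XML-RPC client might look like the following (MoinMoin usually exposes XML-RPC v2 under ?action=xmlrpc2; URL and credentials are placeholders):

import xmlrpc.client

# MoinMoin exposes XML-RPC v2 under <wiki-url>?action=xmlrpc2.
wiki = xmlrpc.client.ServerProxy("http://example.com/mywiki?action=xmlrpc2")

token = wiki.getToken("SyncUser", "secret")  # may expire after a while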

RPC:applyToken(token)

An RPC wrapper method that is used to transmit the token and call the original function. Needs to be used in a multicall batch request.
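
A minimal sketch of such a batch request, continuing the getToken example above (getPage stands in here for any authenticated call):

import xmlrpc.client

wiki = xmlrpc.client.ServerProxy("http://example.com/mywiki?action=xmlrpc2")
token = wiki.getToken("SyncUser", "secret")

# applyToken must travel in the same multicall batch as the calls
# it is meant to authenticate.
batch = xmlrpc.client.MultiCall(wiki)
batch.applyToken(token)
batch.getPage("FrontPage")
auth_ok, page_text = tuple(batch())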

RPC:batchRequest

This was replaced by the standardised MultiCall (system.multicall) method.

Authentication agent

In general, wikis should not trust other wikis, but only users they already know. This requires credentials to be sent. MoinMoin needs a simple action that manages per-user logins for other wikis.
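
Purely as an illustration of the data such an action would manage (everything below is hypothetical, not MoinMoin's actual storage format):

# Hypothetical per-user credential store for remote wikis.
# Maps interwiki name -> (username, password) on the remote side.
REMOTE_LOGINS = {
    "OtherWiki": ("SyncUser", "secret"),
    "BackupWiki": ("SyncUser", "another-secret"),
}

def credentials_for(interwiki_name):
    return REMOTE_LOGINS.get(interwiki_name)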

XMLRPC Interface

Docs moved to source code.

getDiff

Returns a computed diff of a page.

mergeDiff

mergeDiff(pagename, contents, localrev, deltaremoterev, lastremoterev, interwikiname, normalised_name) returns {status: "SUCCESS"/..., current: intRev}

Merges a diff on the remote machine and returns the number of the new revision. Additionally, this method tags the new revision.
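
A hedged sketch of how a client might drive this call over XML-RPC (endpoint, credentials, and the binary packaging of the diff are assumptions; the authoritative parameter documentation lives in the source code):

import xmlrpc.client

wiki = xmlrpc.client.ServerProxy("http://example.com/mywiki?action=xmlrpc2")
token = wiki.getToken("SyncUser", "secret")

diff_blob = xmlrpc.client.Binary(b"...")  # diff payload, e.g. from getDiff

batch = xmlrpc.client.MultiCall(wiki)
batch.applyToken(token)
batch.mergeDiff("SomePage", diff_blob,
                10,          # localrev: our local revision
                7,           # deltaremoterev: remote rev the diff is based on
                9,           # lastremoterev: last known remote revision
                "MyWiki",    # interwikiname of the calling wiki
                "SomePage")  # normalised_name
_, result = tuple(batch())
if result["status"] == "SUCCESS":
    new_remote_rev = result["current"]  # revision number created remotely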

Synchronisation Steps

DVCS-like tracking

Distributed version control systems are known to handle synchronisation scenarios with many committers and repositories quite well. But does that model fit wikis? In this section I want to discuss whether a DVCS like Mercurial could be used as a base component for building a synchronisation system.

Introduction

In a distributed version control system (DVCS), every node (i.e. local repository) that generates commits holds the revision data of commits done by other nodes as well. So it can do merging etc. locally without contacting other nodes. Furthermore, there is no need for a special server or a specific push/pull/sync strategy, because all nodes have enough history to allow for complex synchronisation scenarios. By grouping all nodes into a graph based on which node merges with which other nodes, we can describe a particular setup of wikis.

Limitations

In the aforementioned system, it is not possible to reorder the graph of nodes without getting merge conflicts, because the tags are not distributed across merges and every node only knows its own history and tags. So the calculation of the parent revision might fail if a node tries to merge directly with another node that only indirectly merged with the original node in the past.
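
To make the parent-revision problem concrete, here is a small sketch of a merge-base lookup over a revision DAG; if the two revisions share no recorded history, no merge base exists and a clean merge is impossible (all names and the revision numbering are illustrative):

# parents maps each revision to the tuple of its parent revisions.
def ancestors(parents, rev):
    seen, stack = set(), [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents.get(r, ()))
    return seen

def merge_base(parents, rev_a, rev_b):
    common = ancestors(parents, rev_a) & ancestors(parents, rev_b)
    return max(common) if common else None  # assumes revs sort by age

# Example: two wikis that both branched off revision 1.
# parents = {3: (1,), 2: (1,), 1: ()}
# merge_base(parents, 3, 2) -> 1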

Modelling using a DVCS

So the upcoming question is: how can we map the model of a DVCS to edits done in a wiki? It is preferable not to change the workflow too much. For example, it is not easy to support the notion of multiple concurrent heads (of the revision DAG) in a wiki without many changes to the whole wiki system, even though it would make sense - e.g. a kind of "staging" system where anonymous users see the stable head of the wiki pages, while skilled editors can modify the unstable head and merge it back (or copy, if the stable head is read-only) once it has become stable enough.

Compared to a DVCS, in a wiki there are no "transactions" that span multiple pages/items (except for the rename of a page), so a commit would only contain one changed page. Furthermore, merging should be done on the page level in order to allow partial merges (e.g. to be able to restrict merges to namespaces separated by different parent pages, as described in the aforementioned UI).

I am assuming that the reader is familiar with the DVCS Mercurial; the following thoughts use the terms and model defined by it. As said above, merges should be done on a per-page level. It might be possible to send a (pagename, [heads...]) pair for every page, which would allow pulling in O(no_of_pages * no_of_heads * log(no_of_revs)) in that case.
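
A deliberately naive sketch of what that per-page negotiation could look like on the server side; a real implementation would walk the ancestry of each head instead of comparing revisions individually (names and data shapes are assumptions):

# client_heads: {pagename: [head revisions the client already has]}
# server_revs:  {pagename: [all revisions known to the server]}
def outgoing(server_revs, client_heads):
    """Revisions the client is missing, computed per page."""
    missing = {}
    for pagename, heads in client_heads.items():
        known = set(heads)
        missing[pagename] = [r for r in server_revs.get(pagename, [])
                             if r not in known]
    return missing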

Mercurial operations to implement: annotate, pull, push, commit, merge/update, incoming, outgoing, parents, heads

/!\ The idea to follow this implementation strategy was dropped after talking to my mentor. So we will just have synchronisation of merge changesets for now.

Discussion

See also

Moin Wiki Attachments Backup

With the interwiki synchronisation procedure it is possible to back up (synchronize) to a backup wiki over the network while the wiki is on-line.

Unfortunately, up to MoinMoin version 1.9.3, attachments to a wiki page cannot be synchronized. For that job I have set up a little shell script with an rsync procedure.
You just have to take care that the current rsync version 3.0.7 is installed in /usr/bin/rsync.

Using rsync to backup attachments

The Linux program rsync is well suited for that task. I tried it with the GUI luckyBackup (version 0.4.4) under Ubuntu 10.10 on the client computer.
Only the missing attachments are copied to the backup wiki.
The backup wiki has access to the main wiki via the in-house network, so I could mount the home folder via cifs.

1. Enable the home folder of the main wiki for remote access (Ubuntu 10.04.1)

2. Mount the remote home folder to /media/rudi72home:
   sudo mount -t cifs -o username=rudi,password=<password> //rudiswiki/rudi72home /media/rudi72home

The following list shows the setup of luckyBackup:

Task Properties:
Task name: rudi72_moin18
Type: Backup the entire Source directory (by name)
Source: /media/rudi72home/moin-1.8.8/wiki/data/pages
Destination: /home/rudi/moin-1.8.8/wiki/data/

Advanced properties:
Include: */attachments/*
Set a mark on:
  Preserve ownership, times
  Preserve permissions
  Preserve symlinks
  Recurse into directories
  Skip newer destination files

This will give the following rsync command string via the "validate" button:
rsync -h --progress --stats -r -tgo -p -l --update --include=*/attachments/* --include=*/ --exclude=* --prune-empty-dirs /media/rudi72home/moin-1.8.8/wiki/data/pages /home/rudi/moin-1.8.8/wiki/data/

rsync use on Mac OS X 10.6.6

A problem will arise if the attachments folder does not exist on the backup side: in that case the attachment files are not copied at all.
The problem was caused by an outdated rsync version (2.6.9) in Mac OS X. A newer version (3.0.7) works as expected.

In order to make the connection to the Linux Ubuntu 10.04.1 server easier, install the package netatalk there.
Because netatalk names the home directory HOME DIRECTORY, with a space in the name, I changed it in /etc/netatalk/AppleVolumes.default to home_dir, which is easier for file handling.

The rsync call with an established server connection will look like:

# rsync ver. 3.0.7 in /usr/bin/rsync (old ver. 2.6.9)
rsync  -h --progress --stats -r -tgo -p -l --update --include=*/attachments/* --include=*/ --exclude=* --prune-empty-dirs /Volumes/home_dir/moin-1.8.8/wiki/data/pages /Volumes/hda8/INSTALL/Python/Moin/moin-1.8.8/wiki/data/

ssh use on Mac OS X

If you want to copy the attachments via the Internet, it is recommended to use SSH encryption for the data transfer. The original description is from http://www.jdmz.net/ssh/. Deviating from that, the Mac OS X ssh-keygen only allows a 1024-bit DSA key.
Once the ssh connection works, and you want to run the rsync procedure automatically from a script, you have to generate a public DSA key.
In order to make the procedure more secure than described here, please have a look at the original web site.

For a quick test that the ssh connection works, try the rsync call with the option -e ssh and interactive password input:

# rsync ver. 3.0.7 with ssh
rsync  -h --progress --stats -r -tgo -p -l --update --include=*/attachments/* --include=*/ --exclude=* --prune-empty-dirs  -e ssh rudi@192.168.17.72:/home/rudi/moin-1.8.8/wiki/data/pages /Volumes/hda8/INSTALL/Python/Moin/moin-1.8.8/wiki/data/

Next, generate a key pair in order to be able to automate the rsync procedure.

When asked for a passphrase, just hit ENTER.

# First create the DSA keys
$ ssh-keygen -t dsa -b 1024 -f mac-rsync-key
Generating public/private dsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in mac-rsync-key.
Your public key has been saved in mac-rsync-key.pub.
The key fingerprint is:
d5:04:b4:03:0d:c6:0d:d5:d4:10:85:e9:da:f7:xx:xx rudiuser@rudi-users-iMac.local
The key's randomart image is:
+--[ DSA 1024]----+
|       .=B++=B.  |
...
|               o.|
+-----------------+

# Copy the public key to the server:
$ scp mac-rsync-key.pub rudi@192.168.17.72:/home/rudi
remote password: <password>

# Login to remote computer:
$ ssh rudi@192.168.17.72
remote password: <password>

# If folder .ssh does not exist in the home directory, create it
$ mkdir .ssh && chmod 700 .ssh

$ mv mac-rsync-key.pub .ssh/

$ cd .ssh/

# If file "authorized_keys" does not exist, create it
$ touch authorized_keys && chmod 600 authorized_keys

# Append the content of "mac-rsync-key.pub"
$ cat mac-rsync-key.pub >> authorized_keys

# Now you can start the rsync procedure from the client side without being asked for a password:
# rsync ver. 3.0.7 with ssh
rsync  -h --progress --stats -r -tgo -p -l --update --include=*/attachments/* --include=*/ --exclude=* --prune-empty-dirs  -e "ssh -i mac-rsync-key" rudi@192.168.17.72:/home/rudi/moin-1.8.8/wiki/data/pages /Volumes/hda8/INSTALL/Python/Moin/moin-1.8.8/wiki/data/

-- RudolfReuter 2011-02-13 18:59:52

(!) Have a look at ssh-copy-id

