Things to do, and the people doing them
Determine wiki structure from export XML
Who is working on this: BradleyDean, PaulBoddie
Given the XML structure in the Confluence exports, extract the site structure (including pages, attachments and history).
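As a rough illustration of this step, the sketch below groups exported objects by type. It is not taken from convert.py. It assumes the export is a flat list of `<object>` elements carrying a `class` attribute, an `<id>` child and named `<property>` children. The sample document and the `collect_objects` helper are hypothetical, and the layout should be checked against a real export before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element and property names are assumptions
# about the export layout, not a verified schema.
SAMPLE = """<hibernate-generic>
  <object class="Page" package="com.atlassian.confluence.pages">
    <id name="id">1</id>
    <property name="title">Home</property>
    <property name="version">2</property>
  </object>
  <object class="Attachment" package="com.atlassian.confluence.pages">
    <id name="id">7</id>
    <property name="fileName">logo.png</property>
  </object>
</hibernate-generic>"""


def collect_objects(xml_text):
    """Group exported objects by their class attribute."""
    structure = {}
    root = ET.fromstring(xml_text)
    for obj in root.iter("object"):
        # Each record keeps the object id plus its simple text properties.
        record = {"id": obj.findtext("id")}
        for prop in obj.findall("property"):
            record[prop.get("name")] = (prop.text or "").strip()
        structure.setdefault(obj.get("class"), []).append(record)
    return structure


site = collect_objects(SAMPLE)
```

For a full dump, `ET.iterparse` would probably be a better fit than `ET.fromstring`, since it avoids holding the whole tree in memory.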
I've written some experimental code to export page revisions and manifests from the XML dump (convert.py), along with a module (parser.py) that performs some simple parsing of page text given to it on standard input. The idea is to combine the manifests and give them to the package installer in order to import the Wiki content into Moin, but only after the actual page revisions have been parsed and converted to Moin syntax. -- PaulBoddie 2012-04-01 22:45:46
I forgot to include the xmlread module, but I'll upload that later today. -- PaulBoddie 2012-04-02 08:09:41
The missing module is now available here. You can just copy xmlread.py into the ConfluenceConverter distribution and it should work. -- PaulBoddie 2012-04-02 16:08:15
What Confluence markup is being used?
Who is working on this:
To scope the remaining work, find out which subset of the Confluence markup is actually used in the Mailman wiki.
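One way to approach this survey is to count how often each construct appears across the extracted page texts. The sketch below is only a starting point: the `PATTERNS` table and the `count_markup` helper are hypothetical, and the regular expressions cover a handful of common constructs rather than the full Confluence notation.

```python
import re
from collections import Counter

# Illustrative patterns for a few Confluence wiki markup constructs.
# They are rough approximations and far from exhaustive.
PATTERNS = {
    "heading": re.compile(r"^h[1-6]\.\s", re.M),
    "macro": re.compile(r"\{[a-zA-Z][\w-]*(?::[^}]*)?\}"),
    "link": re.compile(r"\[[^\]\n]+\]"),
    "list_item": re.compile(r"^[*#-]+\s", re.M),
    "table_row": re.compile(r"^\|", re.M),
}


def count_markup(pages):
    """Return a Counter of construct name -> occurrences over all pages."""
    totals = Counter()
    for text in pages:
        for name, pattern in PATTERNS.items():
            totals[name] += len(pattern.findall(text))
    return totals


pages = ["h1. Title\n* item one\n* item two\n{code}x = 1{code}\nSee [Home]."]
usage = count_markup(pages)
```

Constructs that never show up in `usage` can be left out of the parser, at least initially.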
Parse Confluence markup into a DOM/AST structure
Who is working on this: AlekseyZapparov
NOTE: The DOM/AST structure needs to be agreed upon between this step and the MoinMoin output step.
Given raw Confluence markup (just the page content, extracted from the XML structure), parse it and store the result in some DOM/AST-style form.
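To make the idea concrete, here is a minimal line-based sketch that handles only `hN.` headings, single-level `*` bullets and plain paragraphs. The tree shape (plain dicts with `kind`, `children`, `text` and `level` keys) is an assumption made for illustration. It is not the agreed structure, and it is not the moin2 DOM.

```python
import re

HEADING = re.compile(r"^h([1-6])\.\s+(.*)$")
BULLET = re.compile(r"^\*\s+(.*)$")


def parse(markup):
    """Parse a small subset of Confluence markup into a tree of dicts."""
    doc = {"kind": "document", "children": []}
    current_list = None
    for line in markup.splitlines():
        bullet = BULLET.match(line)
        if bullet:
            # Consecutive bullet lines are collected into one list node.
            if current_list is None:
                current_list = {"kind": "list", "children": []}
                doc["children"].append(current_list)
            current_list["children"].append(
                {"kind": "item", "text": bullet.group(1)}
            )
            continue
        # Any non-bullet line (including a blank one) ends the open list.
        current_list = None
        heading = HEADING.match(line)
        if heading:
            doc["children"].append(
                {
                    "kind": "heading",
                    "level": int(heading.group(1)),
                    "text": heading.group(2),
                }
            )
        elif line.strip():
            doc["children"].append({"kind": "paragraph", "text": line.strip()})
    return doc


tree = parse("h2. Goals\n* parse\n* convert\n\nPlain text.")
```

Inline markup, nested lists, tables and macros are deliberately left out. A real parser would need them, depending on what the markup survey finds.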
MoinMoin output from parsed data
Who is working on this: AlekseyZapparov
NOTE: The DOM/AST structure needs to be agreed upon between this step and the parsing step.
Given the parsed content, generate raw MoinMoin markup.
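The output step could then be a simple recursive walk over the tree. The sketch below assumes a hypothetical tree of plain dicts with `kind`, `children`, `text` and `level` keys, and it emits MoinMoin headings (`== Title ==`) and bullet items (` * item`). The `to_moin` function is an illustration only, not an existing converter.

```python
def to_moin(node):
    """Serialise a small dict-based tree as MoinMoin markup."""
    kind = node["kind"]
    if kind == "document":
        # Blocks already end in a newline; joining with "\n" leaves
        # one blank line between them.
        return "\n".join(to_moin(child) for child in node["children"])
    if kind == "heading":
        marks = "=" * node["level"]
        return f"{marks} {node['text']} {marks}\n"
    if kind == "list":
        # MoinMoin bullet items need leading whitespace before the "*".
        return "".join(f" * {item['text']}\n" for item in node["children"])
    if kind == "paragraph":
        return node["text"] + "\n"
    raise ValueError(f"unsupported node kind: {kind}")


tree = {
    "kind": "document",
    "children": [
        {"kind": "heading", "level": 2, "text": "Goals"},
        {
            "kind": "list",
            "children": [
                {"kind": "item", "text": "parse"},
                {"kind": "item", "text": "convert"},
            ],
        },
        {"kind": "paragraph", "text": "Plain text."},
    ],
}
moin_text = to_moin(tree)
```

Raising on unknown node kinds makes gaps in the converter show up early instead of silently dropping content.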
Notes from MoinMoin devs
If you are going the DOM way (parsing stuff into a DOM tree, generating moin markup from that DOM tree), you should use the same DOM tree as moin2 does. There's a moinwiki_out converter already for that DOM tree, so you save half of the work.
