Description
The SyncPages action fails on pages whose names contain non-ASCII characters. (Unicode page content transfers normally.) This behaviour occurs in Moin 1.9.3 with Python 2.7.
Steps to reproduce
Create a page with a name containing non-ascii characters (e.g., СудалгааныТанилцуулга)
Create a sync job page with a pageMatch or pageList field matching that page (e.g., pageList:: СудалгааныТанилцуулга)
Attempt to run the sync (using action=SyncPages)
Example
See traceback output below.
Component selection
- SyncPages.py
- MoinMoin/wikisync.py
- xmlrpc
Details
2012-01-11 12:11:44,458 INFO MoinMoin.web.serving:41 127.0.0.1 "GET /SyncTest?action=SyncPages HTTP/1.1" 200 -
2012-01-11 12:11:56,230 ERROR MoinMoin.wsgiapp:293 An exception has occurred [http://localhost:8080/SyncTest?action=SyncPages].
Traceback (most recent call last):
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 282, in __call__
    response = run(context)
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 88, in run
    response = dispatch(request, context, action_name)
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 136, in dispatch
    response = handle_action(context, pagename, action_name)
  File "/home/edt/moinmoin/MoinMoin/wsgiapp.py", line 195, in handle_action
    handler(context.page.page_name, context)
  File "/home/edt/moinmoin/MoinMoin/action/SyncPages.py", line 511, in execute
    ActionClass(pagename, request).render()
  File "/home/edt/moinmoin/MoinMoin/action/SyncPages.py", line 220, in render
    self.sync(params, local, remote)
  File "/home/edt/moinmoin/MoinMoin/action/SyncPages.py", line 279, in sync
    r_pages = remote.get_pages(exclude_non_writable=direction != DOWN)
  File "/home/edt/moinmoin/MoinMoin/wikisync.py", line 286, in get_pages
    tokres, pages = m()
  File "/usr/lib/python2.7/xmlrpclib.py", line 997, in __call__
    return MultiCallIterator(self.__server.system.multicall(marshalled_list))
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1575, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 809, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 550: ordinal not in range(128)
MoinMoin Version | 1.9.3
OS and Version | Ubuntu 11.10
Python Version | 2.7
Server Setup | wikiserver.py
Server Details |
Language you are using the wiki in (set in the browser/UserPreferences) | English
Workaround
Use ascii-only pagenames.
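To apply the workaround, it can help to see which page names in a sync job would trigger the failure. The following is only a minimal sketch: the helper name is_ascii_name and the sample page_list are made up for illustration and are not part of Moin.

```python
# -*- coding: utf-8 -*-

def is_ascii_name(pagename):
    """Return True if every character of pagename is plain ASCII."""
    return all(ord(ch) < 128 for ch in pagename)


# Hypothetical page list, for illustration only.
page_list = [u"SyncTest", u"СудалгааныТанилцуулга", u"FrontPage"]

# Page names that would need renaming under the ASCII-only workaround.
affected = [name for name in page_list if not is_ascii_name(name)]
```

Here affected ends up holding only the Cyrillic page name; the other two names are safe to sync.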
Discussion
- xmlrpclib getPage can download a page with a non-ASCII name.
It works with Python 2.6.
If you debug this with PyDev, please read "PyDev, Python and system default Unicode encoding problem".
Something like this could solve it.
diff -r 56eaf32027f4 wikiserver.py
--- a/wikiserver.py Tue Feb 07 21:48:50 2012 +0100
+++ b/wikiserver.py Fri Feb 17 22:30:22 2012 +0100
@@ -7,6 +7,8 @@
"""
import sys, os
+reload(sys)
+sys.setdefaultencoding("utf-8")
# a) Configuration of Python's code search path
# If you already have set up the PYTHONPATH environment variable for the
http://stackoverflow.com/questions/3828723/why-we-need-sys-setdefaultencodingutf-8-in-py-scipt
or at
diff -r 56eaf32027f4 wikiconfig.py
--- a/wikiconfig.py Tue Feb 07 21:48:50 2012 +0100
+++ b/wikiconfig.py Fri Feb 17 22:40:50 2012 +0100
@@ -8,7 +8,7 @@
import sys, os
-from MoinMoin.config import multiconfig, url_prefix_static
+from MoinMoin.config import multiconfig, url_prefix_static, charset
class LocalConfig(multiconfig.DefaultConfig):
@@ -43,7 +43,8 @@
# Add your configuration items here.
secrets = 'This string is NOT a secret, please make up your own, long, random secret string!'
-
+ reload(sys)
+ sys.setdefaultencoding(charset)
# DEVELOPERS! Do not add your configuration items there,
# you could accidentally commit them! Instead, create a
# wikiconfig_local.py file containing this:
Changing the default encoding is evil. We should rather fix wrong data types (str vs. unicode) so it does not need to do implicit encoding/decoding. -- ThomasWaldmann 2012-02-18 15:18:27
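As a rough illustration of fixing the data types instead of the default encoding, text can be encoded explicitly at the point where it meets byte data, so nothing is left to implicit conversion. This is a sketch under assumptions, not the Moin fix: to_bytes and build_message are hypothetical helpers, and the header string is invented for the example.

```python
# -*- coding: utf-8 -*-

def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding text explicitly."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def build_message(header_text, body):
    """Join headers and body as bytes without any implicit conversion."""
    return to_bytes(header_text) + to_bytes(body)


# Hypothetical request pieces, for illustration only.
headers = u"POST /?action=xmlrpc2 HTTP/1.0\r\n\r\n"
body = u"СудалгааныТанилцуулга".encode("utf-8")

message = build_message(headers, body)
```

Because both operands are byte strings before they are concatenated, the result does not depend on the interpreter's default encoding.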
diff -r 1ddf7d88c53d MoinMoin/wikisync.py
--- a/MoinMoin/wikisync.py Thu Mar 01 00:15:41 2012 +0100
+++ b/MoinMoin/wikisync.py Sun Mar 04 21:00:10 2012 +0100
@@ -166,7 +166,7 @@
_ = self.request.getText
wikitag, wikiurl, wikitail, wikitag_bad = wikiutil.resolve_interwiki(self.request, interwikiname, '')
- self.wiki_url = wikiutil.mapURL(self.request, wikiurl)
+ self.wiki_url = wikiutil.mapURL(self.request, wikiurl.encode("utf-8"))
self.valid = not wikitag_bad
self.xmlrpc_url = self.wiki_url + "?action=xmlrpc2"
if not self.valid:
We can encode the URL too.
In 2.7, httplib does:
if isinstance(message_body, str):
    msg += message_body
    message_body = None
self.send(msg)
It assumes that if message_body is an instance of str, msg is of the same type. That is not the case in our current code: without the change, the URL is unicode.
In SyncPages we have often hardcoded utf-8 for decoding instead of using config.charset. In many other places we also encode URL parts.
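The failure can be reproduced outside httplib. When Python 2 concatenates a unicode object with a str, it decodes the str with the default ASCII codec. The sketch below performs that decode explicitly, so it behaves the same on Python 2 and 3; the variable names are illustrative, and the byte string stands in for the marshalled XML-RPC request body.

```python
# -*- coding: utf-8 -*-

page_name = u"СудалгааныТанилцуулга"

# Stand-in for the request body: the page name as UTF-8 bytes.
request_body = page_name.encode("utf-8")

try:
    # This is what Python 2 does implicitly for `unicode_msg += str_body`.
    request_body.decode("ascii")
    offending = None
except UnicodeDecodeError as err:
    # Keep the first byte that the ASCII codec rejected.
    offending = err.object[err.start:err.start + 1]
```

The rejected byte is 0xd0, the first byte of the UTF-8 encoding of "С", which matches the byte named in the traceback.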
Please paste your full wikisync config page. -- AlexanderSchremmer
Proposed solution (please try it, didn't test it):
diff -r 1ddf7d88c53d MoinMoin/wikisync.py
--- a/MoinMoin/wikisync.py Thu Mar 01 00:15:41 2012 +0100
+++ b/MoinMoin/wikisync.py Sun Mar 04 21:00:10 2012 +0100
@@ -166,7 +166,7 @@
_ = self.request.getText
wikitag, wikiurl, wikitail, wikitag_bad = wikiutil.resolve_interwiki(self.request, interwikiname, '')
self.wiki_url = wikiutil.mapURL(self.request, wikiurl)
self.valid = not wikitag_bad
- self.xmlrpc_url = self.wiki_url + "?action=xmlrpc2"
+ self.xmlrpc_url = str(self.wiki_url + "?action=xmlrpc2") # url MUST be str. unicode would lead to msg_body decoding issues in py 2.7 httplib.
if not self.valid:
Plan
- Priority:
- Assigned to:
Status: fixed with http://hg.moinmo.in/moin/1.9/rev/4541d744d740
