Tags: Python

In a previous post I talked about PyRXP. Now I'm playing with pyRXPU, which aims to be PyRXP with Unicode support. But I'm not sure I understand the garbage strings it returns:

>>> import pyRXPU
>>> pyRXPU.version
'1.05'
>>> doc = pyRXPU.Parser().parse('<foo>Bar</foo>')
>>> doc
(u'\U006f0066o\x10', None, [u'\U00610042r\x10'], None)

It looks like UTF-16, but I couldn't manage to print these strings:

>>> print doc[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'latin-1' codec can't encode character u'\U006f0066' in position 0: ordinal not in range(256)
>>> print doc[0].encode('utf16')
ÿþÛfÜo

Where's my foo? :)
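
A possible reading of that output, offered as a guess rather than a diagnosis: the odd code point u'\U006f0066' looks like two 16-bit code units (0x0066 'f' and 0x006f 'o') packed into one 32-bit character. That would fit a parser that writes 16-bit characters into a buffer which a wide (UCS-4) Python build then reads as 32-bit characters, but I have not checked how pyRXPU was built, so treat it as an assumption. The sketch below only checks the arithmetic. The helper name split_packed_code_point is made up for illustration and is not part of pyRXPU, and it is written for Python 3 even though the session above is Python 2:

```python
import struct

def split_packed_code_point(value):
    """Reinterpret one 32-bit value as two little-endian UTF-16 code units.

    Hypothetical helper for illustration only; not part of pyRXPU.
    Assumes the two characters fit in one 16-bit unit each (no surrogates).
    """
    raw = struct.pack('<I', value)   # 4 bytes, least significant byte first
    return raw.decode('utf-16-le')   # read them back as two 16-bit characters

# The two strange code points from the parse result above.
print(split_packed_code_point(0x006F0066))  # fo
print(split_packed_code_point(0x00610042))  # Ba
```

If that reading is right, the "foo" and "Bar" are still in there, just two characters per code point, and the trailing 'o\x10' and 'r\x10' would be the last character followed by whatever happened to sit after the end of the buffer.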