Thank you, Martin, you have been patient and persistent. I've finally
started to sort out some of my problems. It appears that I confused \x
and \u while frustratedly trying to move forward ;-)

Today I finally made some progress. My problem was that importing text
files with Unicode escapes really does require codecs (importing text
files, not Python scripts).

    (UNIESCAPE_encode, UNIESCAPE_decode, UNIESCAPE_streamreader,
     UNIESCAPE_streamwriter) = codecs.lookup('raw-unicode-escape')
    onea = UNIESCAPE_streamreader(
        codecs.open('H:\\jy\\encodings\\KSCIIOne.txt', 'r', 'raw-unicode-escape'))
    oneencoding = onea.read()

The resulting text I can then read line by line by using:

    for encodingline in oneencoding.split('\n'):

My confusion was the apparently redundant use of raw-unicode-escape
(which allows reading \u1780-style characters).

Export (I decided to use the more easily readable utf-16 encoding)
similarly requires:

    (UTF16_encode, UTF16_decode, UTF16_streamreader,
     UTF16_streamwriter) = codecs.lookup('utf-16')
    outdocument = UTF16_streamwriter(
        codecs.open('h:\\jy\\outtest.txt', 'wb', 'utf-16'))

I trust this helps someone else who confronts similar problems ;-)

Cheers,

Maurice

Martin von Loewis wrote:

> Maurice Bauhahn <bauhahnm at clara.net> writes:
>
> > My problem is importing such escapes from a file. Can you do that? I
> > note also that you are using version 2.0 which is not documented to
> > have the two hex character limitation.
>
> Just have a look at Fredrik Lundh's example. He *did* load the strings
> from a file using execfile.
>
> I guess you'll have to supply an exact test case, with all the files
> you've used, the file names you've given to them, the Jython version,
> the JDK version, the operating system, your day of birth, and so on.
> Otherwise, nobody can guess what you attempt to do, or why it fails
> (assuming it does fail...)
>
> Regards,
> Martin
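
For readers on a current interpreter, the import and export steps described in the message above can be sketched roughly as follows. This is only a sketch under assumptions: the thread concerns Jython 2.x, whereas this targets Python 3; the helper names read_escaped_text and write_utf16 are hypothetical; and the input is decoded in a single call instead of going through the stream reader and writer objects used in the message.

```python
import os
import tempfile


def read_escaped_text(path):
    # Hypothetical helper: read the file as bytes, then decode with
    # raw-unicode-escape so that \uXXXX sequences become the characters
    # they name. All other bytes are mapped as Latin-1.
    with open(path, "rb") as handle:
        raw = handle.read()
    return raw.decode("raw_unicode_escape")


def write_utf16(path, text):
    # Hypothetical helper: write text as UTF-16. The codec prepends a
    # byte order mark; newline="" disables newline translation.
    with open(path, "w", encoding="utf-16", newline="") as handle:
        handle.write(text)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "escaped.txt")   # stand-in input file
    target = os.path.join(workdir, "outtest.txt")   # stand-in output file

    # Sample input: a literal backslash-u escape per line, as plain ASCII.
    with open(source, "wb") as handle:
        handle.write(b"ka=\\u1780\nkha=\\u1781\n")

    text = read_escaped_text(source)
    for encodingline in text.split("\n"):
        # ascii() keeps the output printable on any console.
        print(ascii(encodingline))

    write_utf16(target, text)
```

Decoding the whole file at once keeps the sketch short; for very large inputs a streaming reader would be the more natural choice.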