Walter Dörwald wrote:
> M.-A. Lemburg wrote:
>> Walter Dörwald wrote:
>>> Let's compare example uses:
>>>
>>> 1) Having feed() as part of the StreamReader API:
>>> ---
>>> s = u"???".encode("utf-8")
>>> r = codecs.getreader("utf-8")()
>>> for c in s:
>>>     print r.feed(c)
>>> ---
>>
>> I consider adding a .feed() method to the stream codec
>> bad design. .feed() is something you do on a stream, not
>> a codec.
>
> I don't care about the name, we can call it
> stateful_decode_byte_chunk() or whatever. (In fact I'd
> prefer to call it decode(), but that name is already
> taken by another method. Of course we could always
> rename decode() to _internal_decode() like Martin
> suggested.)

It's not the name that doesn't fit, it's the fact that you are
mixing a stream action into a codec which I'd rather see well
separated.

>>> 2) Explicitly using a queue object:
>>> ---
>>> from whatever import StreamQueue
>>>
>>> s = u"???".encode("utf-8")
>>> q = StreamQueue()
>>> r = codecs.getreader("utf-8")(q)
>>> for c in s:
>>>     q.write(c)
>>>     print r.read()
>>> ---
>>
>> This is probably how an advanced codec writer would use the APIs
>> to build new stream interfaces.
>
>>> 3) Using a special wrapper that implicitly creates a queue:
>>> ----
>>> from whatever import StreamQueueWrapper
>>> s = u"???".encode("utf-8")
>>> r = StreamQueueWrapper(codecs.getreader("utf-8"))
>>> for c in s:
>>>     print r.feed(c)
>>> ----
>>
>> This could be turned into something more straightforward,
>> e.g.
>>
>> from codecs import EncodedStream
>>
>> # Load data
>> s = u"???".encode("utf-8")
>>
>> # Write to encoded stream (one byte at a time) and print
>> # the read output
>> q = EncodedStream(input_encoding="utf-8", output_encoding="unicode")
>
> This is confusing, because there is no encoding named "unicode".
> This should probably read:
>
> q = EncodedQueue(encoding="utf-8", errors="strict")

Fine. I was thinking of something similar to EncodedFile()
which also has two separate encodings, one for the file side
of things and one for the Python side.

>> for c in s:
>>     q.write(c)
>>     print q.read()
>>
>> # Make sure we have processed all data:
>> if q.has_pending_data():
>>     raise ValueError, 'data truncated'
>
> This should be the job of the error callback, the last part should
> probably be:
>
> for c in s:
>     q.write(c)
>     print q.read()
> print q.read(final=True)

Ok; both methods have their use cases. (You seem to be
obsessed with this final argument ;-)

>>> I very much prefer option 1).
>>
>> I prefer the above example because it's easy to read and
>> makes things explicit.
>>
>>> "If the implementation is hard to explain, it's a bad idea."
>>
>> The user usually doesn't care about the implementation, only its
>> interfaces.
>
> Bye,
>    Walter Dörwald

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 19 2004)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
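A note for readers following along today: StreamQueue never shipped under
that name, so the sketch below (Python 3 syntax) supplies a minimal
stand-in; the class name, its buffer layout and the sample string are
assumptions made purely for illustration, not any of the APIs debated
above. It shows the idea behind option 2: a file-like FIFO buffer in
front of an ordinary codecs.StreamReader lets data that arrives one byte
at a time be decoded as it trickles in, with incomplete multi-byte
sequences held back until they can be completed.

---
import codecs

class StreamQueue:
    """Hypothetical helper: a file-like FIFO byte buffer.

    write() appends bytes; read() returns and removes up to `size`
    bytes (everything written so far, by default).
    """
    def __init__(self):
        self._buffer = b""

    def write(self, data):
        self._buffer += data

    def read(self, size=-1):
        if size < 0:
            size = len(self._buffer)
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

# One character that encodes to three UTF-8 bytes, so the reader has
# to buffer incomplete sequences between calls.
s = "\u20ac".encode("utf-8")

q = StreamQueue()
r = codecs.getreader("utf-8")(q)   # StreamReader wrapping the queue

for byte in s:
    q.write(bytes([byte]))
    # Prints '' for the first two bytes, then the decoded character
    # once the third byte completes the sequence.
    print(repr(r.read()))
---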
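On the read(final=True) point: the incremental decoder API added to the
codecs module in later Python versions exposes a flag of exactly this
kind, so the behaviour Walter sketches can be tried there. The short
example below (Python 3) only illustrates the `final` semantics; it is
not the EncodedQueue design discussed above.

---
import codecs

dec = codecs.getincrementaldecoder("utf-8")()

# Feed a truncated three-byte sequence; nothing is decodable yet, and
# no error is raised because more input may still arrive.
print(repr(dec.decode(b"\xe2\x82")))        # ''

try:
    # final=True signals end of input, so the pending incomplete
    # sequence now triggers the error handler ('strict' raises).
    dec.decode(b"", final=True)
except UnicodeDecodeError as exc:
    print("data truncated:", exc)
---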