M.-A. Lemburg wrote:
> Walter Dörwald wrote:
>
>>> I've thought about this some more. Perhaps I'm still missing
>>> something, but wouldn't it be possible to add a feeding
>>> mode to the existing stream codecs by creating a new queue
>>> data type (much like the queue you have in the test cases of
>>> your patch) and using the stream codecs on these?
>>
>>
>> No, because when the decode method encounters an incomplete
>> chunk (and so returns a size that is smaller than the size of
>> the input) read() would have to push the remaining bytes back
>> into the queue. This would be code similar in functionality
>> to the feed() method from the patch, with the difference that
>> the buffer lives in the queue, not the StreamReader. So
>> we won't gain any code simplification by going this route.
>
> Maybe not code simplification, but the APIs will be
> well-separated.

They will not, because StreamReader.decode() already is a
feed-style API (but with state amnesia).

Any stream decoder that I can think of can be (and most are)
implemented by overriding decode().

> If we require the queue type for feeding mode operation
> we are free to define whatever APIs are needed to communicate
> between the codec and the queue type, e.g. we could define
> a method that pushes a few bytes back onto the queue end
> (much like ungetc() in C).

That would of course be a possibility.

>>> I think such a queue would be generally useful in other
>>> contexts as well, e.g. for implementing fast character based
>>> pipes between threads, non-Unicode feeding parsers, etc.
>>> Using such a type you could potentially add a feeding
>>> mode to stream or file-object API based algorithms very
>>> easily.
>>
>> Yes, so we could put this Queue class into a module with
>> string utilities. Maybe string.py?
>
> Hmm, I think a separate module would be better since we
> could then recode the implementation in C at some point
> (and after the API has settled).
> We'd only need a new name for it, e.g. StreamQueue or
> something.

Sounds reasonable.

>>> We could then have a new class, e.g. FeedReader, which
>>> wraps the above in a nice API, much like StreamReaderWriter
>>> and StreamRecoder.
>>
>> But why should we, when decode() does most of what we need,
>> and the rest has to be implemented in both versions?
>
> To hide the details from the user. It should be possible
> to instantiate one of these StreamQueueReaders (named
> after the queue type) and simply use it in feeding
> mode without having to bother about the details behind
> the implementation.
>
> StreamReaderWriter and StreamRecoder exist for the same
> reason.

Let's compare example uses:

1) Having feed() as part of the StreamReader API:

---
import codecs

s = u"???".encode("utf-8")
r = codecs.getreader("utf-8")()
for c in s:
    print r.feed(c)
---

2) Explicitly using a queue object:

---
import codecs
from whatever import StreamQueue

s = u"???".encode("utf-8")
q = StreamQueue()
r = codecs.getreader("utf-8")(q)
for c in s:
    q.write(c)
    print r.read()
---

3) Using a special wrapper that implicitly creates a queue:

----
import codecs
from whatever import StreamQueueWrapper

s = u"???".encode("utf-8")
r = StreamQueueWrapper(codecs.getreader("utf-8"))
for c in s:
    print r.feed(c)
----

I very much prefer option 1).

"If the implementation is hard to explain, it's a bad idea."

Bye,
   Walter Dörwald
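
For comparison, here is a rough Python 3 sketch of option 2 next to a feed-style loop. It is only an illustration under stated assumptions: StreamQueue below is a hypothetical minimal class (nothing by that name exists in the standard library), and read_via_queue and feed_incrementally are made-up helper names. The feed-style half uses codecs.getincrementaldecoder(), which is a real standard-library call but is not the feed() method from the patch discussed above.

```python
import codecs


class StreamQueue:
    """Hypothetical byte queue: write() appends, read() drains from the front."""

    def __init__(self):
        self._buffer = b""

    def write(self, data):
        self._buffer += data

    def read(self, size=-1):
        # A negative size means "return everything currently queued".
        if size is None or size < 0:
            data, self._buffer = self._buffer, b""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data


def read_via_queue(text, encoding="utf-8"):
    """Option 2: push one byte at a time into a queue, then read() the reader."""
    raw = text.encode(encoding)
    queue = StreamQueue()
    reader = codecs.getreader(encoding)(queue)
    pieces = []
    for i in range(len(raw)):
        queue.write(raw[i:i + 1])
        # The reader keeps incomplete byte sequences in its own buffer.
        pieces.append(reader.read())
    return pieces


def feed_incrementally(text, encoding="utf-8"):
    """Feed-style loop: hand each byte straight to a stateful decoder."""
    raw = text.encode(encoding)
    decoder = codecs.getincrementaldecoder(encoding)()
    return [decoder.decode(raw[i:i + 1]) for i in range(len(raw))]


if __name__ == "__main__":
    sample = "a\u00e4\u20ac"  # 1-byte, 2-byte and 3-byte UTF-8 sequences
    print(read_via_queue(sample))
    print(feed_incrementally(sample))
```

In both variants an empty string comes back while a multi-byte sequence is still incomplete; the difference is only whether the caller has to create and manage the queue object.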