Guido van Rossum wrote:
>
> > It is not required by the unicodec.register() API to provide a
> > subclass of these base class, only the given methods must be present;
> > this allows writing Codecs as extensions types.  All Codecs must
> > provide the .encode()/.decode() methods. Codecs having the .read()
> > and/or .write() methods are considered to be StreamCodecs.
> >
> > The Unicode implementation will by itself only use the
> > stateless .encode() and .decode() methods.
> >
> > All other conversion have to be done by explicitly instantiating
> > the appropriate [Stream]Codec.
>
> Looks okay, although I'd like someone to implement a simple
> shift-state-based stream codec to check this out further.
>
> I have some questions about the constructor.  You seem to imply
> that instantiating the class without arguments creates a codec without
> state.  That's fine.  When given a stream argument, shouldn't the
> direction of the stream be given as an additional argument, so the
> proper state for encoding or decoding can be set up?  I can see that
> for an implementation it might be more convenient to have separate
> classes for encoders and decoders -- certainly the state being kept is
> very different.

Wouldn't it be possible to have the read/write methods set up
the state when called for the first time?

Note that I wrote ".read() and/or .write() methods" in the proposal
on purpose: you can of course implement Codecs which only implement
one of them, i.e. Readers and Writers. The registry doesn't care
about them anyway :-)

Then, if you use a Reader for writing, it will result in an
AttributeError...

> Also, I don't want to ignore the alternative interface that was
> suggested by /F.  It uses feed() similar to htmllib c.s.  This has
> some advantages (although we might want to define some compatibility
> so it can also feed directly into a file).

AFAIK, .feed() and .finalize() (or .close() etc.) have a different
background: you add data in chunks and then process it at some
final stage rather than for each feed. This is often more
efficient.

With respect to codecs this would mean that you buffer the
output in memory, first doing only preliminary operations on the
feeds, and then apply some final logic to the buffer at the time
.finalize() is called.

We could define a StreamCodec subclass for this kind of operation.

> Perhaps someone should go ahead and implement prototype codecs using
> either paradigm and then write some simple apps, so we can make a
> better decision.
>
> In any case I think the specs codec registry API aren't on the
> critical path, integration of /F's basic unicode object is the first
> thing we need.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    45 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
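
For illustration only, here is a minimal sketch of the two paradigms
discussed above: a Writer-only stream codec that sets up its state on
the first .write() call, and a .feed()/.finalize() subclass that buffers
chunks and encodes them once at the end. It is written in present-day
Python 3, not the Python of this thread. The class names
(LazyShiftWriter, FeedShiftEncoder), the .close() behaviour, and the toy
shift-state encoding are all assumptions made up for this example; none
of them come from the proposal.

```python
import io


class LazyShiftWriter:
    """Hypothetical Writer-only codec; state is created on first write().

    Toy encoding invented for this sketch: ASCII passes through, any
    other character becomes hex digits bracketed by shift bytes.
    """

    SHIFT_IN = b"\x0e"   # enter "wide" mode
    SHIFT_OUT = b"\x0f"  # return to ASCII mode

    def __init__(self, stream=None):
        self.stream = stream
        self.shifted = None  # no encoder state until write() is called

    def encode(self, text):
        # Stateless: uses a throwaway writer and always ends unshifted.
        buf = io.BytesIO()
        writer = LazyShiftWriter(buf)
        writer.write(text)
        writer.close()
        return buf.getvalue()

    def write(self, text):
        if self.shifted is None:
            self.shifted = False  # first call sets up the state
        out = bytearray()
        for ch in text:
            wide = ord(ch) > 0x7F
            if wide != self.shifted:
                out += self.SHIFT_IN if wide else self.SHIFT_OUT
                self.shifted = wide
            out += b"%04x" % ord(ch) if wide else ch.encode("ascii")
        self.stream.write(bytes(out))

    def close(self):
        # Leave the output in ASCII mode; the stream itself stays open.
        if self.shifted:
            self.stream.write(self.SHIFT_OUT)
            self.shifted = False


class FeedShiftEncoder(LazyShiftWriter):
    """Hypothetical feed()/finalize() variant: buffer first, encode last."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def feed(self, text):
        self.chunks.append(text)  # preliminary step: just collect

    def finalize(self):
        return self.encode("".join(self.chunks))
```

Since LazyShiftWriter defines no .read() method, calling
LazyShiftWriter(io.BytesIO()).read(10) raises AttributeError, which is
the behaviour described above for using the wrong kind of codec. The
shift state survives between .write() calls, while .encode() never
touches the state of the instance it is called on.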