Fredrik Lundh wrote:
>
> --------------------------------------------------------------------
> A PIL-like Unicode Codec Proposal
> --------------------------------------------------------------------
>
> In the PIL model, the codecs are called with a piece of data, and
> returns the result to the caller.  The codecs maintain internal state
> when needed.
>
> class decoder:
>
>     def decode(self, s, offset=0):
>         # decode as much data as we possibly can from the
>         # given string.  if there's not enough data in the
>         # input string to form a full character, return
>         # what we've got this far (this might be an empty
>         # string).
>
>     def flush(self):
>         # flush the decoding buffers.  this should usually
>         # return None, unless the fact that knowing that the
>         # input stream has ended means that the state can be
>         # interpreted in a meaningful way.  however, if the
>         # state indicates that there last character was not
>         # finished, this method should raise a UnicodeError
>         # exception.

Could you explain the reason for having a .flush() method
and what it should return?

Note that the .decode method is not much different
from my Codec.decode method, except that it uses a single
offset where my version uses a slice (the offset is probably
the better variant, because it avoids data truncation).

> class encoder:
>
>     def encode(self, u, offset=0, buffersize=0):
>         # encode data from the given offset in the input
>         # unicode string into a buffer of the given size
>         # (or slightly larger, if required to proceed).
>         # if the buffer size is 0, the decoder is free
>         # to pick a suitable size itself (if at all
>         # possible, it should make it large enough to
>         # encode the entire input string).  returns a
>         # 2-tuple containing the encoded data, and the
>         # number of characters consumed by this call.

Ditto.

>     def flush(self):
>         # flush the encoding buffers.  returns an ordinary
>         # string (which may be empty), or None.
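
For illustration only, here is one possible reading of the quoted decoder interface, written as a minimal sketch in present-day Python. The class name Utf8Decoder is hypothetical, the input is assumed to be a bytes object, and the buffering is delegated to the standard library's codecs.getincrementaldecoder() rather than to anything defined in the proposal itself. It shows decode() returning whatever complete characters are available and flush() raising UnicodeError when the input ended in the middle of a character.

```python
import codecs


class Utf8Decoder:
    """Hypothetical decoder shaped like the proposed decode()/flush() pair."""

    def __init__(self):
        # Assumption: reuse the stdlib incremental decoder to hold the
        # bytes of a character that has not been completed yet.
        self._inner = codecs.getincrementaldecoder("utf-8")()

    def decode(self, s, offset=0):
        # Decode as much as possible starting at `offset`. An incomplete
        # trailing character is kept as internal state, so the result
        # may be an empty string.
        return self._inner.decode(s[offset:], final=False)

    def flush(self):
        # Signal end of input. If a character was left unfinished, the
        # stdlib decoder raises UnicodeDecodeError, which is a subclass
        # of UnicodeError. Otherwise there is nothing left to return.
        self._inner.decode(b"", final=True)
        return None
```

Feeding the two bytes of a two-byte character in separate decode() calls yields the character only on the second call; calling flush() after only the first byte raises UnicodeError.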
>
> Note that a codec instance can be used for a single string; the codec
> registry should hold codec factories, not codec instances.  In
> addition, you may use a single type or class to implement both
> interfaces at once.

Perhaps I'm missing something, but how would you define
stream codecs using this interface?

> Implementing stream codecs is left as an exercise (see the zlib
> material in the eff-bot guide for a decoder example).

...?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    44 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/