On Apr 9, 2009, at 11:11 PM, glyph at divmod.com wrote:

> I think this is a problematic way to model bytes vs. text; it gives
> text a special relationship to bytes which should be avoided.
>
> IMHO the right way to think about domains like this is a multi-level
> representation. The "low level" representation is always bytes,
> whether your MIME type is text/whatever or application/x-i-dont-know.

This is a really good point, and I really should be clearer when describing my current thinking (sleep would help :).

> The thing that's "special" about text is that it's a "high level"
> representation that the standard library can know about. But the
> 'email' package ought to support being extended to support other
> types just as well. For example, I want to ask for image/png
> content as PIL.Image objects, not bags of bytes. Of course this
> presupposes some way for PIL itself to get at some bytes, but then
> you need the email module itself to get at the bytes to convert to
> text in much the same way. There also needs to be layering at the
> level of bytes->base64->some different bytes->PIL->Image. There are
> mail clients that will base64-encode unusual encodings so you have
> to do that same layering for text sometimes.
>
> I'm also being somewhat handwavy with talk of "low" and "high" level
> representations; of course there are actually multiple levels beyond
> that. I might want text/x-python content to show up as an AST, but
> the intermediate DOM-parsing representation really wants to operate
> on characters. Similarly for a DOM and text/html content. (Modulo
> the usual encoding-detection weirdness present in parsers.)

When I was talking about supporting text/* content types as strings, I was definitely thinking about using basically the same plug-in or higher level or whatever API to do that as you might use to get PIL images from an image/gif.
> So, as long as there's a crisp definition of what layer of the MIME
> stack one is operating on, I don't think that there's really any
> ambiguity at all about what type you should be getting.

In that case, we really need the bytes-in-bytes-out-bytes-in-the-chewy-center API first, and build things on top of that.

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 304 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/python-dev/attachments/20090409/25c444cd/attachment.pgp>