On Wed, 24 Nov 2010 18:51:49 +0900
"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> James Y Knight writes:
>
>  > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
>  > superior [...] because it is an ASCII superset, and thus more
>  > easily compatible with other software. That also makes it most
>  > commonly used for internet communication.
>
> Sure, UTF-8 is very nice as a protocol for communicating text.  So
> what?  If your application involves shoveling octets real fast, don't
> convert and shovel those octets.  If your application involves
> significant text processing, well, conversion can almost always be
> done as fast as you can do I/O so it doesn't cost wallclock time, and
> generally doesn't require a huge percentage of CPU time compared to
> the actual text processing.  It's just a specialization of
> serialization, that we do all the time for more complex data
> structures.
>
> So wire protocols are not a killer argument for or against any
> particular internal representation of text.

Agreed. Decoding and encoding utf-8 is so fast that it should be dwarfed
by any actual processing done on the text.

Regards

Antoine.
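
For anyone who wants to check the two claims in this thread on their own machine, here is a minimal sketch. It is not taken from the thread: the helper names (`roundtrip_seconds`, `word_count_seconds`), the sample strings and the use of word splitting as a stand-in for "actual text processing" are all assumptions made for illustration. Timings depend on the interpreter, the platform and the text, so no expected numbers are given.

```python
import timeit


def roundtrip_seconds(text, encoding, repeat=200):
    """Time `repeat` encode+decode round trips of `text` (hypothetical helper)."""
    return timeit.timeit(
        lambda: text.encode(encoding).decode(encoding), number=repeat
    )


def word_count_seconds(text, repeat=200):
    """Time a trivial stand-in for text processing: splitting into words."""
    return timeit.timeit(lambda: len(text.split()), number=repeat)


# Assumed sample: mostly ASCII, with a few non-ASCII characters mixed in.
sample = "The quick brown fox, na\u00efve caf\u00e9, \u65e5\u672c\u8a9e. " * 2000

# The "ASCII superset" point: pure-ASCII text has the same bytes in UTF-8
# as in ASCII, which is not true of UTF-16.
ascii_text = "GET /index.html HTTP/1.1"
is_ascii_superset = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
utf16_differs = ascii_text.encode("utf-16-le") != ascii_text.encode("ascii")

if __name__ == "__main__":
    print("utf-8 is byte-identical for ASCII:", is_ascii_superset)
    print("utf-16 differs for ASCII         :", utf16_differs)
    print("utf-8 round trip  (s):", roundtrip_seconds(sample, "utf-8"))
    print("utf-16 round trip (s):", roundtrip_seconds(sample, "utf-16"))
    print("word count        (s):", word_count_seconds(sample))
```

Word splitting is about the cheapest processing one could pick, so it is a conservative comparison; heavier work such as regex matching or parsing would be expected to take longer relative to the conversion step.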