On approximately 12/8/2008 12:57 AM, came the following characters from
the keyboard of Stephen J. Turnbull:

> "Internal decoding" is (or should be) an oxymoron.  Why would your
> software be passing around text in any format other than internal?  So
> decoding will happen (a) on I/O, which is itself almost certainly
> slower than making a few checks for Unicode hygiene, or (b) on receipt
> of data from other software whose sanitation you shouldn't trust
> more than you trust the Internet.
>
> Encoding isn't a problem, AFAICS.

So I can see validating user-supplied data, which always comes in via
I/O.  But during manipulation of internal data, including file and
database I/O, there is a need for encoding and decoding also.  If all
the data has already been validated, then there would be no need to
revalidate on every conversion.

I hear you when you say that clever coding can make the validation
nearly free, and I applaud that: the UTF-8 coder that I wrote predated
most of the rules that have been created since, so I didn't attempt to
be clever in that regard.

Thanks to you and Adam for your explanations; I see your points, and if
it is nearly free, I withdraw most of my negativity on this topic.

-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
  -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
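
[Editorial sketch, not part of the original thread: a minimal Python
illustration of the "validate once, at the boundary" pattern under
discussion. The file paths and function names here are hypothetical;
the point is only that decoding at input both converts and validates,
while internal text and later encoding need no revalidation step.]

    def read_text(path):
        """Decode (and validate) external UTF-8 data at the I/O boundary."""
        with open(path, "rb") as f:
            raw = f.read()
        # .decode() raises UnicodeDecodeError on malformed UTF-8, so any
        # text that gets past this point is known-good and can be passed
        # around internally without further checks.
        return raw.decode("utf-8")

    def write_text(path, text):
        """Encoding already-validated internal text needs no validity check."""
        with open(path, "wb") as f:
            f.write(text.encode("utf-8"))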