On 6/22/2010 1:22 AM, Glyph Lefkowitz wrote:

> The thing that I have heard in passing from a couple of folks with
> experience in this area is that some older software in asia would
> present characters differently if they were originally encoded in a
> "japanese" encoding versus a "chinese" encoding, even though they were
> really "the same" characters.

As I tried to say in another post, that to me is similar to wanting to
present English text in different fonts depending on whether it is spoken
by an American or a Brit, or by a modern person versus a Renaissance person.

> I do know that Han Unification is a giant political mess
> (<http://en.wikipedia.org/wiki/Han_unification> makes for some

Thanks, I will take a look.

> interesting reading), but my understanding is that it has handled enough
> of the cases by now that one can write software to display asian
> languages and it will basically work with a modern version of unicode.
> (And of course, there's always the private use area, as Stephen Turnbull
> pointed out.)
>
> Regardless, this is another example where keeping around a string isn't
> really enough. If you need to display a japanese character in a distinct
> way because you are operating in the japanese *script*, you need a tag
> surrounding your data that is a hint to its presentation. The fact that
> these presentation hints were sometimes determined by their encoding is
> an unfortunate historical accident.

Yes. The Asian languages I know anything about seem to natively have
almost none of the symbols English has, many borrowed from math, that
have been pressed into service for text markup.

--
Terry Jan Reedy