On 2011-08-23, at 10:55, Martin v. Löwis wrote:
>> - “The UTF-8 decoding fast path for ASCII only characters was removed
>> and replaced with a memcpy if the entire string is ASCII.”
>> The fast path would still be useful for mostly-ASCII strings, which
>> are extremely common (unless UTF-8 has become a no-op?).
>
> Is it really extremely common to have strings that are mostly-ASCII but
> not completely ASCII? I would agree that pure ASCII strings are
> extremely common.

Mostly-ASCII text is pretty common in Western European languages (French, for instance, is probably 90 to 95% ASCII). It's also a risk in English, when the writer "correctly" spells foreign words (résumé and the like).
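
To put a rough number on "mostly ASCII", here is a minimal sketch that measures the share of ASCII code points in a string. The helper names `ascii_fraction` and `is_pure_ascii` and the sample sentence are made up for illustration; this only counts characters and says nothing about how CPython's decoder is actually implemented.

```python
def ascii_fraction(text: str) -> float:
    """Return the share of code points in `text` that are ASCII (< 128)."""
    if not text:
        return 1.0  # assumption: treat the empty string as trivially ASCII
    ascii_count = sum(1 for ch in text if ord(ch) < 128)
    return ascii_count / len(text)


def is_pure_ascii(text: str) -> bool:
    """True only when every code point is ASCII (the memcpy-style case)."""
    return all(ord(ch) < 128 for ch in text)


# Hypothetical sample: a short French sentence with a few accented letters.
sample = "Le résumé a été envoyé à l'école hier."

if __name__ == "__main__":
    print(f"pure ASCII: {is_pure_ascii(sample)}")
    print(f"ASCII share: {ascii_fraction(sample):.0%}")
```

A string like the sample fails the all-ASCII check even though most of its characters are ASCII, which is the case the quoted fast path was meant to help with.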