Alexander Belopolsky wrote:

> """
> Because the most commonly used characters are all in the Basic
> Multilingual Plane, converting between surrogate pairs and the
> original values is often not tested thoroughly. This leads to
> persistent bugs, and potential security holes, even in popular and
> well-reviewed application software.
> """

Maybe Python should have used UTF-8 as its internal unicode
representation. Then people who were foolish enough to assume
one character per string item would have their programs break
rather soon under only light unicode testing. :-)

--
Greg
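
A minimal sketch of the point above, assuming we simply count code units per character in each encoding. The helper name `code_units` and the sample characters are illustrative choices, not anything from the thread. Under UTF-16 only characters outside the Basic Multilingual Plane need more than one unit, while under UTF-8 every non-ASCII character does, so a one-item-per-character assumption would tend to fail on much more ordinary test data.

```python
# Hypothetical helper: how many code units does `text` occupy in an encoding?
UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2}  # bytes per code unit


def code_units(text: str, encoding: str) -> int:
    """Return the number of code units `text` takes in `encoding`."""
    return len(text.encode(encoding)) // UNIT_SIZE[encoding]


samples = {
    "ASCII letter": "a",
    "BMP, non-ASCII": "\u00e9",   # e with acute accent, inside the BMP
    "outside the BMP": "\U0001d11e",  # musical G clef, needs a surrogate pair
}

for label, ch in samples.items():
    print(label, code_units(ch, "utf-16-le"), code_units(ch, "utf-8"))
```

With these samples, the UTF-16 count only exceeds one for the last character, whereas the UTF-8 count already exceeds one for the accented letter.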