Geoffrey Sneddon wrote:
> Yeah, I started an entire Unicode implementation in userland PHP.
> Let's just say it became rather large while getting nowhere. :)

So, the trick is to only use strpos() on well-formed UTF-8, and you're golden. :-)

> This isn't really a case of the built-in implementation not working,
> it's just the built-in implementation is defined to use either UCS2 or
> UCS4 depending on a compile-time flag, which can end up being rather
> fun to deal with (look at ifragment in anolislib/utils.py in Anolis
> for example).

That's an absolutely horrid piece of code, having to match for surrogate pairs yourself. Does Python use PCRE, by any chance?

Cheers,
Edward
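
For illustration, here is a minimal sketch of why a byte-level search is safe on well-formed UTF-8. It uses Python's bytes.find() as a stand-in for PHP's strpos(); the helper names utf8_find and byte_offset_to_char_index are made up for this example and are not part of html5lib or Anolis. It assumes both inputs are valid Unicode strings, so their encodings are well-formed.

```python
def utf8_find(haystack: str, needle: str) -> int:
    """Return the byte offset of needle in haystack's UTF-8 bytes, or -1.

    Hypothetical helper: bytes.find() plays the role of PHP's strpos().
    """
    return haystack.encode("utf-8").find(needle.encode("utf-8"))


def byte_offset_to_char_index(haystack: str, byte_offset: int) -> int:
    """Convert a byte offset into a character index.

    Decoding the prefix raises UnicodeDecodeError if the offset falls in
    the middle of a multi-byte sequence, so success also confirms that
    the match started on a character boundary.
    """
    prefix = haystack.encode("utf-8")[:byte_offset]
    return len(prefix.decode("utf-8"))


text = "naïve café 𝄞 clef"  # includes U+1D11E, which lies outside the BMP
offset = utf8_find(text, "𝄞")
index = byte_offset_to_char_index(text, offset)
```

Because lead bytes and continuation bytes occupy disjoint ranges in UTF-8, a well-formed needle can only match at a character boundary in a well-formed haystack, so no surrogate-pair handling is needed at the byte level.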