> From my first impression, I'm
> not too thrilled by the prospect of making the Unicode implementation
> more complicated by having three different representations on each
> object.

Thanks, added as a concern.

> I also don't see how this could save a lot of memory. As an example
> take a French text with say 10mio code points. This would end up
> appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
> one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
> on how many accents are used). That's a saving of -10MB compared to
> today's implementation :-)

As others have pointed out: that's not how it works. It actually *will* save
memory, since the alternative representations are optional.

Regards,
Martin
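
The size figures in the quoted example can be checked with a rough sketch. It only measures encoded payload sizes per representation; it ignores object headers and does not inspect the interpreter's actual internal layout. The sample sentence and the helper name `encoded_sizes` are assumptions made for illustration, and "utf-16-le" stands in for UCS2, which matches it only while every character is in the Basic Multilingual Plane.

```python
# Rough sketch: payload bytes needed by each representation of one text.
# Assumption: the text contains only Latin-1 characters, as in the
# French-text example from the quoted message.

def encoded_sizes(text):
    """Return the payload size in bytes of each representation."""
    return {
        "latin-1": len(text.encode("latin-1")),   # 1 byte per code point
        "ucs2": len(text.encode("utf-16-le")),    # 2 bytes per BMP code point
        "utf-8": len(text.encode("utf-8")),       # 1 byte, 2 for accented letters
    }

# Hypothetical sample; any Latin-1-only French text would do.
sample = "Les élèves français étudient à l'école."
sizes = encoded_sizes(sample)

# If every representation were always materialised, the cost would be the sum.
all_three = sum(sizes.values())

# If the alternatives are optional, a string nobody asks to convert
# only pays for its one compact form.
compact_only = sizes["latin-1"]

for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / len(sample):.2f} per code point)")
print(f"all three copies: {all_three} bytes, compact form only: {compact_only} bytes")
```

Scaled to 10 million code points, the per-code-point ratios reproduce the 10MB and 20MB figures above. The "three copies" total applies only to strings for which both alternative representations have actually been requested.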