On Sat, 2003-04-12 at 07:43, Martin v. Löwis wrote:

> More or less, yes. Now, what happens if you put "real" non-ASCII
> (i.e. bytes above 127) into the message id, like so:

But I don't think you'd ever want to do that. In fact, I think in general you're probably talking about ascii msgids or utf-8 encoded Unicode msgids. I'm not sure what else would make sense.

> msgfmt will still accept that, but msgunfmt will complain:

Didn't even know about msgunfmt. :)

> msgunfmt: warning: The following msgid contains non-ASCII characters.
>           This will cause problems to translators who use a
>           character encoding different from yours. Consider
>           using a pure ASCII msgid instead.
>
> If you think about this, this is really bad: If you mean to apply the
> charset= to both msgid and msgstr, then translators using a different
> charset from yours are in big trouble.

Right, but see above. E.g. if your string literals are all Spanish and you want a Turkish translation, then utf-8 is the only common encoding you could possibly use in a .po file, right?

> They are faced with three problems:
> 1. They don't know what the charset of the msgids is. The PO files do
>    have a charset declaration, the POT files typically don't.

Yep, although it would be easy for the extractor to add a charset=utf-8 to the pot file.

> 2. They need to convert the msgids from the POT encoding to their
>    native encoding. There are no tools available to support that readily;
>    tools like iconv might correctly convert the msgids, but won't update
>    the charset= in the POT file (if the charset was filled out).
> 3. By converting the msgids, they are also changing them. That means
>    the msgids are not really suitable as keys anymore.

Is this still a problem when charset=utf-8?

-Barry
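
To make point 3 concrete, here is a minimal Python sketch. The msgid, the translation, and the plain dict standing in for a catalog are all made up for illustration; nothing here is gettext's actual lookup code. It only shows that re-encoding a non-ASCII msgid changes its bytes, so a byte-level key no longer matches, while decoding both forms to Unicode gives the same key again.

```python
# Hypothetical msgid and translation, chosen only for illustration.
msgid = "Año nuevo"
translation = "Yeni yıl"

# The same msgid as bytes in two different charsets.
latin1_key = msgid.encode("iso-8859-1")
utf8_key = msgid.encode("utf-8")

# A dict standing in for a catalog keyed on the raw msgid bytes
# (an assumption for this sketch, not how any real tool stores them).
byte_catalog = {latin1_key: translation}

# After conversion to UTF-8 the bytes differ, so the lookup misses.
found_after_conversion = utf8_key in byte_catalog

# Decoding each form with its own charset yields the same Unicode key.
same_unicode_key = latin1_key.decode("iso-8859-1") == utf8_key.decode("utf-8")

print(found_after_conversion)  # False
print(same_unicode_key)        # True
```

If everything in the file is utf-8 and the keys are compared as Unicode, the conversion step (and with it the mismatch) should not arise in the first place.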