I'm still woefully behind on my email since returning from vacation, but I thought I'd rehash a bit on PEP 215, string interpolation, given some recent hacking and thinking about stuff we talked about at IPC10.

Background: PEP 215 has some interesting ideas, but IMHO is more than I'm comfortable with. At IPC10, Guido described his rules for string interpolation as they would be if his time machine were more powerful. These follow some discussions we've had during various Zope sprints about making the rules simpler for non-programmers to understand. I've also been struggling with how error prone %(var)s substitutions can be in the thru-the-web Mailman strings where this is supported.

Here's what I've come up with. Guido's rules for $-substitutions are really simple:

1. $$ substitutes to just a single $

2. $identifier followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the substitution dictionary.

3. For handling cases where the identifier is followed by identifier characters that aren't part of the key, ${identifier} is equivalent to $identifier.

And that's it. For the sake of discussion, forget about where the dictionary for string interpolation comes from.

I've hacked together 4 functions which I'm experimentally using to provide these rules in thru-the-web string editing, and also for sanity checking the strings as they're submitted. I think there's a fairly straightforward conversion between traditional %-strings and these newfangled $-strings, and so two of the functions do the conversions back and forth.

The other two functions attempt to return a list of all the substitution variables found in either a %-string or a $-string. I match this against the list of known legal substitution variables, and bark loudly if there's some mismatch.

The one interesting thing about %-to-$ conversion is that the regexp I use leaves the trailing `s' in %(var)s as optional, so I can auto-correct for those that are missing.
I think this was an idea that Paul Dubois came up with during the lunch discussion. Seems to work well, and I can do a %-to-$-to-% roundtrip; if the strings at the ends are the same then there weren't any missing `s's, otherwise the conversion auto-corrected and I can issue a warning.

This is all really proto-stuff, but I've done some limited testing and it seems to work pretty well. So without changing the language we can play with $-strings using Guido's rules to see if we like them or not, by simply converting them to traditional %-strings manually, and then doing the mod-operator substitutions.

Hopefully I've extracted the right bits of code from my modules for you to get the idea. There may be bugs <wink>.

-Barry

-------------------- snip snip --------------------
import re
from string import digits

try:
    # Python 2.2
    from string import ascii_letters
except ImportError:
    # Older Pythons
    _lower = 'abcdefghijklmnopqrstuvwxyz'
    ascii_letters = _lower + _lower.upper()

# Search for %(identifier)s strings, except that the trailing s is optional,
# since that's a common mistake
cre = re.compile(r'%\(([_a-z]\w*?)\)s?', re.IGNORECASE)
# Search for $$, $identifier, or ${identifier}
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\${([_a-z]\w*)}', re.IGNORECASE)

IDENTCHARS = ascii_letters + digits + '_'
EMPTYSTRING = ''

# Utilities to convert from simplified $identifier substitutions to/from
# standard Python %(identifier)s substitutions.  The "Guido rules" for the
# former are:
#    $$ -> $
#    $identifier -> %(identifier)s
#    ${identifier} -> %(identifier)s

def to_dollar(s):
    """Convert from %-strings to $-strings."""
    s = s.replace('$', '$$')
    parts = cre.split(s)
    for i in range(1, len(parts), 2):
        if parts[i+1] and parts[i+1][0] in IDENTCHARS:
            parts[i] = '${' + parts[i] + '}'
        else:
            parts[i] = '$' + parts[i]
    return EMPTYSTRING.join(parts)

def to_percent(s):
    """Convert from $-strings to %-strings."""
    s = s.replace('%', '%%')
    parts = dre.split(s)
    for i in range(1, len(parts), 4):
        if parts[i] is not None:
            parts[i] = '$'
        elif parts[i+1] is not None:
            parts[i+1] = '%(' + parts[i+1] + ')s'
        else:
            parts[i+2] = '%(' + parts[i+2] + ')s'
    return EMPTYSTRING.join(filter(None, parts))

def dollar_identifiers(s):
    """Return the set (dictionary) of identifiers found in a $-string."""
    d = {}
    for name in filter(None, [b or c or None for a, b, c in dre.findall(s)]):
        d[name] = 1
    return d

def percent_identifiers(s):
    """Return the set (dictionary) of identifiers found in a %-string."""
    d = {}
    for name in cre.findall(s):
        d[name] = 1
    return d
-------------------- snip snip --------------------

Python 2.2 (#1, Dec 24 2001, 15:39:01)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dollar
>>> dollar.to_dollar('%(one)s %(two)three %(four)seven')
'$one ${two}three ${four}even'
>>> dollar.to_percent(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
'%(one)s %(two)sthree %(four)seven'
>>> dollar.percent_identifiers('%(one)s %(two)three %(four)seven')
{'four': 1, 'two': 1, 'one': 1}
>>> dollar.dollar_identifiers(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
{'four': 1, 'two': 1, 'one': 1}
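
The message describes two submission-time checks that are not part of the extracted code: the %-to-$-to-% roundtrip that detects a missing trailing `s', and the comparison against the known legal substitution variables. What follows is only a rough sketch of how those checks might be wired together, not the Mailman code. It is written for a current Python 3, restates the two conversions in condensed form (with the braces in the second pattern escaped), and the names check_percent_string and legal are made up for illustration. One limitation to be aware of: because to_dollar() does not undo `%%', a string containing a literal `%%' will not survive the roundtrip and would be flagged here.

```python
import re
from string import ascii_letters, digits

# Patterns equivalent to the ones in the message (braces escaped here).
cre = re.compile(r'%\(([_a-z]\w*?)\)s?', re.IGNORECASE)
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\$\{([_a-z]\w*)\}', re.IGNORECASE)
IDENTCHARS = ascii_letters + digits + '_'


def to_dollar(s):
    """Convert from %-strings to $-strings (condensed restatement)."""
    parts = cre.split(s.replace('$', '$$'))
    for i in range(1, len(parts), 2):
        following = parts[i + 1]
        # Braces are only needed when an identifier character follows.
        if following and following[0] in IDENTCHARS:
            parts[i] = '${' + parts[i] + '}'
        else:
            parts[i] = '$' + parts[i]
    return ''.join(parts)


def to_percent(s):
    """Convert from $-strings to %-strings (condensed restatement)."""
    def repl(match):
        if match.group(1):          # $$ -> $
            return '$'
        return '%(' + (match.group(2) or match.group(3)) + ')s'
    return dre.sub(repl, s.replace('%', '%%'))


def check_percent_string(s, legal):
    """Hypothetical helper: return (corrected string, list of warnings).

    `legal` is assumed to be the collection of substitution names that
    are allowed in this particular string.
    """
    warnings = []
    # Roundtrip: any difference means a trailing s was auto-corrected
    # (or the string contained a literal %%, see the caveat above).
    corrected = to_percent(to_dollar(s))
    if corrected != s:
        warnings.append('string changed in roundtrip (missing trailing s?)')
    # Bark about names that are not in the legal list.
    unknown = sorted(set(cre.findall(s)) - set(legal))
    if unknown:
        warnings.append('unknown substitution variables: ' + ', '.join(unknown))
    return corrected, warnings


if __name__ == '__main__':
    fixed, problems = check_percent_string('%(one)s %(two)three', ['one'])
    print(fixed)
    for problem in problems:
        print('warning:', problem)
```

Returning the corrected string together with the warnings lets the caller decide whether to store the auto-corrected text silently or to show the warnings to whoever submitted the string.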