On 05 March 2000, Guido van Rossum said:
> - Variants on the syntax could be given through some kind of option
>   system rather than through subclassing -- they should be combinable
>   independently.  Some possible options (maybe I'm going overboard here)
>   could be:
>
>     - comment characters: ('#', ';', both, others?)
>     - comments after variables allowed? on sections?
>     - variable characters: (':', '=', both, others?)
>     - quoting of values with "..." allowed?
>     - backslashes in "..." allowed?
>     - does backslash-newline mean a continuation?
>     - case sensitivity for section names (default on)
>     - case sensitivity for option names (default off)
>     - variables allowed before first section name?
>     - first section name? (default "main")
>     - character set allowed in section names
>     - character set allowed in variable names
>     - %(...) substitution?

I agree with Fred that this level of flexibility is probably overkill
for a config file parser; you don't want every application author who
uses the module to have to explain his particular variant of the syntax.

However, if you're interested in a class that *does* provide some of the
above flexibility, I have written such a beast.  It's currently used to
parse the Distutils MANIFEST.in file, and I've considered using it for
the mythical Distutils config files.  (And it also gets heavy use in my
day job.)  It's really a class for reading a file in preparation for
"text processing the Unix way", though: it doesn't say anything about
syntax, it just worries about blank lines, comments, continuations, and
a few other things.  Here's the class docstring:

class TextFile:

    """Provides a file-like object that takes care of all the things you
       commonly want to do when processing a text file that has some
       line-by-line syntax: strip comments (as long as "#" is your
       comment character), skip blank lines, join adjacent lines by
       escaping the newline (i.e. backslash at end of line), strip
       leading and/or trailing whitespace, and collapse internal
       whitespace.  All of these are optional and independently
       controllable.

       Provides a 'warn()' method so you can generate warning messages
       that report physical line number, even if the logical line in
       question spans multiple physical lines.  Also provides
       'unreadline()' for implementing line-at-a-time lookahead.

       Constructor is called as:

           TextFile (filename=None, file=None, **options)

       It bombs (RuntimeError) if both 'filename' and 'file' are None;
       'filename' should be a string, and 'file' a file object (or
       something that provides 'readline()' and 'close()' methods).  It
       is recommended that you supply at least 'filename', so that
       TextFile can include it in warning messages.  If 'file' is not
       supplied, TextFile creates its own using the 'open()' builtin.

       The options are all boolean, and affect the value returned by
       'readline()':
         strip_comments [default: true]
           strip from "#" to end-of-line, as well as any whitespace
           leading up to the "#" -- unless it is escaped by a backslash
         lstrip_ws [default: false]
           strip leading whitespace from each line before returning it
         rstrip_ws [default: true]
           strip trailing whitespace (including line terminator!) from
           each line before returning it
         skip_blanks [default: true]
           skip lines that are empty *after* stripping comments and
           whitespace.  (If both lstrip_ws and rstrip_ws are false,
           then some lines may consist of solely whitespace: these will
           *not* be skipped, even if 'skip_blanks' is true.)
         join_lines [default: false]
           if a backslash is the last non-newline character on a line
           after stripping comments and whitespace, join the following
           line to it to form one "logical line"; if N consecutive lines
           end with a backslash, then N+1 physical lines will be joined
           to form one logical line.
         collapse_ws [default: false]
           after stripping comments and whitespace and joining physical
           lines into logical lines, all internal whitespace (strings of
           whitespace surrounded by non-whitespace characters, and not
           at the beginning or end of the logical line) will be
           collapsed to a single space.

       Note that since 'rstrip_ws' can strip the trailing newline, the
       semantics of 'readline()' must differ from those of the builtin
       file object's 'readline()' method!  In particular, 'readline()'
       returns None for end-of-file: an empty string might just be a
       blank line (or an all-whitespace line), if 'rstrip_ws' is true
       but 'skip_blanks' is not."""

Interested in having something like this in the core?  Adding more
options is possible, but the code is already on the hairy side to
support all of these.  And I'm not a big fan of the subtle difference in
semantics with file objects, but honestly couldn't think of a better way
at the time.

If you're interested, you can download it from

    http://www.mems-exchange.org/exchange/software/python/text_file/

or just use the version in the Distutils CVS tree.

        Greg
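
For a feel of the semantics the docstring describes, here is a rough
sketch of three of the options (strip_comments, skip_blanks, join_lines).
It is not the TextFile code: 'read_logical_lines' is a hypothetical
helper written for illustration, trailing whitespace is always stripped
(as if rstrip_ws were true), a backslash-escaped "#" is not handled, and
stripping the leading whitespace of continuation lines is an assumption
of this sketch rather than something the docstring promises.

```python
import io


def read_logical_lines(file, strip_comments=True, skip_blanks=True,
                       join_lines=False):
    """Yield (first_physical_line_number, logical_line) pairs.

    Hypothetical helper, not the TextFile implementation.
    """
    pending = None   # partial logical line waiting for its continuation
    start = 0        # physical line number where the logical line began
    for lineno, line in enumerate(file, 1):
        if strip_comments:
            # Drop everything from the first "#" to end-of-line.
            line = line.split("#", 1)[0]
        # Always strip trailing whitespace, including the newline.
        line = line.rstrip()
        if pending is not None:
            # Assumption: continuation lines lose their indentation.
            line = pending + line.lstrip()
        else:
            start = lineno
        if join_lines and line.endswith("\\"):
            # Backslash at end of line: keep accumulating.
            pending = line[:-1]
            continue
        pending = None
        if skip_blanks and not line:
            continue
        yield start, line
    if pending is not None:
        # The file ended in the middle of a continuation.
        yield start, pending


sample = (
    "# header comment\n"
    "name = value   # trailing comment\n"
    "\n"
    "files = a.py \\\n"
    "    b.py\n"
)

for lineno, text in read_logical_lines(io.StringIO(sample), join_lines=True):
    print(lineno, repr(text))
```

Reporting the first physical line number alongside each logical line is
the same bookkeeping that lets a 'warn()' method point at the right
place when a logical line spans several physical lines.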