Andreas Beyer wrote:
> The documentation of string.split() says:
> "... The returned list will then have one more item than the number of
> non-overlapping occurrences of the separator in the string. ..."
>
> The behaviour of split with Python 2.3.3 is:
> >>> '\tb'.split()
> ['b'] # Bug?
> >>> '\tb'.split('\t')
> ['', 'b']
> >>> 'a\t\tb'.split()
> ['a', 'b']
> >>> 'a\t\tb'.split('\t')
> ['a', '', 'b']
> >>>
>
> I think there are different interpretations of what a separator is. That
> is not necessarily a bug, because without stripping a new-line at the
> end of the string would yield a non-intuitive result list. However, the
> difference between split with and without the 'sep' argument should be
> documented.

This is intended behavior and is not going to be changed. However, I
must agree that the documentation for the string method is somewhat
lacking. The documentation of the split function in the string module is
much clearer.

Method:
split([sep [,maxsplit]])
    Return a list of the words in the string, using sep as the delimiter
    string. If maxsplit is given, at most maxsplit splits are done. If
    sep is not specified or None, any whitespace string is a separator.

string.split function:
split(s[, sep[, maxsplit]])
    Return a list of the words of the string s. If the optional second
    argument sep is absent or None, the words are separated by arbitrary
    strings of whitespace characters (space, tab, newline, return,
    formfeed). If the second argument sep is present and not None, it
    specifies a string to be used as the word separator. The returned
    list will then have one more item than the number of non-overlapping
    occurrences of the separator in the string. The optional third
    argument maxsplit defaults to 0. If it is nonzero, at most maxsplit
    number of splits occur, and the remainder of the string is returned
    as the final element of the list (thus, the list will have at most
    maxsplit+1 elements).

-- 
Sjoerd Mullender <sjoerd at acm.org>
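
The difference between the two call forms discussed in this message can be sketched as below. The helper name split_demo is hypothetical and only serves the illustration. The "one more item than the number of occurrences" rule is checked only for the explicit-separator form, since that is the only form the documentation quoted in the message applies it to.

```python
def split_demo(text):
    """Return (split without sep, split on tab) for the given string."""
    # No separator given: runs of whitespace act as one delimiter and
    # leading/trailing whitespace is dropped, so no empty strings appear.
    no_sep = text.split()
    # Explicit separator: every occurrence is a split point, so empty
    # strings show up for leading or adjacent separators.
    with_sep = text.split('\t')
    return no_sep, with_sep


for sample in ('\tb', 'a\t\tb'):
    no_sep, with_sep = split_demo(sample)
    print(repr(sample), no_sep, with_sep)
    # The "occurrences + 1" rule holds for the explicit-separator form.
    assert len(with_sep) == sample.count('\t') + 1
```

When the whitespace-collapsing form is wanted but the empty fields matter (for example, tab-separated data with blank columns), passing the separator explicitly is the form that keeps them.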