Hello,

Many times I find myself asking for a slice of a specific length, rather than a slice with a specific end. I suggest adding the syntax object[start:>length] (or object[start:>length:jump]) alongside the existing syntax.

Two examples:

1. Say I have a list with the number of panda bears hunted in each month, starting from 1900. Now I want to know how many panda bears were hunted in year y. Currently, I have to write something like this:

    sum(huntedPandas[(y-1900)*12:(y-1900)*12+12])

If my suggestion is accepted, I would be able to write:

    sum(huntedPandas[(y-1900)*12:>12])

2. Many data files contain fields of fixed length. Just an example: say I want to get the color of the first pixel of a 24-bit color BMP file. Say I have a function, s2i, which gets a 4-byte string and converts it into a 32-bit integer. The four bytes, from byte no. 10, are the size of the header, in bytes. Right now, if I don't want to use temporary variables, I have to write:

    picture[s2i(picture[10:14]):s2i(picture[10:14])+4]

I think this is nicer (and quicker):

    picture[s2i(picture[10:>4]):>4]

(I mean to show that when working with data files, it's common to have slices of a specific length, and that the proposed syntax makes things clear and simple. I took BMP just as an example - I know about PIL.)

Other solutions (from comp.lang.python responses):

1. Of course, the longer form may be used, and a temporary variable may be used to avoid repeated function calls.
2. The idiom object[start:][:length] may be used. However, it may be very inefficient if the list is long, because object[start:] first copies everything from start to the end. Another advantage of the proposed syntax is that it can be used in multi-dimensional slices (for example, ar[:,x:>3,:]).
3. The programmer may define the function lambda object, start, length: object[start:start+length]. This does make expressions quite short, but it isn't very readable IMHO, and it doesn't deal with multi-dimensional slices.

Objections (also from comp.lang.python):

1. There should be only one way to do something in Python.
2. Some don't like how it looks.
3. l[a:b] yields an empty list when a>b, and l[a:>b] doesn't.

My responses:

1. Changes should be taken seriously, and the language must be kept simple and easy to read, but that doesn't mean there should be only one way to do something. Just an example: you could write l[:,:,:,3], but the ellipsis token lets you write l[...,3].
2. I can't really argue with that, besides saying that it looks fine to me. The symbol '>' generally means "move to the right". I think that l[12345:>10] can easily be read as "start from 12345, and move 10 steps to the right. Take all the items you passed over."
3. l[a:>b] doesn't look like l[a:b], and it means something altogether different. Besides, l[a:b:-1] doesn't yield an empty list when a > b.

Some technical details:

My proposal only affects the conversion from Python code into byte-code. This is why it is easy to implement and has no side effects, as far as I can see. I changed the definition of "subscript" in the Grammar file from:

    subscript: '.' '.' '.' | test | [test] ':' [test] [sliceop]

into:

    subscript: '.' '.' '.' | test | ([test] ':' [test] | test ':>' test) [sliceop]

and added the ':>' token to tokenizer.c and token.h. I then extended compile.c to handle the new syntax.

The byte-code produced is basically simple: calculate start, calculate length, and add start to length to get the usual start, end. It gets a bit complicated because you want range(10)[3:>-5], for example, to yield an empty list, and using the method described, it would be equivalent to range(10)[3:-2], that is, to [3,4,5,6,7]. So the byte-code my implementation produces checks whether the resulting end is negative and start is positive, and if so, puts -sys.maxint, instead of start+length, as end. -sys.maxint is used instead of the more obvious choice, 0, so that range(10)[3:>-5:-1] will yield [3,2,1,0] and not [3,2,1].
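The end computation described above can be sketched in present-day Python as a helper that builds an ordinary slice object. This is only an illustration, not the attached patch: length_slice is a hypothetical name, sys.maxsize stands in for sys.maxint (which Python 3 does not have), and a start of zero is treated the same way as a positive start, which is an assumption.

```python
import sys


def length_slice(start, length, step=None):
    """Hypothetical helper: the slice that obj[start:>length:step] would denote."""
    end = start + length
    # A negative end would be read as "count from the back of the sequence".
    # Replace it with a very negative number so the slice is clipped at the
    # front instead. (Assumption: start == 0 is handled like a positive start.)
    if end < 0 and start >= 0:
        end = -sys.maxsize
    return slice(start, end, step)


numbers = list(range(10))
print(numbers[length_slice(3, 4)])       # [3, 4, 5, 6]
print(numbers[length_slice(3, -5)])      # []
print(numbers[length_slice(3, -5, -1)])  # [3, 2, 1, 0]
```

Without the clamp, the second call would be numbers[3:-2], which gives [3, 4, 5, 6, 7], the surprising result mentioned above.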
The byte-code can be optimized, because I expect that usually length will be an integer given explicitly in the Python code, in which case no testing has to be done in the byte-code.

Attached are the 4 diffs. I'm sorry, they are against the Python-2.3.3 release (the sourceforge CVS doesn't work for me currently), but I expect them to work fine with the CVS head.

To summarize, this is a small addition, with no side effects or backward-compatibility issues, which will help me and others.

Well, what do you think? I would like to hear your comments.

Best wishes,
Noam Raphael

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Grammar.diff
Type: text/x-patch
Size: 808 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20040604/cf70e9a5/Grammar.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compile.c.diff
Type: text/x-patch
Size: 2278 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20040604/cf70e9a5/compile.c.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: token.h.diff
Type: text/x-patch
Size: 767 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20040604/cf70e9a5/token.h.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tokenizer.c.diff
Type: text/x-patch
Size: 550 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20040604/cf70e9a5/tokenizer.c.bin