Showing content from http://mail.python.org/pipermail/python-dev/attachments/20140604/45a0203d/attachment.html below:

<html>
 <head>
 <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
 </head>
 <body bgcolor="#FFFFFF" text="#330033">
 <div class="moz-cite-prefix">On 6/4/2014 5:08 PM, Glenn Linderman
 wrote: 
 </div>
 <blockquote cite="mid:538FB501.2040601@g.nevcal.com" type="cite">
 <div class="moz-cite-prefix">On 6/4/2014 5:03 PM, Greg Ewing
 wrote: 
 </div>
 <blockquote cite="mid:538FB3C5.6010104@canterbury.ac.nz"
 type="cite">Serhiy Storchaka wrote: 
 <blockquote type="cite">html.HTMLParser, json.JSONDecoder,
 re.compile, tokenize.tokenize don't use iterators. They use
 indices, str.find and/or regular expressions. The common use case
 is to quickly find a substring starting from the current position
 using str.find or re.search, process the found token, advance the
 position, and repeat. 
 </blockquote>
 
 For that kind of thing, you don't need an actual character 
 index, just some way of referring to a place in a string. 
 </blockquote>
 
 I think you meant codepoint index, rather than character index. 
 
 <blockquote cite="mid:538FB3C5.6010104@canterbury.ac.nz"
 type="cite"> 
 Instead of an integer, str.find() etc. could return a 
 StringPosition, which would be an opaque reference to a 
 particular point in a particular string. You would be 
 able to pass StringPositions to indexing and slicing 
 operations to get fast indexing into the string that 
 they were derived from. 
 
 StringPositions could support the following operations: 
 
 &nbsp;&nbsp;StringPosition + int --> StringPosition 
 &nbsp;&nbsp;StringPosition - int --> StringPosition 
 &nbsp;&nbsp;StringPosition - StringPosition --> int 
 
 These would be computed by counting characters forwards 
 or backwards in the string, which would be slower than 
 int arithmetic but still faster than counting from the 
 beginning of the string every time. 
 
 In other contexts, StringPositions would coerce to ints 
 (maybe being an int subclass?) allowing them to be used 
 in any existing algorithm that slices strings using ints. 
 
 </blockquote>
 This starts to diverge from Python's codepoint indexing via
 integers. Calculating or caching the mapping from codepoint index
 to byte offset as part of the str implementation stays compatible
 with Python. Introducing StringPosition makes a Python-like
 language, not Python.
 Or so it seems to me.</blockquote>
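 To illustrate what caching the codepoint index to byte offset
 mapping might look like, here is a minimal sketch. It assumes a
 string stored as UTF-8; the class name Utf8Text, the STRIDE value,
 and the byte_offset method are all hypothetical and are not taken
 from any real implementation.

```python
class Utf8Text:
    """Hypothetical UTF-8 backed string with sparse index checkpoints."""

    STRIDE = 4  # assumed checkpoint spacing, in codepoints

    def __init__(self, text):
        self.data = text.encode("utf-8")
        # checkpoints[k] is the byte offset of codepoint number k * STRIDE.
        self.checkpoints = []
        count = 0
        for offset, byte in enumerate(self.data):
            # Continuation bytes look like 0b10xxxxxx; anything else
            # starts a new codepoint.
            if (byte >> 6) != 0b10:
                if count % self.STRIDE == 0:
                    self.checkpoints.append(offset)
                count += 1
        self.length = count

    def byte_offset(self, index):
        """Map a codepoint index to a byte offset.

        Scans forward from the nearest earlier checkpoint, so the work
        is bounded by STRIDE rather than by the length of the string.
        """
        if index not in range(self.length + 1):
            raise IndexError("codepoint index out of range")
        if index == self.length:
            return len(self.data)
        offset = self.checkpoints[index // self.STRIDE]
        remaining = index % self.STRIDE
        while remaining:
            offset += 1
            while (self.data[offset] >> 6) == 0b10:
                offset += 1
            remaining -= 1
        return offset

    def __getitem__(self, index):
        # Plain integer indexing, as in Python today.
        start = self.byte_offset(index)
        end = self.byte_offset(index + 1)
        return self.data[start:end].decode("utf-8")


sample = Utf8Text("na\u00efve caf\u00e9")
print(sample[2], sample.byte_offset(3))
```

 Callers keep using ordinary integers; only the storage layer knows
 about byte offsets. A smaller STRIDE trades memory for shorter scans.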
 
 Another thought is that a StringPosition only works (quickly, at
 least), as you point out, for the string that it was derived
 from... so algorithms that walk two strings at a time cannot use the
 same StringPosition for both... yep, this is quite divergent from
 CPython and Python. 
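 The StringPosition idea quoted above can be sketched roughly as
 follows. This is only a model of the proposed interface, assuming
 the "int subclass" variant; the find helper and the source
 attribute are hypothetical names. It does not model the performance
 side, since CPython's str already indexes in constant time.

```python
class StringPosition(int):
    """Hypothetical position: an int that remembers its source string."""

    def __new__(cls, index, source):
        self = super().__new__(cls, index)
        self.source = source
        return self

    def __add__(self, other):
        # StringPosition + int --> StringPosition
        return StringPosition(int(self) + other, self.source)

    def __sub__(self, other):
        if isinstance(other, StringPosition):
            # StringPosition - StringPosition --> int, same string only
            if other.source is not self.source:
                raise ValueError("positions come from different strings")
            return int(self) - int(other)
        # StringPosition - int --> StringPosition
        return StringPosition(int(self) - other, self.source)


def find(source, sub, start=0):
    """Like str.find, but returns a StringPosition, or None if absent."""
    index = source.find(sub, int(start))
    return None if index == -1 else StringPosition(index, source)


# Find a token, process it, advance the position, and repeat.
text = "key=value; other=thing"
eq = find(text, "=")
end = find(text, ";", eq)
print(text[eq + 1:end])  # value

# A position taken from one string is rejected when combined with a
# position from another string, which is the limitation noted above.
other = "name=thing"
try:
    find(other, "=") - eq
except ValueError as exc:
    print(exc)
```

 Because the sketch subclasses int, a position still coerces to a
 plain integer in slices and in existing code, but the object is tied
 to one string, so walking two strings needs two separate positions.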
 </body>
</html>
