Showing content from http://mail.python.org/pipermail/python-dev/2010-June/101202.html below:

[Python-Dev] Mailbox module - timings and functionality changes

A.M. Kuchling amk at amk.ca
Tue Jun 29 18:52:28 CEST 2010
On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote:
> I will leave the profiler output to speak for itself, since I can find
> nothing much to say about it except that there's a hell of a lot of
> decoding going on inside mailbox.iterkeys().

The problem is actually in _generate_toc(), which reads through the
entire file to find where all the 'From ' lines that start messages
are located.  TextIOWrapper's tell() method seems to be very slow, so
one improvement is to call tell() only when necessary.  Patch:

-> svn diff Lib/
Index: Lib/mailbox.py
===================================================================
--- Lib/mailbox.py	(revision 82346)
+++ Lib/mailbox.py	(working copy)
@@ -775,13 +775,14 @@
         starts, stops = [], []
         self._file.seek(0)
         while True:
-            line_pos = self._file.tell()
             line = self._file.readline()
             if line.startswith('From '):
+                line_pos = self._file.tell()
                 if len(stops) < len(starts):
                     stops.append(line_pos - len(os.linesep))
                 starts.append(line_pos)
             elif not line:
+                line_pos = self._file.tell()
                 stops.append(line_pos)
                 break
         self._toc = dict(enumerate(zip(starts, stops)))
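
As a rough illustration of the same idea taken one step further, the sketch below builds the table of contents without calling tell() at all. It is not the patch above and not the code in Lib/mailbox.py: it assumes the file is opened in binary mode, so that byte offsets can be tracked by summing line lengths. The function name generate_toc and its linesep parameter are made up for this example. Note that a tell() call placed after readline() reports the offset just past the line that was read, so the sketch records the offset before each read instead.

```python
import io


def generate_toc(f, linesep=b"\n"):
    """Map message index -> (start, stop) byte offsets in an mbox-like file.

    Hypothetical helper: `f` must be a binary file object, which is an
    assumption of this sketch rather than what the patch above does.
    """
    starts, stops = [], []
    f.seek(0)
    pos = 0  # byte offset of the line that is about to be read
    while True:
        line = f.readline()
        if line.startswith(b"From "):
            if len(stops) < len(starts):
                # The previous message ends before the blank separator line.
                stops.append(pos - len(linesep))
            starts.append(pos)
        elif not line:
            # End of file closes the last message.
            stops.append(pos)
            break
        pos += len(line)
    return dict(enumerate(zip(starts, stops)))


sample = io.BytesIO(b"From a@x\nbody1\n\nFrom b@y\nbody2\n")
print(generate_toc(sample))
```

Counting bytes this way only works in binary mode; with a TextIOWrapper the value returned by tell() is an opaque cookie and cannot be derived from len(line).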

But should mailboxes really be opened in a UTF-8 encoding, or should
they be treated as 7-bit text?  I'll have to think about this.
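
One way to inform that decision is to measure how much a per-line tell() costs in each mode. The following is only a sketch: the helper names time_tell_per_line and compare, the synthetic file contents, and the line count are all assumptions made for this example, and the numbers will vary by machine and Python version.

```python
import os
import tempfile
import time


def time_tell_per_line(path, mode, **open_kwargs):
    """Read `path` line by line, calling tell() before every readline()."""
    start = time.perf_counter()
    with open(path, mode, **open_kwargs) as f:
        while True:
            f.tell()
            if not f.readline():
                break
    return time.perf_counter() - start


def compare(n_lines=20000):
    """Return (text_seconds, binary_seconds) for a synthetic mbox-like file."""
    with tempfile.NamedTemporaryFile("wb", suffix=".mbox", delete=False) as tmp:
        tmp.write(b"From someone@example.com\n")
        tmp.write(b"line of body text\n" * n_lines)
        path = tmp.name
    try:
        text = time_tell_per_line(path, "r", encoding="utf-8", newline="")
        binary = time_tell_per_line(path, "rb")
    finally:
        os.remove(path)
    return text, binary


if __name__ == "__main__":
    text, binary = compare()
    print(f"text mode: {text:.3f}s  binary mode: {binary:.3f}s")
```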

--amk