On 17.11.15 05:00, Guido van Rossum wrote:
> If you free the memory used for the source buffer before starting code
> generation you should be good.

Thank you. The buffer is freed just after the AST has been generated.

> On Mon, Nov 16, 2015 at 5:53 PM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> I'm working on rewriting the Python tokenizer (in particular the part that
>> reads and decodes Python source files). The code is complicated. Currently
>> it handles the following cases:
>>
>> * Reading from a string in memory.
>> * Interactive reading from a file.
>> * Reading from a file:
>>   - Raw reading, ignoring the encoding, in the parser generator.
>>   - Raw reading of a UTF-8 encoded file.
>>   - Reading and recoding to UTF-8.
>>
>> The file is read line by line. This makes it hard to check the correctness
>> of the first line when the encoding is specified in the second line, and it
>> causes very hard problems with null bytes and with desynchronization between
>> buffered C and Python files. All these problems can be solved easily by
>> reading the whole Python source file into memory and then parsing it as a
>> string. This would allow dropping a large, complex, and buggy part of the
>> code.
>>
>> Are there disadvantages to this solution? As for memory consumption, the
>> source text itself will consume only a small part of the memory consumed by
>> the AST and other structures. As for performance, reading and decoding the
>> whole file can be faster than doing it line by line.
>>
>> [1] http://bugs.python.org/issue25643
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
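
For illustration only, here is a rough Python-level sketch of the whole-file
approach being discussed (the real work is in the C tokenizer, and the
function name parse_source_file below is made up for the example). It reads
the entire file into memory, detects the PEP 263 coding cookie or BOM using
the standard tokenize.detect_encoding(), then decodes and parses the complete
text as a single string:

    import ast
    import io
    import tokenize


    def parse_source_file(path):
        # Read the raw bytes of the whole file at once instead of line by line.
        with open(path, "rb") as f:
            raw = f.read()

        # detect_encoding() inspects up to the first two lines for a BOM or a
        # "coding:" cookie -- awkward when reading line by line, but trivial
        # once the whole buffer is in memory.
        encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)

        # Decode the entire buffer and hand it to the parser; the raw buffer
        # can be freed as soon as the AST has been generated.
        source = raw.decode(encoding)
        return ast.parse(source, filename=path)


    if __name__ == "__main__":
        tree = parse_source_file("example.py")
        print(ast.dump(tree))

This is only a sketch of the idea; it does not reflect the actual
implementation in the issue referenced above.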