Greg Stein:
> I don't think we should have a two-byte magic value. Especially where
> those two bytes are printable, 7-bit ASCII.
[...]
> To ensure uniqueness, I think a four-byte magic should stay.

Looking at /etc/magic I see many 16-bit magic numbers kept around
from the good old days. But you are right: choosing a four-byte magic
value would make a clash with some other file format much less likely.

> I would recommend the approach of adding opcodes into the marshal format.
> Specifically, 'V' followed by a single byte. That can only occur at the
> beginning. If it is not present, then you know that you have an old
> marshal value.

But this would not solve the problem of 8-byte versus 4-byte timestamps
in the header on 64-bit OSes. Trent Mick pointed this out.

I think the situation we have now is very unsatisfactory: I don't see
a reasonable solution that allows us to keep the header before the
marshal block at a fixed length of 8 bytes together with a frozen
4-byte magic number.

Moving the version number into the marshal doesn't help to resolve this
conflict. So either you have to accept a new magic on 64-bit systems or
you have to enlarge the header.

To come up with a new proposal, the following questions should be answered:

1. Is there really too much code out there that depends on the
   hardcoded assumption that the marshal part of a .pyc file starts
   at byte 8? I see no further evidence for or against this.
   MAL pointed this out in
   <http://www.python.org/pipermail/python-dev/2000-May/005756.html>

2. If we decide to enlarge the header, do we really need a new header
   field defining the length of the header? This was proposed by
   Christian Tismer in
   <http://www.python.org/pipermail/python-dev/2000-May/005792.html>

3. The 'imp' module exposes part of the structure of a .pyc file
   through the function 'get_magic()'. I proposed changing the
   signature of 'imp.get_magic()' in an upward compatible way. I also
   proposed adding a new function 'imp.get_version()'. What do you
   think about this idea?

4. Greg proposed prepending the version number to the marshal format.
   If we do this, we definitely need a frozen way to find out where
   the marshalled code object actually starts. This also has the
   disadvantage of making it slightly harder to come up with a
   /etc/magic definition which displays the version number of a
   .pyc file.

If we decide to move the version number into the marshal, we can also
move the .py-timestamp there. This way the timestamp will be handled
in the same way as large integer literals. Quoting from the docs:

"""Caveat: On machines where C's long int type has more than 32 bits
   (such as the DEC Alpha), it is possible to create plain Python
   integers that are longer than 32 bits. Since the current marshal
   module uses 32 bits to transfer plain Python integers, such values
   are silently truncated. This particularly affects the use of very
   long integer literals in Python modules -- these will be accepted
   by the parser on such machines, but will be silently be truncated
   when the module is read from the .pyc instead.
   [...]
   A solution would be to refuse such literals in the parser, since
   they are inherently non-portable. Another solution would be to let
   the marshal module raise an exception when an integer value would
   be truncated. At least one of these solutions will be implemented
   in a future version."""

Should this be 1.6? Changing the format of .pyc files over and over
again in the 1.x series doesn't look very attractive.

Regards, Peter
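
For illustration, the fixed 8-byte header layout discussed above (a 4-byte
magic, a 4-byte timestamp, then the marshal block starting at byte 8) can
be sketched in Python. This is only a minimal sketch, not the interpreter's
actual code: the helper names 'build_pyc' and 'parse_pyc', the magic value,
and the choice of an unsigned little-endian field for the timestamp are all
assumptions made for the example.

```python
import marshal
import struct

HEADER_SIZE = 8  # 4-byte magic + 4-byte source timestamp, as discussed above


def build_pyc(magic, mtime, obj):
    """Build a byte string in the 8-byte-header layout (hypothetical helper)."""
    if len(magic) != 4:
        raise ValueError("magic must be exactly 4 bytes")
    # Assumption: the timestamp is stored as an unsigned 32-bit
    # little-endian field. The mask drops any bits above the low 32.
    return magic + struct.pack("<L", mtime & 0xFFFFFFFF) + marshal.dumps(obj)


def parse_pyc(data):
    """Split the byte string into (magic, mtime, unmarshalled object)."""
    magic = data[:4]
    (mtime,) = struct.unpack("<L", data[4:HEADER_SIZE])
    # Code that hardcodes "marshal starts at byte 8" looks like this line.
    return magic, mtime, marshal.loads(data[HEADER_SIZE:])


MAGIC = b"PY\r\n"  # made-up magic value, for illustration only

blob = build_pyc(MAGIC, 959000000, {"answer": 42})
print(parse_pyc(blob))

# A timestamp that needs more than 32 bits loses its high bits:
big = (1 << 32) + 5
print(parse_pyc(build_pyc(MAGIC, big, None))[1])  # → 5
```

Any reader written like 'parse_pyc' breaks as soon as the header grows
beyond 8 bytes, which is the compatibility concern behind question 1.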