Peter Funk wrote:
>
> Greg Stein:
> > I don't think we should have a two-byte magic value. Especially where
> > those two bytes are printable, 7-bit ASCII.
> [...]
> > To ensure uniqueness, I think a four-byte magic should stay.
>
> Looking at /etc/magic I see many 16-bit magic numbers kept around
> from the good old days.  But you are right: Choosing a four-byte magic
> value would make the chance of a clash with some other file format
> much less likely.

Just for the record: the current /etc/magic I have on my Linux
machine doesn't know anything about PYC or PYO files, so I
don't really see much of a problem here -- no one seems to
be interested in finding out the file type for these
files anyway ;-)

Also, I don't really get the 16-bit magic argument: we still
have a 32-bit magic number -- one with a 16-bit fixed value and
predefined ranges for the remaining 16 bits. This already is
much better than what we have now with respect to making file(1)
work on PYC files.

> > I would recommend the approach of adding opcodes into the marshal format.
> > Specifically, 'V' followed by a single byte. That can only occur at the
> > beginning. If it is not present, then you know that you have an old
> > marshal value.
>
> But this would not solve the problem with 8 byte versus 4 byte timestamps
> in the header on 64-bit OSes.  Trent Mick pointed this out.

The switch to 8 byte timestamps is only needed when the current
4 bytes can no longer hold the timestamp value. That will happen
in 2038...

Note that import.c writes the timestamp in 4 bytes until it
reaches an overflow situation.

> I think, the situation we have now, is very unsatisfactory: I don't
> see a reasonable solution, which allows us to keep the length of the
> header before the marshal-block at a fixed length of 8 bytes together
> with a frozen 4 byte magic number.

Adding a version to the marshal format is a Good Thing --
independent of this discussion.

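For illustration only, here is a minimal sketch of the fixed 8-byte header being discussed: a 16-bit fixed tag, two bytes for version and options, and a 4-byte timestamp, so that the marshalled data still starts at byte 8. The 'PY' tag and the version/option bytes follow the proposal in this thread; the little-endian byte order, the signed 32-bit timestamp and the names HEADER, pack_header and unpack_header are assumptions made for the example, not what import.c does.

```python
import struct

# Hypothetical layout (an assumption for this sketch, not the real import.c format):
#   bytes 0-1  fixed 16-bit tag b"PY"
#   byte  2    version
#   byte  3    options
#   bytes 4-7  source timestamp, signed 32-bit, little-endian
HEADER = struct.Struct("<2sBBl")

MAX_TIMESTAMP = 0x7FFFFFFF  # largest value a signed 32-bit field can hold (reached in 2038)


def pack_header(version, options, mtime):
    """Build the 8-byte header; refuse timestamps that no longer fit in 4 bytes."""
    if mtime > MAX_TIMESTAMP:
        raise OverflowError("timestamp does not fit in 4 bytes")
    return HEADER.pack(b"PY", version, options, mtime)


def unpack_header(data):
    """Return (version, options, mtime) read from the first 8 bytes of data."""
    tag, version, options, mtime = HEADER.unpack_from(data)
    if tag != b"PY":
        raise ValueError("not a file with the expected 'PY' tag")
    return version, options, mtime
```

With this layout HEADER.size is 8, so code that assumes the marshalled part starts at byte 8 keeps working, and file(1) could match on the first two bytes and report the next two.
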
> Moving the version number into the marshal doesn't help to resolve
> this conflict.  So either you have to accept a new magic on 64 bit
> systems or you have to enlarge the header.

No you don't... please read the code: marshal only writes
8 bytes in case 4 bytes aren't enough to hold the value.

> To come up with a new proposal, the following questions should be answered:
>   1. Is there really too much code out there, which depends on
>      the hardcoded assumption, that the marshal part of a .pyc file
>      starts at byte 8?  I see no further evidence for or against this.
>      MAL pointed this out in
>      <http://www.python.org/pipermail/python-dev/2000-May/005756.html>

I have several references in my tool collection, the import
stuff uses it, old import hooks (remember ihooks ?) also do, etc.

>   2. If we decide to enlarge the header, do we really need a new
>      header field defining the length of the header ?
>      This was proposed by Christian Tismer in
>      <http://www.python.org/pipermail/python-dev/2000-May/005792.html>

In Py3K we can do this right (breaking things is allowed)...
and I agree with Christian that a proper file format needs
a header length field too. Basically, these values have to be
present, IMHO:

1. Magic
2. Version
3. Length of Header
4. (Header Attribute)*n
-- Start of Data ---

Header Attribute can be pretty much anything -- timestamps,
names of files or other entities, bit sizes, architecture
flags, optimization settings, etc.

>   3. The 'imp' module exposes somewhat the structure of a .pyc file
>      through the function 'get_magic()'.  I proposed changing the signature of
>      'imp.get_magic()' in an upward compatible way.  I also proposed
>      adding a new function 'imp.get_version()'.  What do you think about
>      this idea?

imp.get_magic() would have to return the proposed 32-bit value
('PY' + version byte + option byte).

I'd suggest adding additional functions which can read and write the
header given a PYCHeader object which would hold the
values version and options.

>   4. Greg proposed prepending the version number to the marshal
>      format.  If we do this, we definitely need a frozen way to find
>      out, where the marshalled code object actually starts.  This has
>      also the disadvantage of making the task to come up with a /etc/magic
>      definition which displays the version number of a .pyc file slightly
>      harder.
>
> If we decide to move the version number into the marshal, we can
> also move the .py-timestamp there.  This way the timestamp will be handled
> in the same way as large integer literals.  Quoting from the docs:
>
>  """Caveat: On machines where C's long int type has more than 32 bits
>     (such as the DEC Alpha), it is possible to create plain Python
>     integers that are longer than 32 bits. Since the current marshal
>     module uses 32 bits to transfer plain Python integers, such values
>     are silently truncated. This particularly affects the use of very
>     long integer literals in Python modules -- these will be accepted
>     by the parser on such machines, but will be silently truncated
>     when the module is read from the .pyc instead.
>     [...]
>     A solution would be to refuse such literals in the parser, since
>     they are inherently non-portable. Another solution would be to let
>     the marshal module raise an exception when an integer value would
>     be truncated. At least one of these solutions will be implemented
>     in a future version."""
>
> Should this be 1.6?  Changing the format of .pyc files over and over
> again in the 1.x series doesn't look very attractive.

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/