Showing content from http://mail.python.org/pipermail/python-dev/attachments/20140628/adca5370/attachment-0001.html below:
<p dir="ltr"><br>
On Jun 28, 2014 12:49 PM, "Ben Hoyt" <<a href="mailto:benhoyt@gmail.com">benhoyt@gmail.com</a>> wrote:<br>
><br>
> >> But the underlying system calls -- ``FindFirstFile`` /<br>
> >> ``FindNextFile`` on Windows and ``readdir`` on Linux and OS X --<br>
> ><br>
> > What about FreeBSD, OpenBSD, NetBSD, Solaris, etc. They don't provide readdir?<br>
><br>
> I guess it'd be better to say "Windows" and "Unix-based OSs"<br>
> throughout the PEP? Because all of these (including Mac OS X) are<br>
> Unix-based.</p>
<p dir="ltr">No, just say POSIX.</p>
<p dir="ltr">><br>
> > It looks like the WIN32_FIND_DATA has a dwFileAttributes field. So we<br>
> > should mimic the recent addition to stat_result: the new<br>
> > stat_result.st_file_attributes field. Add DirEntry.file_attributes which<br>
> > would only be available on Windows.<br>
> ><br>
> > The Windows structure also contains<br>
> ><br>
> > &nbsp;FILETIME ftCreationTime;<br>
> > &nbsp;FILETIME ftLastAccessTime;<br>
> > &nbsp;FILETIME ftLastWriteTime;<br>
> > &nbsp;DWORD &nbsp;&nbsp; nFileSizeHigh;<br>
> > &nbsp;DWORD &nbsp;&nbsp; nFileSizeLow;<br>
> ><br>
> > It would be nice to expose them as well. I'm no longer surprised that<br>
> > the exact API differs depending on the OS for functions of the os<br>
> > module.<br>
><br>
> I think you've misunderstood how DirEntry.lstat() works on Windows --<br>
> it's basically a no-op, as Windows returns the full stat information<br>
> with the original FindFirst/FindNext OS calls. This is fairly explicit<br>
> in the PEP, but I'm sure I could make it clearer:<br>
><br>
> &nbsp;&nbsp;DirEntry.lstat(): "like os.lstat(), but requires no system calls on Windows"<br>
><br>
> So you can already get the dwFileAttributes for free by saying<br>
> entry.lstat().st_file_attributes. You can also get all the other<br>
> fields you mentioned for free via .lstat() with no additional OS calls<br>
> on Windows, for example: entry.lstat().st_size.<br>
><br>
> Feel free to suggest changes to the PEP or scandir docs if this isn't<br>
> clear. Note that is_dir()/is_file()/is_symlink() are free on all<br>
> systems, but .lstat() is only free on Windows.<br>
><br>
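</p>
<p dir="ltr">A minimal sketch of the pattern described above, under stated assumptions: it uses the os.scandir() that later shipped in the Python 3.5 standard library, where the draft DirEntry.lstat() is spelled entry.stat(follow_symlinks=False), and the "with" form needs Python 3.6 or later. The helper name list_sizes is hypothetical.</p>

```python
import os


def list_sizes(path):
    """Return {name: size_in_bytes} for regular files directly inside path."""
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # The file type usually comes from the directory listing itself,
            # so this check normally needs no extra system call.
            if entry.is_file(follow_symlinks=False):
                # On Windows this data is already cached from
                # FindFirstFile/FindNextFile; on POSIX systems it
                # costs one lstat()-style call per entry.
                info = entry.stat(follow_symlinks=False)
                sizes[entry.name] = info.st_size
    return sizes
```

<p dir="ltr">On Windows the same stat result also carries st_file_attributes, which corresponds to dwFileAttributes.</p>
<p dir="ltr">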
> > Does your implementation use a free list to avoid the cost of memory<br>
> > allocation? A short free list of 10 or maybe just 1 may help. The free<br>
> > list may be stored directly in the generator object.<br>
><br>
> No, it doesn't. I might add this to the PEP under "possible<br>
> improvements". However, I think the speed increase by removing the<br>
> extra OS call and/or disk seek is going to be way more than memory<br>
> allocation improvements, so I'm not sure this would be worth it.<br>
><br>
> > Does it also support bytes filenames on UNIX?<br>
><br>
> > Python now supports undecodable filenames thanks to PEP 383<br>
> > (surrogateescape). I prefer to use the same type for filenames on<br>
> > Linux and Windows, so Unicode is better. But some users might prefer<br>
> > bytes for other reasons.<br>
><br>
> I forget exactly now what my scandir module does, but for os.scandir()<br>
> I think this should behave exactly like os.listdir() does for<br>
> Unicode/bytes filenames.<br>
><br>
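</p>
<p dir="ltr">To illustrate the "behave exactly like os.listdir()" idea, here is a small sketch, assuming the os.scandir() that ended up in the standard library (Python 3.6+ for the "with" form). The helper name entry_names is hypothetical. Passing a str path yields str names, and passing a bytes path yields bytes names.</p>

```python
import os


def entry_names(path):
    """Return the sorted entry names in path, in the same type as path."""
    with os.scandir(path) as entries:
        # entry.name is str when path is str and bytes when path is bytes,
        # mirroring what os.listdir(path) returns.
        return sorted(entry.name for entry in entries)
```

<p dir="ltr">For any directory d, entry_names(d) should match sorted(os.listdir(d)), and entry_names(os.fsencode(d)) should match sorted(os.listdir(os.fsencode(d))).</p>
<p dir="ltr">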
> > Crazy idea: would it be possible to "convert" a DirEntry object to a<br>
> > pathlib.Path object without losing the cache? I guess that<br>
> > pathlib.Path expects a full stat_result object.<br>
><br>
> The main problem is that pathlib.Path objects explicitly don't cache<br>
> stat info (and Guido doesn't want them to, for good reason I think).<br>
> There was a thread on python-dev about this earlier. I'll add it to a<br>
> "Rejected ideas" section.<br>
><br>
> > I don't understand how you can build a full lstat() result without<br>
> > really calling stat. I see that WIN32_FIND_DATA contains the size, but<br>
> > here you call lstat().<br>
><br>
> See above.<br>
><br>
> > Do you plan to continue to maintain your module for Python < 3.5, but<br>
> > upgrade your module for the final PEP?<br>
><br>
> Yes, I intend to maintain the standalone scandir module for 2.6 <=<br>
> Python < 3.5, at least for a good while. For integration into the<br>
> Python 3.5 stdlib, the implementation will be integrated into<br>
> posixmodule.c, of course.<br>
><br>
> >> Should there be a way to access the full path?<br>
> >> ----------------------------------------------<br>
> >><br>
> >> Should ``DirEntry``'s have a way to get the full path without using<br>
> >> ``os.path.join(path, entry.name)``? This is a pretty common pattern,<br>
> >> and it may be useful to add pathlib-like ``str(entry)`` functionality.<br>
> >> This functionality has also been requested in `issue 13`_ on GitHub.<br>
> >><br>
> >> .. _`issue 13`: <a href="https://github.com/benhoyt/scandir/issues/13">https://github.com/benhoyt/scandir/issues/13</a><br>
> ><br>
> > I think that it would be very convenient to store the directory name<br>
> > in the DirEntry. It should be light, it's just a reference.<br>
> ><br>
> > And provide a fullname() method which would just return<br>
> > os.path.join(path, entry.name) without trying to resolve path to get<br>
> > an absolute path.<br>
><br>
> Yeah, fair suggestion. I'm still slightly on the fence about this, but<br>
> I think an explicit fullname() is a good suggestion. Ideally I think<br>
> it'd be better to mimic pathlib.Path.__str__() which is kind of the<br>
> equivalent of fullname(). But how does pathlib deal with unicode/bytes<br>
> issues if it's the str function which has to return a str object? Or<br>
> at least, it'd be very weird if __str__() returned bytes. But I think<br>
> it'd need to if you passed bytes into scandir(). Do others have<br>
> thoughts?<br>
><br>
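</p>
<p dir="ltr">For reference, a hedged sketch of how this question plays out with the os.scandir() that was eventually added to the standard library: there the full path is exposed as the attribute entry.path rather than a fullname() method or __str__(). The helper name full_paths is hypothetical, and the "with" form assumes Python 3.6+.</p>

```python
import os


def full_paths(path):
    """Return the joined path of every entry in path, without resolving it."""
    with os.scandir(path) as entries:
        # entry.path is equivalent to os.path.join(path, entry.name).
        # It has the same type as path (str or bytes) and is absolute
        # only if path itself was absolute.
        return sorted(entry.path for entry in entries)
```

<p dir="ltr">Because entry.path keeps the type of the argument, bytes in gives bytes out, which sidesteps the problem of a __str__() that would have to return bytes. os.fsdecode(entry.path) converts to str when one is needed.</p>
<p dir="ltr">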
> > Would it be hard to implement the wildcard feature on UNIX to compare<br>
> > the performance of scandir('*.jpg') with and without the wildcard built<br>
> > into os.scandir?<br>
><br>
> It's a good idea, but the problem is that the Windows wildcard<br>
> implementation has a bunch of crazy edge cases where *.ext will catch<br>
> more things than just a simple regex/glob. This was discussed on<br>
> python-dev or python-ideas previously, so I'll dig it up and add to a<br>
> Rejected Ideas section. In any case, this could be added later if<br>
> there's a way to iron out the Windows quirks.<br>
><br>
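</p>
<p dir="ltr">Without a built-in wildcard, the filtering can be done in Python on top of the directory iterator. The following is only a sketch, assuming the standard-library os.scandir() (Python 3.6+ for the "with" form) and fnmatch. The helper name scandir_glob is hypothetical. It applies fnmatch's rules on every platform, so it does not reproduce the Windows-specific wildcard behaviour mentioned above.</p>

```python
import fnmatch
import os


def scandir_glob(path, pattern):
    """Yield DirEntry objects in path whose names match a shell-style pattern."""
    with os.scandir(path) as entries:
        for entry in entries:
            # fnmatch.fnmatch() normalises case with os.path.normcase(), so
            # matching is case-insensitive on Windows and case-sensitive on
            # POSIX systems.
            if fnmatch.fnmatch(entry.name, pattern):
                yield entry
```

<p dir="ltr">For example, [e.name for e in scandir_glob('.', '*.jpg')] lists the JPEG files in the current directory.</p>
<p dir="ltr">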
> -Ben<br>
</p>