[Jeremy, please skip forward to where it says "Stevens" or "Jeremy" and comment.]

> Glad to hear that _fileobject works well.

I didn't say that. I was hoping it does though. :-)

> Is there any benefit to having the file code in C?

Some operations (notably pickle.dump() and .load()) only work with real
files. Other operations (e.g. printing a large list or dict) can be
faster to real files because they don't have to build the str() or
repr() of the whole thing as a string first.

> I bet the python code isn't that much slower.

You're on. Write a benchmark. I notice that httplib uses makefile(),
and often reads very small chunks.

> It does seem a shame to have to switch between the two. Maybe one
> solution is that a makefile should set blocking back on if a timeout
> exists?

That's not very nice. It could raise an exception. But you could set
the timeout on the socket *after* calling makefile(), and then you'd be
hosed. But if we always use the Python makefile(), the problem is
solved.

> That would solve the problem and is a consistent change since it only
> checks timeout behavior (= 2 lines of code). I'd vote for using the
> python fileobject for both. Some profiling of the two would be nice
> if you have the time.

I don't, maybe you do? :-)

> > I'm also thinking of implementing the socket wrapper (which is
> > currently a Python class that has a "real" socket object as an
> > instance variable) as a subclass of the real socket class instead.
>
> Er, what's the difference between that and _socketobject in
> socket.py? Why not just use the python bit consistently?

Making it a subclass should be faster. But I don't know yet if it can
work -- I probably have to change the constructor at the C level to be
able to support dup() (which is the other reason for the Python
wrapper).

> Is it really so painful for apps to keep track of all their sockets
> and then do something like:
>
>     for sock in sock_list:
>         sock.settimeout(blah)
>
> Why keep track of them in the socket module, unless there's already
> code for this.

Steve Holden already answered this one. Also, you don't have to keep
track of all sockets -- you just have to apply the timeout if one is
set in a global variable.

> Well, we agreed to do some clean-up in a separate patch but you seem
> anxious to get it in there :)

I am relaxing now, waiting for you to pick up again.

> Well, this was the initial reason to use the selectmodule.c code.
> There's got to be a way to share code between the two for bare access
> to select, since someone else might want to use such functionality
> one day (and this has set the precedent). Why not make a small change
> to selectmodule.c that opens up the code in a C API of some sort? And
> then have select_select use that function internally.

If two modules are both shared libraries, it's really painful to share
entry points (see the interface between _ssl.c and socketmodule.c for
an example -- it's written down in a large comment block in
socketmodule.h). I really think one should be able to use select() in
more than one file. At least it works on Windows. ;-)

> > - I'm not sure that the handling of timeout errors in accept(),
> > connect() and connect_ex() is 100% correct (this code sniffs the
> > error code and decides whether to retry or not).
>
> This is how the original timeoutsocket.py did it and it seems to be
> the way to do blocking connects.

I understand all that. My comment is that your patch changed the
control flow in non-blocking mode too.
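
For concreteness, the "sniff the error code and retry" style of timed
connect we are discussing looks roughly like this at the Python level.
This is only an illustrative sketch: the real logic lives in C in
socketmodule.c, and the helper name and the exact set of error codes
below are made up.

    import errno, select, socket

    def connect_with_timeout(sock, address, timeout):
        # Illustrative only: put the fd in nonblocking mode, start the
        # connect, sniff the error code, and use select() to wait for
        # the connect to complete within the timeout.
        sock.setblocking(False)
        err = sock.connect_ex(address)
        if err in (0, errno.EISCONN):
            return                    # connected right away (or already)
        if err not in (errno.EINPROGRESS, errno.EWOULDBLOCK,
                       errno.EALREADY):
            raise socket.error(err, "connect failed")
        # Wait until the socket becomes writable, i.e. the connect finished.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise socket.error("timed out")
        # The connect may still have failed; SO_ERROR holds the real result.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise socket.error(err, "connect failed")

The point of my comment is that a socket the user explicitly put in
non-blocking mode should never be routed through this kind of
select-and-retry logic at all.
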
I think I accidentally fixed it by setting sock_timeout to -1.0 in
setblocking(). But I'm not 100% sure, so I'd like this aspect to be
unit-tested thoroughly.

> Ok.. Should I merge the test_timeout.py and test_socket.py as well
> then?

No, you can create as many (or as few) unit test files as you need.

> A little off-topic, while I was thinking of restructuring these
> tests, I was wondering what might be the best way to structure a unit
> test where things have to work in separate processes/threads. What
> I'd really like to do is:
>
> * Have the setup function set up server and client sockets
> * Have the tear-down function close them
> * Have some synchronization function or simple message (this is
>   socket specific)
> * Then have a test that has access to both threads and can insert
>   call-backs to run in each thread.
>
> This seems tricky with the current unit-test framework. The way I'd
> do it is using threading.Event() and have the thing block until the
> server side and client side respectively submit their test callbacks.
> But it might be nice to have a general class that can be added to the
> testing framework. Thoughts?

If I were you I'd worry about getting it right once first. Then we can
see if there's room for generalization. (You might want to try to
convert test_socketserver.py to your proposed framework to see how
well it works.)

> > Can you explain why on Windows you say that the socket is connected
> > when connect() returns a WSAEINVAL error?
>
> This is what timeoutsocket.py used as the Unix equivalent error
> codes, and since I'm not set up to test Windows and since it was
> working code, I took their word for it.

Well, but WSAEINVAL can also be returned for other conditions. See

    http://msdn.microsoft.com/library/en-us/winsock/wsapiref_8m7m.asp

It seems that the *second* time you call connect(), WSAEINVAL can only
mean that you're already connected. But if this socket is in a
different state, and the connect() is simply not appropriate, I don't
like the fact that connect() would simply return "success" rather than
reporting an error. E.g. I could do this:

    s = socket()
    s.settimeout(100)
    s.connect((host, port))
    .
    .
    .
    # By mistake:
    s.connect((otherhost, otherport))

I want the latter connect() to fail, but I think your code will make
it succeed.

> > Also, your code enters this block even in non-blocking mode, if a
> > timeout was also set. (Fortunately I fixed this for you by setting
> > the timeout to -1.0 in setblocking(). Unfortunately there's still a
> > later test for !sock_blocking in the same block that cannot succeed
> > any more because of that.)
>
> This confusion is arising because of the restructuring of the code.
> Erm, this check applies if we have a timeout but are in non-blocking
> mode. Perhaps you changed this? To make it clearer, originally before
> the v2 of the patch, the socket was always in non-blocking mode, so
> it was necessary to check whether we were examining error codes with
> non-blocking in mind, or whether we were checking for possible
> timeout behavior. Since we've changed this, it now checks if
> non-blocking has been set while a timeout has been set. Seems valid
> to me...

But I changed that again: setblocking() now always disables the
timeout. Read the new source in CVS.
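
Coming back to your test-harness question, to make sure we're talking
about the same thing: the Event-based setup you describe might look
roughly like the sketch below. The class names and the
client_callback() hook are made up; nothing like this exists in the
test framework today.

    import socket
    import threading
    import unittest

    class ThreadedSocketTest(unittest.TestCase):
        # Made-up base class: the test method drives the server side while
        # a background thread runs the client callback; Events keep them
        # in step.

        def setUp(self):
            self.server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.server.bind(("127.0.0.1", 0))
            self.server.listen(1)
            self.port = self.server.getsockname()[1]
            self.server_ready = threading.Event()
            self.client_done = threading.Event()
            self.thread = threading.Thread(target=self._client_thread)
            self.thread.start()

        def tearDown(self):
            self.server_ready.set()    # unblock the client if the test didn't
            self.client_done.wait()
            self.thread.join()
            self.server.close()

        def _client_thread(self):
            # Exceptions raised here are not reported as test failures; a
            # real harness would capture and re-raise them in the main thread.
            self.server_ready.wait()   # don't connect before the server is ready
            client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            try:
                client.connect(("127.0.0.1", self.port))
                self.client_callback(client)
            finally:
                client.close()
                self.client_done.set()

        def client_callback(self, sock):
            pass                       # overridden per test

    class ExampleTest(ThreadedSocketTest):

        def client_callback(self, sock):
            sock.sendall(b"ping")      # client side of this particular test

        def test_accept_and_recv(self):
            self.server_ready.set()
            conn, addr = self.server.accept()
            self.assertEqual(conn.recv(1024), b"ping")
            conn.close()

A per-test subclass then only has to fill in the two sides of the
conversation.
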
> > The same concerns apply to connect_ex() and accept(), which have
> > very similar logic.
> >
> > I believe it is possible on some Unix variants (maybe on Linux) that
> > when select indicates that a socket is ready, if the socket is in
> > nonblocking mode, the call will return an error because some kernel
> > resource is unavailable. This suggests that you may have to keep the
> > socket in blocking mode except when you have to do a connect() or
> > accept() (for which you can't do a select without setting the socket
> > in nonblocking mode first).
>
> Not sure about this. Checking the man pages, the error codes seem to
> be the thing to check to determine what the behavior is. Perhaps you
> could clarify?

When a timeout is set, the socket file descriptor is always in
nonblocking mode. Take sock_recv() for example. It calls
internal_select(), and if that returns >= 1, it calls recv(). But
according to the Stevens books, it is still possible (under heavy
load) that the recv() returns an EWOULDBLOCK error. (We ran into this
while debugging a high-performance application based on Spread. The
select() succeeded, but the recv() failed, because the socket was in
nonblocking mode. Well, I'm *almost* sure that this was the case --
Jeremy Hylton should know the details.)

> > Looking at connect_ex, it seems to be missing the "res = errno" bit
> > in the case where it says "we're already connected". It used to
> > return errno here, now it will return -1. Maybe the conex_finally
> > label should be moved up to before the "if (res != 0) {" line?
>
> Ah yes. I didn't look closely enough at the Windows bit. On Linux it
> isn't necessary. Let's move it up.

OK, done.

> > OTOH, a timeout of 0 behaves very similarly to nonblocking mode --
> > similar enough that a program that uses setblocking(0) would
> > probably also work when using settimeout(0). I kind of like the
> > idea of having only a single internal flag value, sock_timeout,
> > rather than two (sock_timeout and sock_blocking).
>
> But one throws an exception and one doesn't.

What do you mean? In nonblocking mode you get an exception when the
socket isn't ready too.

> It seems to me that setting a timeout of 0 is sort of an error, if
> anything. It'll be an easy way to do a superficial test of the
> functionality in the regression test.

OK, we don't seem to be able to agree on this. I'll let your wisdom
prevail.

> > The tests don't look very systematic. There are many cases (default
> > bufsize, unbuffered, bufsize=1, large bufsize; combine with read(),
> > readline(), read a line larger than the buffer size, etc.). We need
> > a more systematic approach to unit testing here.
>
> Ok, so to recap which tests we want:
>
> * Default read()
> * Read with size given
> * Unbuffered read
> * Large buffer size
> * Mix read and readline
> * Do a readline
> * Do a readline larger than the buffer size
>
> Any others in the list?

Check the socket.py source code and make sure that every code path
through every method is taken at least once.

--Guido van Rossum (home page: http://www.python.org/~guido/)
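
P.S. To make "systematic" a bit more concrete, I'm thinking of
something along these lines for the file object tests. This is just a
sketch: the test data, buffer sizes, and class name are made up, and it
only covers part of the matrix above.

    import socket
    import unittest

    # Made-up test data: a short line, a very long line, and a final line.
    TEST_DATA = b"first line\n" + b"x" * 8192 + b"\nlast line\n"

    class FileObjectTests(unittest.TestCase):

        # Default, unbuffered, tiny, small, and large buffer sizes.
        bufsizes = [-1, 0, 1, 8, 8192, 100000]

        def check_read_and_readline(self, bufsize):
            # socketpair() keeps the sketch self-contained (Unix only; on
            # Windows a listening socket plus connect() would be needed).
            server, client = socket.socketpair()
            try:
                client.sendall(TEST_DATA)
                client.shutdown(socket.SHUT_WR)       # so read() sees EOF
                f = server.makefile("rb", bufsize)
                self.assertEqual(f.readline(), b"first line\n")  # readline
                self.assertEqual(f.read(4), b"xxxx")             # sized read
                rest = f.read()                                  # read to EOF
                self.assertTrue(rest.endswith(b"last line\n"))
                f.close()
            finally:
                server.close()
                client.close()

        def test_all_buffer_sizes(self):
            # Repeat the same checks for every interesting buffer size; for
            # the small sizes the first readline() is longer than the
            # buffer, which covers the "line larger than the buffer" case.
            for bufsize in self.bufsizes:
                self.check_read_and_readline(bufsize)

Repeating the same checks for every interesting buffer size is the main
point; the individual assertions can grow from there.
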