    >> Still, Unicode or not, the notion that XML-RPC is a data
    >> serialization mechanism instead of a compound data markup language
    >> means you don't need to provide hooks for processing each element, so
    >> full-blown XML parsers tend to be overkill as py-xmlrpc demonstrates.

    Paul> I don't see how that follows. py-xmlrpc needs to handle <struct>
    Paul> different than <array> so it needs to have a "hook" for each of
    Paul> those element types. Having a fixed list of hooks or an extensible
    Paul> array of them should not be much different from a performance
    Paul> point of view.

Sure, <struct> and <array> mean different things, but <struct> will always
mean the same thing in an XML-RPC context. There's no need to provide any
hooks. Once you've successfully parsed a <struct> you get a Python
dictionary. As far as I can tell, sgmlop is always going to be slower than
py-xmlrpc because it must call back to an Unmarshaller instance for each
tag. The only Unmarshaller currently available is the class written in
Python. Pythonware has a FastParser/FastUnmarshaller pair available now
which I don't have access to. Perhaps it exhibits encode/decode speeds
similar to py-xmlrpc. You'll have to ask Fredrik.

Py-xmlrpc was written with the knowledge that intermediate results aren't
useful and that, as you put it, XML-RPC has a fixed vocabulary. Why
structure a parser to accommodate situations that never arise?

    Paul> Yes, an incomplete XML parser could be faster if it ignores
    Paul> Unicode, ignores character references, and does not do all of the
    Paul> error checking required by the spec. I'm not sure if this would
    Paul> really improve performance anyhow.

Does py-xmlrpc have a ways to go? Sure. It's still pretty new software, so
give it time. You seem to be dismissing it completely because it's not as
mature as, say, Expat. I doubt it will lose a factor of 8 in encoding speed
or a factor of 24 in decoding speed (the current speed advantages I measure
over xmlrpclib 1.0b4 using sgmlop) when those things are all added. I'm not
sure all those things will ever be needed, but you're welcome to think they
will.

    Paul> py-xmlrpc is probably faster because it doesn't call out to Python
    Paul> code until the entire message has been parsed. xmlrpclib on the
    Paul> other hand, is entirely written in Python. Is there a Python
    Paul> XML-RPC implementation that uses no Python code but does use a
    Paul> true XML parser?

That's precisely why py-xmlrpc is faster. Should it behave some other way?
I don't think there is another XML-RPC parser out there that is available
from Python but that doesn't use Python.

    >> ... No matter how hard Shilad finds it to add Unicode support to his
    >> package, it's still likely to be miles ahead of other XML parsers.

    Paul> I think you are exaggerating the benefit of having a fixed
    Paul> vocabulary. There is hardly any performance boost possible based
    Paul> on that one detail.

I don't see how you can't make that connection. XML-RPC has a fixed
vocabulary and never needs to look at intermediate results. It sounds to me
like all you have is a hammer so everything looks like a nail. There are
places for general-purpose XML parsers and places for special-purpose XML
parsers. In this particular context I only care about how fast I can push
objects between a client and server using XML-RPC.

I apologize if the subject seems more general than I intended. My only
intention was to compare the data serialization performance of various
tools. I didn't include "XML-RPC" in the subject of this thread because I
tossed in marshal and cPickle results as well, simply for comparison.

Skip
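
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a round-trip timing harness. It is not the script behind the numbers above. It assumes a modern Python, where the standard library's xmlrpc.client stands in for xmlrpclib and pickle stands in for cPickle; py-xmlrpc and sgmlop are left out because they may not be installed. The SAMPLE payload and all function names are hypothetical.

```python
import marshal
import pickle
import timeit
import xmlrpc.client

# Hypothetical payload: the data used in the original measurements is not
# shown in the thread, so this is only a stand-in.
SAMPLE = {
    "name": "example",
    "values": [1, 2, 3],
    "nested": {"flag": True, "ratio": 0.5},
}


def xmlrpc_roundtrip(obj):
    """Encode obj as an XML-RPC parameter list and decode it again."""
    # dumps() expects a tuple of parameters; loads() returns
    # (params, methodname), where methodname is None here.
    data = xmlrpc.client.dumps((obj,))
    params, _method = xmlrpc.client.loads(data)
    return params[0]


def marshal_roundtrip(obj):
    """Serialize and deserialize obj with marshal."""
    return marshal.loads(marshal.dumps(obj))


def pickle_roundtrip(obj):
    """Serialize and deserialize obj with pickle."""
    return pickle.loads(pickle.dumps(obj))


ROUNDTRIPS = {
    "xmlrpc": xmlrpc_roundtrip,
    "marshal": marshal_roundtrip,
    "pickle": pickle_roundtrip,
}


def time_roundtrips(obj, number=1000):
    """Return total seconds spent on `number` round trips per serializer."""
    return {
        name: timeit.timeit(lambda func=func: func(obj), number=number)
        for name, func in ROUNDTRIPS.items()
    }


if __name__ == "__main__":
    for name, seconds in time_roundtrips(SAMPLE).items():
        print(f"{name:8s} {seconds:.4f} s for 1000 round trips")
```

The harness times encode and decode together. Splitting them, as the factor-of-8 and factor-of-24 figures above do, would mean timing the dumps and loads calls separately. Absolute numbers will depend heavily on the payload and the Python version.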