[Eric S. Raymond, in search of uniqueness]
> ...
> So, how about `time.time()` + hex(hash([]))?
>
> It looks to me like this will remain unique forever, because
> another thread would have to create an object at the same memory
> address during the same millisecond to collide.

I'm afraid it's much more vulnerable than that:  Python's thread
granularity is at the bytecode level, not the statement level.  It's
very easy for thread A and B to see the same `time.time()` value, and
after that arbitrarily long amounts of time may pass before they get
around to doing the hash([]) business.  When hash() completes, the
storage for [] is immediately reclaimed under CPython, and it's again
very easy for another thread to reuse the storage.

I'm attaching an executable test case.  It uses time.clock() because
that has much higher resolution than time.time() on Windows (better
than microsecond), but rounds it back to three decimal places to
simulate millisecond resolution.  The first three runs:

    saw 14600 unique in 30000 total
    saw 14597 unique in 30000 total
    saw 14645 unique in 30000 total

So it sucks bigtime on my box.

Better idea:  borrow the _ThreadSafeCounter class from the tail end of
the current CVS tempfile.py.  The code works whether or not threads
are available.  Then

    `time.time()` + str(_counter.get_next())

is thread-safe.  For that matter, plain old

    str(_counter.get_next())

will always be unique within a single run.

However, in either case you're still not safe against concurrent
*processes* generating the same cookies.  tempfile.py has to worry
about that too, of course, so the *best* idea is to call
tempfile.mktemp() and leave it at that.  It wastes some time checking
the filesystem for a file of the same name (which, btw, goes much
quicker on Linux than on Windows).

From time to time, somebody suggests adding a uuid generator to
Python.  Not a bad idea, but nobody wants to do all the x-platform
work.

like-capturing-snowflakes-ly y'rs  - tim


from threading import Thread
import time

N = 1000
NTHREADS = 30

class Worker(Thread):
    def __init__(self):
        Thread.__init__(self)

    def run(self):
        self.generated = [`round(time.clock(), 3)` + hex(hash([]))
                          for i in range(N)]

threads = []
for i in range(NTHREADS):
    threads.append(Worker())
for t in threads:
    t.start()

d = {}
total = 0
for t in threads:
    t.join()
    total += len(t.generated)
    for g in t.generated:
        d[g] = 1

print "saw", len(d), "unique in", total, "total"
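
The counter idea from the message can be sketched roughly as follows. This is a minimal illustration in modern Python 3, not the Python 2 of the attached test, and it is not the actual _ThreadSafeCounter code from tempfile.py. The names `ThreadSafeCounter` and `collect_cookies` are assumptions made up for this example; only the `get_next()` method name is taken from the message. The sketch repeats the shape of the test above (30 threads, 1000 cookies each) but draws each cookie from a lock-protected counter.

```python
import itertools
import threading


class ThreadSafeCounter:
    """Hypothetical stand-in for the counter idea; not tempfile.py's class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = itertools.count()

    def get_next(self):
        # Holding the lock makes "fetch the value and advance" one atomic
        # step, so no two threads can ever be handed the same number.
        with self._lock:
            return next(self._count)


def collect_cookies(counter, n_threads=30, per_thread=1000):
    """Generate cookies from many threads at once and return them all."""
    buckets = [[] for _ in range(n_threads)]

    def work(bucket):
        # Each thread appends only to its own list, so the lists
        # themselves need no extra locking.
        for _ in range(per_thread):
            bucket.append(str(counter.get_next()))

    threads = [threading.Thread(target=work, args=(b,)) for b in buckets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [cookie for bucket in buckets for cookie in bucket]


if __name__ == "__main__":
    cookies = collect_cookies(ThreadSafeCounter())
    print("saw", len(set(cookies)), "unique in", len(cookies), "total")
```

Because every value is handed out under the lock, all 30000 cookies in this run are distinct. As the message points out, that guarantee holds only within a single process: two separate processes each running this code would both start counting from 0.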