A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from http://stackoverflow.com/questions/124671/picking-a-random-element-from-a-set below:

java - Picking a random element from a set

Unfortunately, this cannot be done efficiently (better than O(n)) in any of the Standard Library set containers.

This is odd, since it is very easy to add a randomized pick function to hash sets as well as binary sets. In a not to sparse hash set, you can try random entries, until you get a hit. For a binary tree, you can choose randomly between the left or right subtree, with a maximum of O(log2) steps. I've implemented a demo of the later below:

import random

class Node:
    def __init__(self, object):
        self.object = object
        self.value = hash(object)
        self.size = 1
        self.a = self.b = None

class RandomSet:
    def __init__(self):
        self.top = None

    def add(self, object):
        """ Add any hashable object to the set.
            Notice: In this simple implementation you shouldn't add two
                    identical items. """
        new = Node(object)
        if not self.top: self.top = new
        else: self._recursiveAdd(self.top, new)
    def _recursiveAdd(self, top, new):
        top.size += 1
        if new.value < top.value:
            if not top.a: top.a = new
            else: self._recursiveAdd(top.a, new)
        else:
            if not top.b: top.b = new
            else: self._recursiveAdd(top.b, new)

    def pickRandom(self):
        """ Pick a random item in O(log2) time.
            Does a maximum of O(log2) calls to random as well. """
        return self._recursivePickRandom(self.top)
    def _recursivePickRandom(self, top):
        r = random.randrange(top.size)
        if r == 0: return top.object
        elif top.a and r <= top.a.size: return self._recursivePickRandom(top.a)
        return self._recursivePickRandom(top.b)

if __name__ == '__main__':
    s = RandomSet()
    for i in [5,3,7,1,4,6,9,2,8,0]:
        s.add(i)

    dists = [0]*10
    for i in xrange(10000):
        dists[s.pickRandom()] += 1
    print dists

I got [995, 975, 971, 995, 1057, 1004, 966, 1052, 984, 1001] as output, so the distribution seams good.

I've struggled with the same problem for myself, and I haven't yet decided weather the performance gain of this more efficient pick is worth the overhead of using a python based collection. I could of course refine it and translate it to C, but that is too much work for me today :)


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4