Select a random item from a list/tuple/data structure in Python

One of the most common tasks that requires random action is selecting one item from a group, be it a character from a string, unicode, or buffer, a byte from a bytearray, or an item from a list, tuple, set, or xrange. It’s also common to want a sample of more than one item.

Don’t do this when randomly selecting an item

A naive approach to these tasks involves something like the following; to select a single item, you would use randrange (or randint) from the random module, which generates a pseudo-random integer from the range indicated by its arguments:

import random
 
items = ['here', 'are', 'some', 'strings', 'of',
         'which', 'we', 'will', 'select', 'one']
 
rand_item = items[random.randrange(len(items))]

An equally naive way to select multiple items might use random.randrange to generate indexes inside a list comprehension, as in:

rand_items = [items[random.randrange(len(items))]
              for _ in range(4)]

These work, but as you should expect if you’ve been writing Python for any length of time, there’s a built-in way of doing it that is briefer and more readable.

Do this instead, when selecting an item

The Pythonic way to select a single item from a Python sequence type (any of str, unicode, list, tuple, bytearray, buffer, or xrange) is to use random.choice. For example, the last line of our single-item selection would be:

rand_item = random.choice(items)

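Much simpler, isn’t it? There’s an equally simple way to select n items from the sequence. Unlike the list comprehension above, which can pick the same position more than once, random.sample selects without replacement, so n must not exceed the length of the sequence: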

rand_items = random.sample(items, n)
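
The difference between sampling with and without replacement can be sketched as follows. This is a minimal illustration, assuming Python 3.6 or later (random.choices does not exist in Python 2); the variable names without_repeats, with_repeats, and too_many_error are made up for the example.

```python
import random

items = ['here', 'are', 'some', 'strings', 'of',
         'which', 'we', 'will', 'select', 'one']

# Without replacement: four distinct positions are chosen, and since
# every string in items is unique, the result has no repeats.
without_repeats = random.sample(items, 4)

# With replacement (Python 3.6+): the same string may appear twice,
# which matches the behavior of the randrange list comprehension.
with_repeats = random.choices(items, k=4)

# Asking sample() for more items than the sequence holds is an error.
try:
    random.sample(items, len(items) + 1)
except ValueError as exc:
    too_many_error = exc
```

If repeats are acceptable, random.choices is probably the closer replacement for the naive comprehension; if they are not, random.sample is.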

Randomly selecting from a set

Sets are not indexable, so set([1, 2, 3])[0] raises a TypeError. random.choice therefore does not support sets, but random.sample does.

For example:

>>> from random import choice, sample
>>>
>>> # INVALID: set([1, 2, 3])[0]
>>> choice(set([1, 2, 3, 4, 5]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<python-dist>/random.py", line 275, in choice
    return seq[int(self.random() * len(seq))]  # raises IndexError if seq is empty
TypeError: 'set' object does not support indexing

There are several ways around this. Two of them are converting the set to a list first and using random.sample, which accepts sets in Python 2 and in Python 3 before 3.11 (set support was deprecated in 3.9 and removed in 3.11).

Example:

>>> from random import choice, sample
>>>
>>> # Convert the set to a list
>>> choice(list(set([1, 2, 3])))
1
>>>
>>> # random.sample(), selecting 1 random element
>>> sample(set([1, 2, 3]), 1)
[1]
>>> sample(set([1, 2, 3]), 1)[0]
3
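
Because converting to a sequence works in every Python version, the conversion can be wrapped in small helpers. The sketch below is one possible approach, not a standard-library feature: choice_from_set and sample_from_set are hypothetical names, and using sorted() assumes the set's elements can be compared with one another.

```python
import random


def choice_from_set(values, rng=random):
    """Return one random element of a set (hypothetical helper)."""
    # sorted() turns the set into a list with a stable order, so a
    # seeded generator gives reproducible picks across runs.
    return rng.choice(sorted(values))


def sample_from_set(values, k, rng=random):
    """Return k distinct random elements of a set (hypothetical helper)."""
    return rng.sample(sorted(values), k)


picked = choice_from_set({1, 2, 3, 4, 5})
subset = sample_from_set({1, 2, 3, 4, 5}, 2)
```

If the elements are not mutually comparable, list(values) or tuple(values) can stand in for sorted(values), at the cost of an order that may differ between runs.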

Duplicate Items

If the sequence contains duplicate values, each occurrence is an independent candidate for selection, so a value that appears more often is more likely to be picked. To give every distinct value an equal chance, one method is to convert the list into a set and back into a list. For example:

>>> import random
>>> my_list = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
>>> my_set = set(my_list)
>>> my_list = list(my_set) # No duplicates
>>> my_list
[1, 2, 3, 4, 5]
>>> my_elem = random.choice(my_list)
>>> my_elem
2
>>> another_elem = random.choice(list(set([1, 1, 1])))
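
The effect of duplicates on the odds can be checked empirically. The following is a rough sketch rather than a rigorous test: the seed, the number of draws, and all variable names are arbitrary choices made for illustration.

```python
import random
from collections import Counter

rng = random.Random(12345)  # fixed seed, chosen arbitrarily

weighted = [1, 1, 1, 2]         # the value 1 occupies 3 of the 4 slots
unique = sorted(set(weighted))  # [1, 2]: each value occupies one slot

draws = 10000
weighted_counts = Counter(rng.choice(weighted) for _ in range(draws))
unique_counts = Counter(rng.choice(unique) for _ in range(draws))

# Fraction of draws that returned the value 1 in each case.
share_weighted = weighted_counts[1] / float(draws)  # expected near 0.75
share_unique = unique_counts[1] / float(draws)      # expected near 0.50
```

With the duplicates left in, the value 1 should come up in roughly three quarters of the draws; after de-duplication it should come up about half the time.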
