User Guide¶
Bitsets are immutable ordered sets drawn from a predefined finite sequence of hashable items.
They are implemented as pure Python integers representing the rank of the
set in colexicographical order (a.k.a bit strings, powers of two, binary
strings). Hence, they can be very space-efficient, e.g. if a large number of
subsets from a collection needs to be present in memory. Furthermore, they can
also be compared, intersected, etc. using normal bitwise operations of
integers (&, |, ^, ~
).
Installation¶
bitsets
is a pure-python package that runs under Python 3.7+.
To install it with using pip, run the following command:
$ pip install bitsets
There are no other dependencies. Optionally, you can install Graphviz and this Python wrapper to be able to draw Hasse diagrams of all possible sets from a bitsets’ object pool (see Advanced Usage).
Creation¶
Use the bitset()
function to create a class representing ordered
subsets from a fixed set of items (the domain):
>>> from bitsets import bitset
>>> PYTHONS = ('Chapman', 'Cleese', 'Gilliam', 'Idle', 'Jones', 'Palin')
>>> Pythons = bitset('Pythons', PYTHONS)
The domain needs to be a hashable duplicate-free sequence of at least one item
(usually, a tuple
).
The resulting class is an integer (int
) subclass, so its instances (being
integers) are immutable and hashable and thus in many ways similar to
Python’s built-in frozenset
.
>>> issubclass(Pythons, int)
True
Each domain item is mapped to a power of two (their rank in colexicographical order):
>>> Pythons.fromint(0)
Pythons()
>>> [Pythons.fromint(1), Pythons.fromint(2), Pythons.fromint(4)]
[Pythons(['Chapman']), Pythons(['Cleese']), Pythons(['Gilliam'])]
>>> Pythons.fromint(3)
Pythons(['Chapman', 'Cleese'])
>>> Pythons.fromint(2 ** 6 - 1)
Pythons(['Chapman', 'Cleese', 'Gilliam', 'Idle', 'Jones', 'Palin'])
>>> Pythons.fromint((1 << 0) + (1 << 5))
Pythons(['Chapman', 'Palin'])
The class provides access to the minimal (infimum
) and
maximal (supremum
) sets from its domain:
>>> Pythons.infimum
Pythons()
>>> Pythons.supremum
Pythons(['Chapman', 'Cleese', 'Gilliam', 'Idle', 'Jones', 'Palin'])
Basic usage¶
Bitsets can be created from members, bit strings, boolean sequences, and integers:
>>> Pythons(['Palin', 'Cleese'])
Pythons(['Cleese', 'Palin'])
>>> Pythons.frombits('101000')
Pythons(['Chapman', 'Gilliam'])
>>> Pythons.frombools([True, False, True, False, False, False])
Pythons(['Chapman', 'Gilliam'])
>>> Pythons.fromint(5)
Pythons(['Chapman', 'Gilliam'])
Members always occur in the definition order.
Bitsets cannot contain items other than those from their domain:
>>> Pythons(['Brian'])
Traceback (most recent call last):
...
KeyError: 'Brian'
>>> 'Spam' in Pythons(['Jones'])
Traceback (most recent call last):
...
KeyError: 'Spam'
Bitsets can be converted to members, bit strings, boolean sequences and integers:
>>> Pythons(['Chapman', 'Gilliam']).members()
('Chapman', 'Gilliam')
>>> Pythons(['Chapman', 'Gilliam']).bits()
'101000'
>>> Pythons(['Chapman', 'Gilliam']).bools()
(True, False, True, False, False, False)
>>> int(Pythons(['Chapman', 'Gilliam']))
5
Sorting¶
To facilitate sorting collections of bitsets, they have key methods for
different sort orders (shortlex()
, shortcolex()
,
longlex()
, and longcolex()
):
>>> Pythons(['Idle']).shortlex() < Pythons(['Palin']).shortlex()
True
These orderings are derived from the number of set members and the definition order of the items.
>>> Digits = bitset('Digits', '12345')
>>> onetwo = [d for d in Digits('12345').powerset() if d.count() in (1, 2)]
>>> shortlex = sorted(onetwo, key=lambda d: d.shortlex())
>>> [''.join(d) for d in shortlex]
['1', '2', '3', '4', '5',
'12', '13', '14', '15',
'23', '24', '25',
'34', '35',
'45']
>>> shortcolex = sorted(onetwo, key=lambda d: d.shortcolex())
>>> [''.join(d) for d in shortcolex]
['1', '2', '3', '4', '5',
'12',
'13', '23',
'14', '24', '34',
'15', '25', '35', '45']
Sorting a collection of bitsets without the use of a key function will order them in colexicographical order.
>>> [''.join(d) for d in sorted(onetwo)]
['1',
'2', '12',
'3', '13', '23',
'4', '14', '24', '34',
'5', '15', '25', '35', '45']
Powersets¶
Iterate over a bitsets’ powerset()
in short lexicographic order:
>>> for p in Pythons(['Palin', 'Idle']).powerset():
... print(p.members())
()
('Idle',)
('Palin',)
('Idle', 'Palin')
This is the same order as generated by itertools
recipes
powerset(iterable)
.
frozenset
compatibility¶
For convenience, bitsets provide the same methods as frozenset
(i.e. issubset()
, issuperset()
,
isdisjoint()
, intersection()
,
union()
, difference()
,
symmetric_difference()
, __len__()
,
__iter__()
, __bool__()
,
__contains__()
, and as a non-op copy()
).
>>> 'Cleese' in Pythons(['Idle'])
False
>>> 'Idle' in Pythons(['Idle'])
True
>>> Pythons(['Chapman', 'Idle']).intersection(Pythons(['Idle', 'Palin']))
Pythons(['Idle'])
Note, however that all the operators methods (+, -, &, |
etc.) retain
their integer semantics:
>>> print(Pythons(['Chapman', 'Idle']) - Pythons(['Idle']))
1
In tight loops it might be worth to use bitwise expressions (&, |, ^, ~
)
for set comparisons/operations instead of the frozenset
-compatible
methods:
>>> # is subset ?
>>> Pythons(['Idle']) & Pythons(['Chapman', 'Idle']) == Pythons(['Idle'])
True
Added functionality¶
Differing from frozenset
, you can also retrieve the
complement()
set of a bitset:
>>> Pythons(['Idle']).complement()
Pythons(['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin'])
>>> Pythons().complement().complement()
Pythons()
Test if a bitset is maximal (the supremum
):
>>> Pythons(['Idle']).all()
False
>>> Pythons(['Chapman', 'Cleese', 'Gilliam', 'Idle', 'Jones', 'Palin']).all()
True
Test if a bitset is non-minimal (the infimum
), same as
bool(bitset)
:
>>> Pythons(['Idle']).any()
True
>>> Pythons().any()
False