Bitfields

I recently needed to interpret bitfields in Python.
I’m quite happy with the three lines of code that I came up with, so I thought I’d share them in case they are of use to anyone else.

A bitfield is a binary number in which each bit is assigned a meaning and is either set (‘1’, True) or clear (‘0’, False).
Bitfields are usually interpreted using bit shifting and bitwise AND operations, but that seemed quite involved for getting the data into a usable form, so I found another way.
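
For comparison, the conventional approach mentioned above can be sketched roughly as follows. The flag names and the value 5 are made up purely for illustration and are not part of any real attribute.

```python
# Hypothetical flag masks for illustration; the 1st bit is the least significant.
FLAG_A = 1 << 0  # 1st bit, value 1
FLAG_B = 1 << 1  # 2nd bit, value 2
FLAG_C = 1 << 2  # 3rd bit, value 4

value = 5  # binary 101

# A bitwise AND with a mask is non-zero only when that bit is set.
a_set = bool(value & FLAG_A)
b_set = bool(value & FLAG_B)

# Alternatively, shift the wanted bit down to position 0 and mask with 1.
c_set = bool((value >> 2) & 1)

print(a_set, b_set, c_set)
```

Each flag needs its own mask or shift, which is the part that felt involved when all I wanted was a dictionary of booleans.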

Consider the pwdProperties attribute from Active Directory (http://msdn.microsoft.com/en-us/library/ms679431(v=vs.85).aspx), which holds several password policy settings for the domain as a bitfield and can be retrieved using an LDAP query.

The bits of this attribute, counting from the right (least significant bit first), have the following meanings:
1st bit = DOMAIN_PASSWORD_COMPLEX
2nd bit = DOMAIN_PASSWORD_NO_ANON_CHANGE
3rd bit = DOMAIN_PASSWORD_NO_CLEAR_CHANGE
4th bit = DOMAIN_LOCKOUT_ADMINS
5th bit = DOMAIN_PASSWORD_STORE_CLEARTEXT
6th bit = DOMAIN_REFUSE_PASSWORD_CHANGE

So if the pwdProperties attribute has a value of 17 in decimal, which is 010001 in binary, the 1st and 5th bits (counting from the right) are set to 1, indicating that the domain requires complex passwords and stores passwords in cleartext.
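
The arithmetic above can be checked quickly in Python. This is only a sanity check of the example value 17; the variable names are my own.

```python
value = 17

# Six binary digits, left-padded with zeros: 16 + 1 = 17.
binary = format(value, "06b")

# Bit positions counted from the right, starting at 1.
set_bits = [position for position in range(1, 7)
            if (value >> (position - 1)) & 1]

print(binary)    # 010001
print(set_bits)  # [1, 5]
```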

Using python-ldap, this attribute is returned in a dictionary as a decimal number represented as a string within a list, for example:

attrs = {'pwdProperties': ['17']}

So the first step is to extract the string of the number and convert it to an integer:

pwd_properties = int(attrs['pwdProperties'][0])

Next, the integer is converted to a string of binary digits, left-padded with zeros to the correct length (six bits here):

pwd_properties = format(pwd_properties, "06b")

Then the string of binary digits is split into a list of single characters:

pwd_properties = list(pwd_properties)

For my purposes I needed each bit to be represented as a boolean. To do this, the string replace() method is used to replace instances of ‘0’ with an empty string, and then the bool() function converts the result to either True or False while iterating over the list. (Note that an empty string is False and every other string, including ‘0’, is True.)

bitfield_values = [bool(w.replace('0', '')) for w in pwd_properties]
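
To illustrate the truthiness rule this relies on, here is a small sketch. It also shows a direct comparison with '1', which is an alternative I am suggesting here rather than part of the original three lines; both produce the same list.

```python
# Only the empty string is falsy; '0' on its own is still truthy.
empty_is_true = bool("")
zero_is_true = bool("0")
one_is_true = bool("1")

bits = list("010001")

# The replace() approach: '0' becomes '' (False), '1' stays '1' (True).
via_replace = [bool(bit.replace("0", "")) for bit in bits]

# Alternative: compare each character with '1' directly.
via_compare = [bit == "1" for bit in bits]

print(via_replace)
```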

Next, a list containing the meaning of each bit is defined (make sure they are in the correct order to match the bits, most significant bit first):

bitfield_keys = ['refuse_password_change', 'password_store_cleartext', 'lockout_admins', 'password_no_clear_change', 'password_no_anon_change', 'password_complex']

The two lists can then be paired up using zip(), and the resulting pairs are used to create a dictionary using dict():

pwd_properties = dict(zip(bitfield_keys, bitfield_values))

Finally, this can all be condensed into:

bitfield_keys = ['refuse_password_change', 'password_store_cleartext', 'lockout_admins', 'password_no_clear_change', 'password_no_anon_change', 'password_complex']
bitfield_values = [bool(w.replace('0', '')) for w in list(format(int(attrs['pwdProperties'][0]), '06b'))]
pwd_properties = dict(zip(bitfield_keys, bitfield_values))

Resulting in a dictionary like this:

{'password_store_cleartext': True,
'password_no_anon_change': False,
'lockout_admins': False,
'refuse_password_change': False,
'password_no_clear_change': False,
'password_complex': True}
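
If the same decoding is needed for more than one attribute, the steps could be wrapped in a small helper along these lines. The function name decode_bitfield and the range check are my own additions for illustration: the check is there because a value with more bits than there are keys would produce a longer string, and zip() would then silently pair the keys with the wrong bits.

```python
def decode_bitfield(value, keys):
    """Map each key to the state of its bit.

    keys are listed from the most significant bit to the least.
    """
    number = int(value)
    width = len(keys)
    # Reject values that do not fit in the number of bits described by keys.
    if number < 0 or number >> width:
        raise ValueError("value does not fit in {} bits".format(width))
    bits = format(number, "0{}b".format(width))
    return dict(zip(keys, (bit == "1" for bit in bits)))


bitfield_keys = ['refuse_password_change', 'password_store_cleartext',
                 'lockout_admins', 'password_no_clear_change',
                 'password_no_anon_change', 'password_complex']
attrs = {'pwdProperties': ['17']}

pwd_properties = decode_bitfield(attrs['pwdProperties'][0], bitfield_keys)
print(pwd_properties)
```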

A limitation of this method is that it is not easy to go from the resulting dictionary back to the bitfield, because dictionaries in Python versions before 3.7 do not preserve order. On those versions this can probably be overcome by using an OrderedDict from the collections module. However, for my current purpose there is no advantage to implementing this.

EDIT:
I have been mulling this over and have come up with the following line to convert the dictionary back to an integer:

int("{refuse_password_change}{password_store_cleartext}{lockout_admins}{password_no_clear_change}{password_no_anon_change}{password_complex}".format(**pwd_properties).replace('True', '1').replace('False', '0'), 2)

This is probably horribly inefficient due to the string replacement, but it works.
It takes advantage of unpacking and referencing keyword arguments to form a string with the values in the correct order, then replaces the strings ‘True’ and ‘False’ with ‘1’ and ‘0’ respectively before using the int() function to convert the string base 2 number (i.e. binary) to a decimal number.

It might be more efficient to avoid string replacement, like this:

int("{refuse_password_change}{password_store_cleartext}{lockout_admins}{password_no_clear_change}{password_no_anon_change}{password_complex}".format(**{key: '1' if value else '0' for key, value in pwd_properties.items()}), 2)

This recreates the dictionary with ‘1’ and ‘0’ values by testing each entry for truth, then takes advantage of unpacking and keyword arguments to get the bits in the correct order, before converting to a decimal number using int().
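
A further variation, sketched here as a possibility rather than something from my original code, avoids strings altogether by taking the bit order from the list of keys and building the number with shifts. The name encode_bitfield is hypothetical.

```python
def encode_bitfield(flags, keys):
    """Rebuild the integer from a dictionary of booleans.

    keys are listed from the most significant bit to the least,
    so the order comes from the list and not from the dictionary.
    """
    value = 0
    for key in keys:
        # Make room for the next bit, then set it if the flag is truthy.
        value = (value << 1) | int(bool(flags[key]))
    return value


bitfield_keys = ['refuse_password_change', 'password_store_cleartext',
                 'lockout_admins', 'password_no_clear_change',
                 'password_no_anon_change', 'password_complex']
pwd_properties = {'password_store_cleartext': True,
                  'password_no_anon_change': False,
                  'lockout_admins': False,
                  'refuse_password_change': False,
                  'password_no_clear_change': False,
                  'password_complex': True}

print(encode_bitfield(pwd_properties, bitfield_keys))  # 17
```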

At some point I’ll time these two methods to find which is more efficient and update this post.
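
A timing comparison along those lines could be set up with the standard timeit module, roughly as below. The function names and the repeat count of 10000 are arbitrary choices of mine, and I have not included any figures because they will vary by machine and Python version.

```python
import timeit

pwd_properties = {'password_store_cleartext': True,
                  'password_no_anon_change': False,
                  'lockout_admins': False,
                  'refuse_password_change': False,
                  'password_no_clear_change': False,
                  'password_complex': True}

TEMPLATE = ("{refuse_password_change}{password_store_cleartext}"
            "{lockout_admins}{password_no_clear_change}"
            "{password_no_anon_change}{password_complex}")


def with_replace():
    # First method: format the booleans, then replace 'True' and 'False'.
    text = TEMPLATE.format(**pwd_properties)
    return int(text.replace('True', '1').replace('False', '0'), 2)


def with_rebuilt_dict():
    # Second method: rebuild the dictionary with '1' and '0' before formatting.
    bits = {key: '1' if value else '0' for key, value in pwd_properties.items()}
    return int(TEMPLATE.format(**bits), 2)


replace_seconds = timeit.timeit(with_replace, number=10000)
rebuilt_seconds = timeit.timeit(with_rebuilt_dict, number=10000)

print("replace: {:.4f}s, rebuilt dict: {:.4f}s".format(replace_seconds, rebuilt_seconds))
```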


As always, if you have any comments or suggestions please feel free to get in touch.