I was recently reading a book on Java and read through a chapter on how to use bit fields. Bit Fields are a data structure that I never really understood completely (along with most of the bitwise operators), so I figured I would take some time to look at Bit Fields in detail.
A data structure that consists of one or many memory locations (bits). Each of these bits has a unique meaning defined by the programmer1. It is common practice to use an unsigned integer data type of a specified length for a bit field. For example, in C you can define a bit field as unsigned int bitField : 2. This bit field consists of two bits which can be turned on or off - each of which has a unique meaning defined by the programmer.
I decided to go back to C for my bit field implementation since the data structure is often used in situations where low memory consumption is imperative. While I am no master at C, I really enjoy playing around with the language and hope to understand it to a greater capacity some day.
I started my implementation by defining some bit masks along with a type definition representing a user. The last field in the User type definition - statusField - is a 3-bit field.
The preprocessor definitions for this code are three bit masks which can be applied to the bit field. Every time a macro such as VALIDATED is used in the code, its occurrence is replaced with the definitions replacement text (ex. VALIDATED is replaced with 01)2.
A bit mask is a series of bits used to perform operations on another series of bits. A mask is commonly used to check for the existence of certain bits or remove/add certain bits to a value3, 4. For example, let’s say we have a series of bits 010. If we wanted to toggle on the first bit in the previous sequence, we can use the bit mask 001 and perform a bitwise OR operation. The result of 010 | 001 is 011, so the first bit was successfully toggled on by the bit mask.
The typedef struct code defines a new data type that has multiple fields - a users first and last name along with a bit field containing their status5. The bit field represents a users privileges/status on an online website.
Next I wrote some code to declare a new user type and set its bit field to 000.
Now comes the interesting part. I set up some functions to alter bits in the bit field using bit masks.
To turn on bits in a bit field, the bitwise OR operation is performed between the bit field and the mask. For example, 000 | 010 results in the middle bit being turned on - 010.
To turn off bits in the bit field, the bitwise AND and NOT operators are used. For example, 011 & ~010 turns off the middle bit, resulting in 001.
To check if a bit is turned on in a bit field, a simple bitwise AND is used between the bit field and the mask. If the value resulting from the bitwise AND is greater than 0, the bit is turned on in the field. Otherwise, its turned off. For example, 011 & 010 checks if the middle bit is turned on. Since the result is 010 or 2, we know that the bit is turned on in the bit field.
The bit masks are the preprocessor macros VALIDATED, SUBSCRIBED, and ADMIN. They are used in the following code to turn on bits, turn off bits, and check for existence of bits in the bit field.
The full code for the C implementation can be found here.
Bit fields can also be used in Java. The following code for a User class defines static final variables for the bit masks. Note that in Java all integers are signed and 32 bits long, so you have 31 bits to play with in a bit field data structure.
The following code performs operations on the user class, similar to the C implementation.
The full code for the User class and Main class utilizing bit fields is on GitHub.
While you can use bit fields in Java like the previous implementation, using primitive int variables for bit fields goes against some of Java’s core principles. Using int primitives for bit fields is not type safe - which is often a crucial requirement for a Java data structure6. Luckily there is a type safe approach for bit fields in Java. This approach uses the EnumSet collection to store the fields. The fields themselves are represented as enums. This approach is not only type safe and more readable, but also has the same performance as the non-safe integer approach.