DISCOVERY

June 23rd, 2018

Working with Bit Fields

Data Structures

C

Java

I was recently reading a book on Java and read through a chapter on how to use bit fields. Bit Fields are a data structure that I never really understood completely (along with most of the bitwise operators), so I figured I would take some time to look at Bit Fields in detail.

Bit Field

A data structure that consists of one or many memory locations (bits). Each of these bits has a unique meaning defined by the programmer1. It is common practice to use an unsigned integer data type of a specified length for a bit field. For example, in C you can define a bit field as unsigned int bitField : 2. This bit field consists of two bits which can be turned on or off - each of which has a unique meaning defined by the programmer.

I decided to go back to C for my bit field implementation since the data structure is often used in situations where low memory consumption is imperative. While I am no master at C, I really enjoy playing around with the language and hope to understand it to a greater capacity some day.

I started my implementation by defining some bit masks along with a type definition representing a user. The last field in the User type definition - statusField - is a 3-bit field.

#define VALIDATED 01 #define SUBSCRIBED 02 #define ADMIN 04 typedef struct { char first[31]; char last[31]; unsigned int statusField : 3; } User;

The preprocessor definitions for this code are three bit masks which can be applied to the bit field. Every time a macro such as VALIDATED is used in the code, its occurrence is replaced with the definitions replacement text (ex. VALIDATED is replaced with 01)2.

Bit Mask

A bit mask is a series of bits used to perform operations on another series of bits. A mask is commonly used to check for the existence of certain bits or remove/add certain bits to a value3, 4. For example, let’s say we have a series of bits 010. If we wanted to toggle on the first bit in the previous sequence, we can use the bit mask 001 and perform a bitwise OR operation. The result of 010 | 001 is 011, so the first bit was successfully toggled on by the bit mask.

The typedef struct code defines a new data type that has multiple fields - a users first and last name along with a bit field containing their status5. The bit field represents a users privileges/status on an online website.

Next I wrote some code to declare a new user type and set its bit field to 000.

int main() { User andy; strcpy(andy.first, "Andrew"); strcpy(andy.last, "Jarombek"); // Set the status as 000 andy.statusField = 0; ... }

Now comes the interesting part. I set up some functions to alter bits in the bit field using bit masks.

/** * Add a status to the bit field on a user type * @param user - a User type * @param mask - the status to add to the bit field */ void addStatus(User *user, unsigned int mask) { user->statusField |= mask; } /** * Remove a status from the bit field on a user type * @param user - a User type * @param mask - the status to add to the bit field */ void removeStatus(User *user, unsigned int mask) { user->statusField &= ~mask; } /** * Check to see if a status is in the bit field * @param bitField - a bit field to search through * @param mask - a bit mask to bitwise 'and' with the bit field * @return 0 if the mask doesn't exist in the bit field, an integer >= 1 otherwise */ int containsStatus(unsigned int bitField, unsigned int mask) { return bitField & mask; }

To turn on bits in a bit field, the bitwise OR operation is performed between the bit field and the mask. For example, 000 | 010 results in the middle bit being turned on - 010.

To turn off bits in the bit field, the bitwise AND and NOT operators are used. For example, 011 & ~010 turns off the middle bit, resulting in 001.

To check if a bit is turned on in a bit field, a simple bitwise AND is used between the bit field and the mask. If the value resulting from the bitwise AND is greater than 0, the bit is turned on in the field. Otherwise, its turned off. For example, 011 & 010 checks if the middle bit is turned on. Since the result is 010 or 2, we know that the bit is turned on in the bit field.

The bit masks are the preprocessor macros VALIDATED, SUBSCRIBED, and ADMIN. They are used in the following code to turn on bits, turn off bits, and check for existence of bits in the bit field.

... // Set the status as 000 andy.statusField = 0; printUser(andy); // Validate the user, change the status from 000 to 001 addStatus(&andy, VALIDATED); printUser(andy); // Give the user admin rights, change the status from 001 to 101 addStatus(&andy, ADMIN); printUser(andy); // Take away the admin rights to the user, change the status from 101 to 001 removeStatus(&andy, ADMIN); printUser(andy); // Give the subscribed status, change the status from 001 to 011 addStatus(&andy, SUBSCRIBED); printUser(andy); if (containsStatus(andy.statusField, VALIDATED)) { printf("The User IS Validated \n"); } else { printf("The User IS NOT Validated \n"); } if (containsStatus(andy.statusField, SUBSCRIBED)) { printf("The User IS Subscribed \n"); } else { printf("The User IS NOT Subscribed \n"); } if (containsStatus(andy.statusField, ADMIN)) { printf("The User IS an Admin \n"); } else { printf("The User IS NOT an Admin \n"); }
Andrew, Jarombek, 0 Andrew, Jarombek, 1 Andrew, Jarombek, 5 Andrew, Jarombek, 1 Andrew, Jarombek, 3 The User IS Validated The User IS Subscribed The User IS NOT an Admin

The full code for the C implementation can be found here.

Bit fields can also be used in Java. The following code for a User class defines static final variables for the bit masks. Note that in Java all integers are signed and 32 bits long, so you have 31 bits to play with in a bit field data structure.

public final class User { /* Flags for different statuses */ static final int VALIDATED = 1; static final int SUBSCRIBED = 2; static final int ADMIN = 4; private int statusBitField; private String first; private String last; // Package-private constructor for the user. The bit field is defaulted to a value of 0. User(String first, String last) { statusBitField = 0; this.first = first; this.last = last; } // private constructor that takes in a bit field. private User(String first, String last, int bitField) { this.statusBitField = bitField; this.first = first; this.last = last; } // Add a status to the users bit field static User addStatus(User user, int status) { int updatedStatusBitField = user.statusBitField | status; return new User(user.first, user.last, updatedStatusBitField); } // Remove a status from the users bit field static User removeStatus(User user, int status) { int updatedStatusBitField = user.statusBitField & ~status; return new User(user.first, user.last, updatedStatusBitField); } // Perform an action (supplied by the second argument) when given a boolean // representing the existence of a status for the user. void containsStatus(int mask, Consumer<Boolean> action) { action.accept(containsStatus(mask)); } // Check for the existance of a status in a bit field boolean containsStatus(int mask) { return (statusBitField & mask) > 0; } }

The following code performs operations on the user class, similar to the C implementation.

// Base user has a bit field of value 000 User user = new User("Andy", "Jarombek"); // Add the validated flag to the bit field: 000 -> 001 User validatedUser = User.addStatus(user, User.VALIDATED); // Add the admin flag to the bit field: 001 -> 101 User adminUser = User.addStatus(validatedUser, User.ADMIN); // Remove the admin flag from the bit field: 101 -> 001 User revokedAdminUser = User.removeStatus(adminUser, User.ADMIN); // Add the subscribed flag to the bit field: 001 -> 011 User subscribedUser = User.addStatus(revokedAdminUser, User.SUBSCRIBED); System.out.println(user); System.out.println(validatedUser); System.out.println(adminUser); System.out.println(revokedAdminUser); System.out.println(subscribedUser); // Check to see if the user has different statuses based on the bit field subscribedUser.containsStatus(User.VALIDATED, (bool) -> System.out.println("User is Validated: " + bool)); subscribedUser.containsStatus(User.SUBSCRIBED, (bool) -> System.out.println("User is Subscribed: " + bool)); subscribedUser.containsStatus(User.ADMIN, (bool) -> System.out.println("User is Admin: " + bool));
Andy, Jarombek, 0 Andy, Jarombek, 1 Andy, Jarombek, 5 Andy, Jarombek, 1 Andy, Jarombek, 3 User is Validated: true User is Subscribed: true User is Admin: false

The full code for the User class and Main class utilizing bit fields is on GitHub.

While you can use bit fields in Java like the previous implementation, using primitive int variables for bit fields goes against some of Java’s core principles. Using int primitives for bit fields is not type safe - which is often a crucial requirement for a Java data structure6. Luckily there is a type safe approach for bit fields in Java. This approach uses the EnumSet collection to store the fields. The fields themselves are represented as enums. This approach is not only type safe and more readable, but also has the same performance as the non-safe integer approach.

The code for the EnumSet approach is on GitHub in the EnumSetUser and EnumSetMain classes.

January 31st, 2019

It's also easy to create bit fields with enums in C#. You can check out the C# code on GitHub.

Researching bit fields was a really valuable exercise to gain further knowledge of bitwise operations and low level data structures. Expect more discoveries on Java and C in the future.

[1] "Bit field", https://en.wikipedia.org/wiki/Bit_field

[2] Brian W. Kernighan & Dennis M. Ritchie, The C Programming Language, 2nd ed (Upper Saddle River, NJ: Prentice Hall, 1988), 89

[3] "What is Bit Masking?", https://stackoverflow.com/a/10493604

[4] "What is a bitmask and a mask?", https://stackoverflow.com/a/31576303

[5] Kernighan., 146

[6] "Using Bit Flags and EnumSets in Java", https://eddmann.com/posts/using-bit-flags-and-enumsets-in-java/