It’s been nearly two months since my last post. I was seriously low on motivation. I thought I’d try changing the format a little back: add a little more personal narrative and write about things I’m currently reading instead of something I read a while back and must research again.
Let’s start with some bit hacks. I’m currently reading this book Hacker’s Delight. Make no mistake, you are probably never going to use this stuff in your professional code. The first law of programmer creativity is that “The cost of software maintenance increases with the square of the programmer’s creativity.”. But it’s a good exercise for your brain cells and it’s nice to know that you can understand your code down to the bit-level. So let’s do some exercises.
1. Isolate the rightmost set (1) bit i.e., keep the rightmost set bit and clear everything else. If the input is 0, do nothing. Ex: 01011010 => 00000010.
x & (-x)
So how does it work?
To understand it more intuitively, recognize that (-x) is the 2’s complement of x. Remember that one way to look at 2’s complement is: as you iterate over the bits from right-to-left, find the first set bit and invert everything to the left of it. So x & (-x) will have at most one set bit – the rightmost set bit. For the more observant, we use this expression for queries/updates in Fenwick tree/BIT.
2. Create a word with a single 1-bit at the position of the rightmost 0-bit. If the input is all 1’s, output 0. Ex: 01011011 => 00000100.
~x & (x + 1)
This is a little trickier. ~x is bitwise complement of x. (x + 1) sets the rightmost 0-bit and clears the 1’s on its right. So ~x & (x + 1) has at most one set bit – the rightmost 0-bit. If x has all 1’s, (x + 1) causes an overflow of the 1-bit and the result is all 0’s.
3. Create a word with a single 0-bit at the position of the rightmost 1-bit. If the input is all 0’s, output all 1’s. Ex: 01011000 => 11110111.
~x | (x – 1)
‘Recognize that (x – 1) clears the rightmost 1-bit and sets the 0’s on its right. If you compare it against ~x, both of them have a 0 at the position of the rightmost 1-bit in x but differ in every bit to the left of it and x-1 has all 1’s to the right of it. So ~x | (x – 1) has at most one clear bit – the rightmost 1-bit in x. If x has all 0’s, both ~x and x-1 are all 1’s so the result is also all 1’s.
4. Turn off the rightmost contiguous string of 1’s. Ex: 01011100 => 01000000.
((x & -x) + x) & x
That takes some wrapping your head around. We have already seen that (x & -x) isolates the rightmost 1-bit. If we add it to x, we would clear the rightmost contiguous string of 1’s but also set the bit to the immediate left of that string. We can clear that newly set bit by anding with x. If there was no bit to the left of the rightmost contiguous 1-string, there would be an overflow when we add x and we’ll still clear that string.
5. Find the next higher number containing the same number of set bits. Ex: 01011100 => 01100011.
a ← x & -x
b ← a + x
r ← b | (((x ⊕ b) >> 2) / a)
Holy shit! I know. I felt the same way. Let’s take a deep breadth and break it down into more manageable chunks.
a ← x & -x is the rightmost 1-bit isolated.
b ← a + x is x with rightmost 1-string cleared and the bit on the immediate left of it set.
x ⊕ b: x and b differ in only the bits in the rightmost 1-string in x and the bit on the immediate left of it, so x ⊕ b will have only these bits set.
Suppose n is the size of the rightmost 1-string in x. We should set the bit on the immediate left of it, clear this string and set the lowest n-1 bits. That’s the next higher number containing the same number of set bits. x ⊕ b has n+1 1’s. We can right-adjust it by dividing by a (a is a power of 2) and shifting it right two more times to get rid of the two extra set bits. This is what the expression (((x ⊕ b) >> 2) / a) computes. Oring it with b gives us the desired result. Remember b was x with rightmost 1-string cleared and the bit on the immediate left of it set. This expression has the lowest n-1 bits set and everything else cleared. Oring them together gives us what we wanted.
Pretty cool huh! What’s cool is that it actually has a real application. You can use bit strings to represent subsets. The actual members in the universe are stored in an array and the positions of set bits in a bit string are the indices of members in the subset represented by that bit string. If you wanted to iterate over all subsets, you could iterate from 0 to 2n – 1, where n is the size of the universe. Now if you wanted to iterate over all subsets of a given size, you can use a function that maps a number to the next higher number containing the same number of set bits, precisely what we built.
A few caveats:
- The case x = 0 has to be handled separately because it causes division by 0.
- The right shift and division operations should be unsigned, with no sign extension.
- If there is no next higher number i.e., x is the highest number you can get by rearranging its set bits, then this procedure reduces the number, which can be used as the stopping condition for iteration over subsets.
Looking forward to your ideas and suggestions in comments.