Monday, September 14, 2009

Bit Counting (Continued)

So here's the pattern. You're trying to populate a table so that each entry in the table contains the number of bits set in a one byte binary representation of that entry's index:

0 = 00000000 → 0 bits set
1 = 00000001 → 1 bit set
2 = 00000010 → 1 bit set
3 = 00000011 → 2 bits set
4 = 00000100 → 1 bit set
5 = 00000101 → 2 bits set
6 = 00000110 → 2 bits set
7 = 00000111 → 3 bits set
8 = 00001000 → 1 bit set
and so on.

So the first two are obviously 0 and 1. The next two are the same, but with the 00000010 bit set. Then the next four follow the same pattern but with the 00000100 bit set. The next eight follow the pattern of the previous eight, but with 00001000 set.

So to populate the table, you basically put 0 in the first entry, and then repeatedly copy all the previous entries, incrementing by 1.

Actually, there's a more concise way to write this I'll post later.

If I had thought of that more quickly, I might have gotten that job in a start-up that was later bought by a large software company. Who knows?

Is it any wonder I'm such a curmudgeon?

Friday, September 11, 2009

Bit Counting

A few years ago, I was on a job interview, and the interviewer asked me how to write a function to count the number of '1' bits in a byte. I replied that this sounded like an obvious table lookup problem. Just create the table with 256 entries, and use the byte value itself to index to the correct bit count.

That seemed to take him a little off guard, but he quickly asked how to populate the table. He was still trying to get me to answer his original question with the explanation of how to test each bit by shifting the byte successively and ANDing with 1. If the shifted byte ANDed with 1 was 1, then the low bit was on, and the bit count for that byte should be incremented.

That might be ok for counting bits in one or two bytes, but it's a pretty lame way to populate a table. I knew there had to be a better way. The lame approach would require 2048 tests and 1792 shifts. Obviously there's a pattern to the number of bits set in values ranging from 0 to 255, and I wanted to take advantage of that. (Yes, you could skip the shift and just AND each byte 8 times, once with each of 8 different masks. That's still 2048 ANDs and tests.)

Unfortunately, I was so distracted by this that I didn't answer the question right away, and I didn't get the job. Actually, I don't know if this was the deciding factor or not, but it bugged me. I always get my best answers on the way home from the interview, and this was a prime example of that.

Got it?