The code in post 16 will easily beat that, and it takes the same time no matter what the input.
I’ve only measured PBP by disassembly and then counting the assembler instruction time.
Bitwise rotation of each byte is one instruction; shifting and other bitwise operations that take an input value should be two assembler instructions.
But yes, using the same technique as post #16 expressed in PBP will beat the assembler in post #6.
Suppose the value you begin with starts at array location 0,
and your input is 16, so you need to rotate it 16 times:
Code:
00000000 00000000 00000000 00000001
^
arraylocation[0]
Divide the 16 by 8 and the result is 2. Add 2 to the array location index and bingo:

The reason your 4-byte array has values around it is so you can still read a full four-byte value after moving the index.
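To make the index trick concrete, here is a rough Python sketch of the byte-offset step only. It is not the code from post #16. The names `PAD`, `make_buffer` and `shift_left_by_bytes` are made up for illustration, and it assumes the operation is a left shift with zero padding, with the most significant byte at array location 0 as in the diagram.

```python
# Hypothetical layout: 4 zero bytes, the 4-byte value, then 4 more zero bytes.
PAD = 4

def make_buffer(value):
    """Place a 32-bit value, most significant byte first, between zero padding."""
    return bytes(PAD) + value.to_bytes(4, "big") + bytes(PAD)

def shift_left_by_bytes(value, bits):
    """Shift left by a multiple of 8 bits purely by moving the read index."""
    offset = bits // 8            # 16 // 8 == 2
    buf = make_buffer(value)
    start = PAD + offset          # arraylocation[0] sits at index PAD
    return int.from_bytes(buf[start:start + 4], "big")

print(hex(shift_left_by_bytes(1, 16)))  # 0x10000
```

No bits are actually moved here. The four bytes are just read from two positions further along, and the padding keeps that read inside the array.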

The only thing that complicates it is that your input won’t often be evenly divisible by 8.
The rest of the code shifts the byte by the modulus.
You could still easily beat the asm rotation with a single PBP divide with modulus,
but PBP might not beat the particular implementation of asm divide and modulus!
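Putting the byte offset and the modulus together, a minimal Python sketch might look like the following. Again, this is an illustration under assumptions and not the PBP from post #16: `PAD` and `shift_left` are hypothetical names, the operation is treated as a left shift with zero fill, and the input is assumed to be in the range 0 to 32.

```python
PAD = 5  # hypothetical: enough zero bytes for a 32-bit shift plus one look-ahead byte

def shift_left(value, bits):
    """Shift a 32-bit value left by 0..32 bits: byte offset first, then the remainder."""
    offset, rem = bits >> 3, bits & 7   # same results as bits // 8 and bits % 8
    buf = bytes(PAD) + value.to_bytes(4, "big") + bytes(PAD)
    start = PAD + offset
    out = bytearray(4)
    for i in range(4):
        # Each output byte takes its own bits shifted up by rem,
        # plus the top rem bits of the next byte in the array.
        out[i] = ((buf[start + i] << rem) | (buf[start + i + 1] >> (8 - rem))) & 0xFF
    return int.from_bytes(out, "big")

print(hex(shift_left(1, 20)))  # 0x100000
```

The loop always runs four times, so the amount of work does not depend on the input. Because the divisor is 8, the quotient and remainder can also be taken with a shift by 3 and a mask with 7 instead of a general divide.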
I edited images in there because I wasn’t getting fixed width for spaces.