
It was thus said that the Great Eduardo Bart once stated:
> Counting number of bits in what context? The library doesn't have this
> function, I do shift in one bit in the integer pow functions, but not
> with the intention of counting bits.
>
> What I would like to optimize more is the conversion from base 2 to base 10
> in the library, it is used when printing big numbers in base 10 as string.
> My `tobase` function calls `udivmod` (unsigned integer division with
> modulus) too many times for really large integers (8192 bits or so), and
> this makes tostring very slow for big numbers. If you have any ideas how
> I could optimize that base conversion (or perhaps udivmod) I would
> appreciate it.
You want to do as large a division as possible.  On a 32-bit system, you
want to do a "64 bits divided by 32 bits" division repeatedly, and on a
64-bit system, you'll want to do a "128 bits divided by 64 bits" division
repeatedly [1].  For best results you would need to hit assembly, but a
reasonable C compiler will probably produce decent results with code like:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <assert.h>

#define MAX 256

unsigned long xremainder(
        unsigned long       dest[MAX],
        unsigned long const src[MAX],
        unsigned long       denom
)
{
  unsigned long long carry = 0;
  unsigned long long quo;
  unsigned long long rem   = 0;
  size_t             i;

  for (i = 0 ; i < MAX ; i++)
  {
    carry <<= (sizeof(unsigned long) * CHAR_BIT);
    carry  |= src[i];   /* bring in the next limb, most significant first */
    quo     = carry / denom;
    rem     = carry % denom;
    dest[i] = quo;
    carry   = rem;
  }
  return rem;
}

int main(void)
{
  unsigned long src[MAX];
  unsigned long dest[MAX];
  unsigned long zero[MAX];
  char          buffer[BUFSIZ];
  char         *p;

  assert(sizeof(unsigned long long) > sizeof(unsigned long));
  memset(zero,  0,sizeof(zero));
  memset(src, 255,sizeof(src)); /* maximum number */

  p  = &buffer[BUFSIZ - 1];
  *p = '\0';

  do
  {
    unsigned long rem = xremainder(dest,src,10);
    assert(rem <= 9);
    *--p = rem + '0';
    memcpy(src,dest,sizeof(src));
  } while(memcmp(src,zero,sizeof(zero)) != 0);

  puts(p);
  return 0;
}
If you can somehow do "larger" than "bit-at-a-time" division, that should
speed up the code.
spc
[1] All the architectures I'm familiar with that have a DIV instruction
    will allow one to divide a number twice the "word size" (32 bits on
    a 16-bit machine, 64 bits on a 32-bit machine, 128 bits on a 64-bit
    machine) by the word size.

    Conversely, the MUL instruction will usually produce a result twice
    the normal word size (even the 8-bit 6809, which has an 8-bit MUL
    instruction, will produce a 16-bit result).