• Subject: Re: [ANN] lua-bint - Arbitrary precision integer arithmetic library in pure Lua
• From: Sean Conner <sean@...>
• Date: Fri, 10 Jul 2020 19:51:42 -0400

```It was thus said that the Great Eduardo Bart once stated:
> Counting number of bits in what context? The library doesn't have this
> function, I do shift in one bit in integer pow functions but it's not much
> with the intention of counting bits.
>
> What I would like to optimize more is the conversion from base 2 to base 10
> in the library, it is used when printing big numbers in base 10 as string.
> My `tobase` function calls too many times `udivmod` (unsigned integer
> division with modulus) for really large integers (8192 bits or so) and this
> make tostring very slow for big numbers. If you have any ideas how I could
> optimize that base conversion (or perhaps udivmod) I would appreciate it.

You want to as large of a division as possible.  On a 32-bit system, you
want to do a "64 bits divided by 32 bits" repeatedly, and on a 64-bit
system, you'll want to do a "128 bits divided by 64 bits" repeatedly [1].
For best results, you would need to hit assembly, but a reasonable C
compiler will probably produce decent results with code like:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <assert.h>

#define MAX 256

unsigned long xremainder(
unsigned long       dest[MAX],
unsigned long const src[MAX],
unsigned long       demon
)
{
unsigned long long carry = 0;
unsigned long long quo;
unsigned long long rem;
size_t             i;

for (i = 0 ; i < MAX ; i++)
{
carry <<= (sizeof(long) * CHAR_BIT);
carry  |= src[i];
quo     = carry / demon;
rem     = carry % demon;
dest[i] = quo;
carry   = rem;
}

return rem;
}

int main(void)
{
unsigned long  src[MAX];
unsigned long  dest[MAX];
unsigned long  zero[MAX];
char           buffer[BUFSIZ];
char          *p;

assert(sizeof(unsigned long long) > sizeof(unsigned long));

memset(zero,  0,sizeof(zero));
memset(src, 255,sizeof(src)); /* maximum number */

p    = &buffer[BUFSIZ];
*--p = '\0';

do
{
unsigned long rem = xremainder(dest,src,10);
assert(rem <= 9);
*--p     = rem + '0';
memcpy(src,dest,sizeof(src));
} while(memcmp(src,zero,sizeof(zero)));

puts(p);
return 0;
}

If you can somehow do "larger" than "bit-at-a-time" division, that should
speed up the code.

-spc

[1]	All the architectures I'm familiar with that have a DIV instruction
will allow one to divide a number twice the "word size" (32 bits on
a 16 bit machine, 64 bits on a 32 bit machine, 128 bits on a 64 bit
machine) by the word size.

Conversely, the MUL instruction will usually result in a result
twice the normal word size (even the 8-bit 6809, which has an 8-bit
MUL instruction will produce a 16-bit result).

```

• Follow-Ups:
• References: