Re: [ANN] bitlib release 22

lua-l archive


Subject: Re: [ANN] bitlib release 22
From: eugeny gladkih <john@...>
Date: Fri, 09 Nov 2007 20:39:57 +0300

>>>>> "MP" == Mike Pall <mikelu-0711@mike.de> writes:

 >>> Oh, and you realize that converting doubles to/from unsigned
 >>> integers is _dead slow_ on many platforms (but not necessarily so
 >>> for signed integers).
 >> 
 >> On which platforms?

 MP> $ cat conv.c
 MP> #include <stdint.h>
 MP> intmax_t d2i(double x) { return (intmax_t)x; }
 MP> double i2d(intmax_t x) { return (double)x; }
 MP> uintmax_t d2u(double x) { return (uintmax_t)x; }
 MP> double u2d(uintmax_t x) { return (double)x; }

 MP> Check the assembler output with:
 MP>   cc -Os -fomit-frame-pointer -S -o - conv.c

 MP> On x86 (x87 FP):

 MP> d2i: basically fld + fistp, but rounding is set to truncation mode and back
 MP> i2d: fild qword [esp+4]      // One instruction!
 MP> d2u: call __fixunsdfdi       // Ouch!!
 MP> u2d: two cases for +/- and some bias tricks

 MP> On x64 (SSE FP):

 MP> d2i: cvttsd2si rax, xmm0     // One instruction!
 MP> i2d: cvtsi2sd  xmm0, rax     // One instruction!
 MP> d2u: two cases for +/- with FP comparison and some bias tricks
 MP> u2d: two cases for +/- and some bit shifting tricks

 MP> Don't have a PPC host at the moment, but AFAIR the situation was
 MP> pretty similar.

SPARC V9, for comparison:

d2i:
        add     %sp, -208, %sp
        fdtox   %f0, %f0
        std     %f0, [%sp+2231]
        ldx     [%sp+2231], %o0
        jmp     %o7+8
         sub    %sp, -208, %sp

i2d:
        add     %sp, -208, %sp
        stx     %o0, [%sp+2231]
        ldd     [%sp+2231], %f8
        sub     %sp, -208, %sp
        jmp     %o7+8
         fxtod  %f8, %f0

d2u:
        .register       %g2, #scratch
        add     %sp, -208, %sp
        mov     543, %g1
        sllx    %g1, 53, %g1
        stx     %g1, [%sp+2231]
        ldd     [%sp+2231], %f8
        fcmped  %fcc0, %f0, %f8
        fbge,a,pt %fcc0, .LL6
         fsubd  %f0, %f8, %f8
        fdtox   %f0, %f0
        std     %f0, [%sp+2239]
        ba,pt   %xcc, .LL7
         ldx    [%sp+2239], %o0
.LL6:
        mov     1, %g1
        fdtox   %f8, %f8
        sllx    %g1, 63, %g1
        std     %f8, [%sp+2239]
        ldx     [%sp+2239], %g2
        xor     %g2, %g1, %o0
.LL7:
        jmp     %o7+8
         sub    %sp, -208, %sp

u2d:
        brlz,pn %o0, .LL10
         add    %sp, -208, %sp
        stx     %o0, [%sp+2231]
        ldd     [%sp+2231], %f8
        ba,pt   %xcc, .LL11
         fxtod  %f8, %f0
.LL10:
        and     %o0, 1, %g1
        srlx    %o0, 1, %g2
        or      %g2, %g1, %g2
        stx     %g2, [%sp+2231]
        ldd     [%sp+2231], %f10
        fxtod   %f10, %f8
        faddd   %f8, %f8, %f0
.LL11:
        jmp     %o7+8
         sub    %sp, -208, %sp


Too complicated, I guess.
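
For readers who do not read SPARC assembler, here is a rough C sketch of what the d2u and u2d listings above appear to be doing: the unsigned conversions are built from the signed ones (fdtox/fxtod) plus a range check. The names d2u_sketch, u2d_sketch and TWO_POW_63 are made up for illustration and are not part of bitlib or any compiler runtime, and the assumed input range for d2u_sketch is my own reading of what C leaves defined.

```c
#include <assert.h>
#include <stdint.h>

/* 2^63 as a double: the constant the d2u listing builds with
   "mov 543" followed by "sllx 53" (bit pattern 0x43E0000000000000). */
#define TWO_POW_63 9223372036854775808.0

/* Hypothetical name: double -> uint64_t using only signed conversions. */
uint64_t d2u_sketch(double x)
{
    /* Assumed domain: 0 <= x < 2^64; C leaves other inputs undefined. */
    assert(x >= 0.0 && x < 2.0 * TWO_POW_63);

    if (x < TWO_POW_63)
        return (uint64_t)(int64_t)x;        /* fits the signed path */

    /* Bias down by 2^63, convert as signed, then set the top bit again
       (the fsubd / fdtox / xor sequence at .LL6). */
    return (uint64_t)(int64_t)(x - TWO_POW_63) ^ (UINT64_C(1) << 63);
}

/* Hypothetical name: uint64_t -> double using only signed conversions. */
double u2d_sketch(uint64_t x)
{
    if ((x >> 63) == 0)
        return (double)(int64_t)x;          /* top bit clear: signed path */

    /* Top bit set: halve, keeping the dropped low bit as a sticky bit so
       rounding still comes out right, convert, then double
       (the srlx / and / or / fxtod / faddd sequence at .LL10). */
    uint64_t half = (x >> 1) | (x & 1);
    double d = (double)(int64_t)half;
    return d + d;
}
```

For in-range inputs, d2u_sketch(x) should agree with (uintmax_t)x and u2d_sketch(x) with (double)x; the compare and branch in each function is the extra cost the signed versions avoid.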

-- 
Yours sincerely, Eugeny.
Doctor Web, Ltd. http://www.drweb.com

References:
- [ANN] bitlib release 22, Reuben Thomas
- Re: [ANN] bitlib release 22, Mike Pall
- Re: [ANN] bitlib release 22, Matt Campbell
- Re: [ANN] bitlib release 22, Mike Pall
