- Subject: Re: [ANN] bitlib release 22
- From: eugeny gladkih <john@...>
- Date: Fri, 09 Nov 2007 20:39:57 +0300
>>>>> "MP" == Mike Pall <mikelu-0711@mike.de> writes:
>>> Oh, and you realize that converting doubles to/from unsigned
>>> integers is _dead slow_ on many platforms (but not necessarily so
>>> for signed integers).
>>
>> On which platforms?
MP> $ cat conv.c
MP> #include <stdint.h>
MP> intmax_t d2i(double x) { return (intmax_t)x; }
MP> double i2d(intmax_t x) { return (double)x; }
MP> uintmax_t d2u(double x) { return (uintmax_t)x; }
MP> double u2d(uintmax_t x) { return (double)x; }
MP> Check the assembler output with:
MP> cc -Os -fomit-frame-pointer -S -o - conv.c
MP> On x86 (x87 FP):
MP> d2i: basically fld + fistp, but rounding is set to truncation mode and back
MP> i2d: fild qword [esp+4] // One instruction!
MP> d2u: call __fixunsdfdi // Ouch!!
MP> u2d: two cases for +/- and some bias tricks
MP> On x64 (SSE FP):
MP> d2i: cvttsd2si rax, xmm0 // One instruction!
MP> i2d: cvtsi2sd xmm0, rax // One instruction!
MP> d2u: two cases for +/- with FP comparison and some bias tricks
MP> u2d: two cases for +/- and some bit shifting tricks
MP> Don't have a PPC host at the moment, but AFAIR the situation was
MP> pretty similar.
SPARC V9, for contrast:
d2i:
add %sp, -208, %sp
fdtox %f0, %f0
std %f0, [%sp+2231]
ldx [%sp+2231], %o0
jmp %o7+8
sub %sp, -208, %sp
i2d:
add %sp, -208, %sp
stx %o0, [%sp+2231]
ldd [%sp+2231], %f8
sub %sp, -208, %sp
jmp %o7+8
fxtod %f8, %f0
d2u:
.register %g2, #scratch
add %sp, -208, %sp
mov 543, %g1
sllx %g1, 53, %g1
stx %g1, [%sp+2231]
ldd [%sp+2231], %f8
fcmped %fcc0, %f0, %f8
fbge,a,pt %fcc0, .LL6
fsubd %f0, %f8, %f8
fdtox %f0, %f0
std %f0, [%sp+2239]
ba,pt %xcc, .LL7
ldx [%sp+2239], %o0
.LL6:
mov 1, %g1
fdtox %f8, %f8
sllx %g1, 63, %g1
std %f8, [%sp+2239]
ldx [%sp+2239], %g2
xor %g2, %g1, %o0
.LL7:
jmp %o7+8
sub %sp, -208, %sp
u2d:
brlz,pn %o0, .LL10
add %sp, -208, %sp
stx %o0, [%sp+2231]
ldd [%sp+2231], %f8
ba,pt %xcc, .LL11
fxtod %f8, %f0
.LL10:
and %o0, 1, %g1
srlx %o0, 1, %g2
or %g2, %g1, %g2
stx %g2, [%sp+2231]
ldd [%sp+2231], %f10
fxtod %f10, %f8
faddd %f8, %f8, %f0
.LL11:
jmp %o7+8
sub %sp, -208, %sp
Too complicated, I guess.
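
For readers who don't want to decode the listing: the d2u and u2d sequences above can be read back into C roughly as follows. This is only a sketch of what the generated code appears to do, not the compiler's own source. The names d2u_sketch and u2d_sketch are made up for illustration, and the code assumes 64-bit integers, IEEE doubles, and (for d2u_sketch) an input in the range 0 <= x < 2^64.

```c
#include <stdint.h>

/* double -> unsigned 64-bit, using only the signed conversion
 * (fdtox in the listing). Assumes 0 <= x < 2^64. */
uint64_t d2u_sketch(double x)
{
    /* 2^63; the listing builds this constant as 543 << 53 */
    const double two63 = 9223372036854775808.0;

    if (x >= two63) {
        /* Bias down into signed range, convert, then put the
         * top bit back (the xor with 1 << 63 at .LL6). */
        return (uint64_t)(int64_t)(x - two63) ^ (UINT64_C(1) << 63);
    }
    return (uint64_t)(int64_t)x;
}

/* unsigned 64-bit -> double, using only the signed conversion
 * (fxtod in the listing). */
double u2d_sketch(uint64_t x)
{
    if (x >> 63) {
        /* Top bit set: halve the value, keeping the lost low bit
         * as a sticky bit so rounding stays correct, convert as
         * signed, then double the result (the faddd at .LL10). */
        uint64_t half = (x >> 1) | (x & 1);
        double d = (double)(int64_t)half;
        return d + d;
    }
    return (double)(int64_t)x;
}
```

The signed versions (d2i, i2d) need none of this: they are a single conversion instruction plus the move between the integer and FP register files through the stack.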
--
Yours sincerely, Eugeny.
Doctor Web, Ltd. http://www.drweb.com