lua-users home
lua-l archive


If it 'only' means the FPU computes things differently internally, why does Lua start crashing as soon as DX sets this flag to single-precision? :-)

Using double precision with DX means the driver will be setting and resetting the flag constantly at run time, whereas this can be avoided simply by using single precision at compile time. You make a good point that DirectX itself would probably be fine on modern CPUs if it didn't touch the flag at all; however, that is not our choice. It simply is as it is.

A performance hit of 3.5% just to keep some FPU flag in the correct state, without even knowing whether it is actually used (SSE2), sounds like a very good reason not to use double precision to me.


P.S.: design-flaws -> check out DirectShow. ;-)

----- Original Message ----- From: "Andras Balogh" <>
To: "Lua list" <>
Sent: Thursday, March 16, 2006 4:51 PM
Subject: Re: ANN: LuaJIT 1.1.0 / Re: Lua x DirectX

Having the FPU operate in full-precision mode does not mean that you have to send doubles to the GPU; I'm using 32-bit floats for geometry too. The flag only controls the precision at which the FPU computes things internally. Besides, 580 FPS is a pathological test case; normal apps don't run at that rate. So the test is probably CPU-limited, in which case setting and resetting the FPU control word on every API call probably hurts a bit (and even then, 3.5% is nothing). If DirectX just didn't touch the flag, it would probably be fine. I don't think there's any speed difference internally; modern FPUs do most instructions in one cycle. Sure, if you do software transforms (God forbid rasterizing), then it _might_ give you an edge, but I just don't think it's worth the trouble. Besides, DirectX was designed by humans too; it's not without design flaws...


On Thu, 16 Mar 2006 07:35:18 -0700, Framework Studios: Hugo <> wrote:

Well, here is one number I found on Direct3D's speed with and without the FPU-preserve flag:

with: 560 fps
without: 580 fps

However, I think it is somewhat beside the point to 'prove' this with numbers, since DirectX has more or less already chosen single precision for us (for a good reason, I trust). It also seems logical for a 3D API to be faster when using floats instead of doubles, because twice as much data can be pushed to the GPU with the same bandwidth, or stored in the same amount of VRAM. Isn't this the same for OpenGL?

Looking at the performance of double vs. float on modern CPUs should be interesting, though. Are doubles faster, slower, or the same compared to floats on 32-bit and 64-bit CPU architectures? And what about the CPUs people are actually using on average at the moment? (To sell games we need to look at what is average on the market, not only at what is top-notch. :)