- Subject: Re: problem with string.format %d and very large integers
- From: Lorenzo Donati <lorenzodonatibz@...>
- Date: Fri, 05 Aug 2011 01:21:03 +0200
On 04/08/2011 18.23, Roberto Ierusalimschy wrote:
What are your thoughts about using int64? Not ANSI? Not performant?
Both. Moreover, it does not solve all cases. For instance, fseek is
limited to long in C.
Ouch. That's bad!
If the problem is widespread, I wouldn't argue that 'format' should
get special privileges. With table indices, you did such a beautiful
job providing "general semantics with good performance for ints" that
I thought this might be your goal everywhere.
As pointed out in the case of fseek, the big problem is the interface
with C. And documenting everything would be messy; the limits are not
even the same for every function (e.g., int x long x size_t).
Although it was me who came up with the "widespread" argument, I guess it
is not that widespread in pragmatic terms. As Dirk pointed out, it is
rare (or impossible, in Lua) to have structures that large, so in real
programs a real index will seldom (or never) be larger than 2^31 or
whatever the limit is. It may be more useful to collect real scenarios
where this problem actually happens (e.g., format, which applies to
generic numbers, not only indices or sizes) and try to solve them
(e.g., with error messages).
After a more careful browsing of the Lua functions, I came to the
conclusion that most problems are probably just theoretical, because it
will be impossible in the near future to have 2 GB+ of data in main
memory in a real program.
The only two concrete bottlenecks are format %d and seek. Well, for the
former, since high performance is probably not needed, an added check or
some smart workaround (at the cost of some VM cycles) seems acceptable.
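To make the "added check" for format %d concrete, here is a minimal
sketch in C. It is not how Lua implements string.format; the helper name
format_int_checked is made up for illustration, and it assumes the number
arrives as a double and is printed through a long. The idea is only to
reject values that a long cannot hold instead of converting them blindly.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper (not part of Lua): write n into buf as a decimal
 * integer. Returns 0 on success, -1 if n does not fit in a long.
 * The upper bound is written as -(double)LONG_MIN because that value
 * (2^31 or 2^63) is exactly representable as a double, whereas
 * (double)LONG_MAX may round up when long is 64 bits wide. */
static int format_int_checked(char *buf, size_t size, double n) {
    /* NaN fails both comparisons, so it is rejected as well. */
    if (!(n >= (double)LONG_MIN && n < -(double)LONG_MIN))
        return -1;
    /* The cast is now safe; it truncates toward zero. */
    snprintf(buf, size, "%ld", (long)n);
    return 0;
}
```

A caller would raise an error (or fall back to a floating-point format)
whenever the helper returns -1. The cost is two comparisons per
conversion.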
The seek problem seems, IMHO, more fundamental. As you rightfully point
out, it is not Lua's fault. OTOH sticking to ISO C here severely
hampers Lua's ability to handle very large datasets on disk
piecewise. I cannot provide a concrete use case for lack of direct
experience, but it doesn't seem unlikely that someone would want to use
Lua for such tasks.
Would it be acceptable to include a workaround so that one could enable
support for such large files (someone mentioned fseek64) just by using an
option in luaconf.h, say #define USE_FSEEK64?
If it is feasible with little effort, I think it would be very useful,
since disk sizes keep growing and applications needing to sift through
huge amounts of data are ever more widespread. I don't see why, e.g.,
someone couldn't embed Lua in an application for video post-processing,
which usually works with such huge files. Or disk imaging software, just
to provide another possible scenario.
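
A rough sketch of what such a luaconf.h-style switch could look like
follows. It assumes a POSIX system, where the 64-bit-capable calls are
fseeko/ftello with off_t rather than a function literally named fseek64.
The names l_fseek, l_ftell, l_seeknum and seek_and_tell are illustrative
only, and USE_FSEEK64 is the hypothetical option from the question above.
Without the option the code falls back to plain ISO C fseek/ftell.

```c
/* Opt in by defining USE_FSEEK64 before this point (e.g. -DUSE_FSEEK64). */
#if defined(USE_FSEEK64)
/* Assumption: glibc-style feature macro; it must precede system headers. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#endif

#include <stdio.h>

#if defined(USE_FSEEK64)
#include <sys/types.h>                 /* off_t (POSIX) */
#define l_fseek(f, o, w) fseeko(f, o, w)
#define l_ftell(f)       ftello(f)
typedef off_t l_seeknum;
#else
/* ISO C fallback: offsets are limited to the range of long. */
#define l_fseek(f, o, w) fseek(f, o, w)
#define l_ftell(f)       ftell(f)
typedef long l_seeknum;
#endif

/* Hypothetical helper: seek to an absolute offset and report the
 * resulting position through *pos. Returns 0 on success, -1 on failure. */
static int seek_and_tell(FILE *f, l_seeknum offset, l_seeknum *pos) {
    if (l_fseek(f, offset, SEEK_SET) != 0)
        return -1;
    *pos = l_ftell(f);
    return (*pos == (l_seeknum)-1) ? -1 : 0;
}
```

The point of routing everything through two macros and one typedef is
that the rest of the I/O code stays unchanged whichever variant is
selected; only the offset type gets wider.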