On 04/08/2011 18.23, Roberto Ierusalimschy wrote:
What are your thoughts about using int64? Not ANSI? Not performant?
Both. Moreover, it does not solve all cases. For instance, fseek is limited to long in C.
Ouch. That's bad!
If the problem is widespread, I wouldn't argue that 'format' should get special privileges. With table indices, you did such a beautiful job providing "general semantics with good performance for ints" that I thought this might be your goal everywhere.
As pointed out in the case of fseek, the big problem is the interface with C. And documenting everything would be messy; the limits are not even the same for every function (e.g., int vs. long vs. size_t).
Fair point.
Although it was me who came up with the "widespread" argument, I guess it is not that widespread in pragmatic terms. As Dirk pointed out, it is rare (or impossible, in Lua) to have structures that large, so in real programs an index will seldom (or never) be larger than 2^31 or whatever the limit is. It may be more useful to collect real scenarios where this problem actually happens (e.g., format, which applies to generic numbers, not only indices or sizes) and try to solve them (e.g., with error messages).
After browsing the Lua functions more carefully, I came to the conclusion that most problems are probably just theoretical, since it will remain practically impossible in the near future to have 2 GB+ of data in main memory in a real program.
The only two concrete bottlenecks are format %d and seek. Well, for the former, since high performance is probably not needed, an added check or some smart workaround (at the cost of some VM cycles) seems acceptable.
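To make the "added check" idea concrete, here is a minimal sketch in C, assuming Lua numbers are doubles and that the integer conversion goes through long. The names fits_long and format_integer are hypothetical, made up for illustration; this is not how Lua's format is actually implemented.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper (not part of Lua): returns 1 if x can be converted
   to long without overflow, 0 otherwise. NaN fails both comparisons. */
static int fits_long(double x) {
  /* -(double)LONG_MIN is exactly 2^(N-1); (double)LONG_MAX may round up
     to that same value, so it is not used as the upper bound. */
  return x >= (double)LONG_MIN && x < -(double)LONG_MIN;
}

/* Hypothetical wrapper: formats x with "%ld" when it is in range.
   Returns the number of characters written, or -1 when x is out of range,
   so the caller can raise an error instead of casting blindly. */
static int format_integer(char *buf, size_t size, double x) {
  if (!fits_long(x))
    return -1;
  return snprintf(buf, size, "%ld", (long)x);
}
```

The cost is two floating-point comparisons per call, which should be negligible next to the formatting itself.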
The seek problem seems, IMHO, more fundamental. As you rightly point out, it is not Lua's fault. OTOH sticking to ISO C here severely hampers Lua's ability to handle very large datasets on disk piecewise. I cannot provide a concrete use case for lack of direct experience, but it doesn't seem unlikely that someone would want to use Lua for such tasks.
Would it be acceptable to include a workaround so that one could enable support for such large files (someone mentioned fseek64) just by using an option in luaconf.h, say #define USE_FSEEK64?
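A rough sketch of what such a switch could look like, assuming a POSIX system that provides fseeko with a 64-bit off_t, or the Microsoft C runtime with _fseeki64. Only USE_FSEEK64 comes from the question above; seek_offset, file_seek and seek_from_number are hypothetical names for illustration, not anything that exists in Lua.

```c
/* Option name proposed above; it would normally live in luaconf.h. */
#define USE_FSEEK64 1

/* Assumption: must be seen before any system header so that off_t is
   64 bits wide on POSIX systems where this matters. */
#if defined(USE_FSEEK64) && !defined(_FILE_OFFSET_BITS)
#define _FILE_OFFSET_BITS 64
#endif

#include <stdio.h>

#if defined(USE_FSEEK64) && defined(_WIN32)
/* Assumption: Microsoft C runtime. */
typedef __int64 seek_offset;
#define file_seek(f, o, w) _fseeki64(f, o, w)
#elif defined(USE_FSEEK64)
/* Assumption: POSIX fseeko and off_t are available. */
#include <sys/types.h>
typedef off_t seek_offset;
#define file_seek(f, o, w) fseeko(f, o, w)
#else
/* Plain ISO C fallback, limited to long. */
typedef long seek_offset;
#define file_seek(f, o, w) fseek(f, o, w)
#endif

/* Hypothetical helper: seeks using an offset given as a Lua-style number.
   Returns 0 on success. A real version would range-check pos first. */
static int seek_from_number(FILE *f, double pos, int whence) {
  return file_seek(f, (seek_offset)pos, whence);
}
```

With a double as the number type, offsets stay exact up to 2^53, which is far beyond what a long of 32 bits can address.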
If it is feasible with little effort, I think it would be very useful, since disk sizes keep growing and applications that need to sift through huge amounts of data are ever more widespread. I don't see why, e.g., someone couldn't embed Lua in an application for video post-processing, which usually works with such huge files. Or disk imaging software, just to provide another possible scenario.
-- Roberto
-- Lorenzo