lua-users home
lua-l archive

On 30/09/16 05:38 PM, Egor Skriptunoff wrote:
On Fri, Sep 30, 2016 at 11:12 PM, Soni L. <fakedme@gmail.com> wrote:



            I'd suggest redefining the interpretation of a zero value for
            the "index_from" argument of string:sub.
            The documentation might read as follows:

            Positive indexes 1, 2, 3, ... count from the beginning of
            the string.
            Negative indexes -1, -2, -3, ... count backward from the
            end of the string.
            Index 0 has a special meaning, and its semantics differ for
            the "index_from" and "index_to" arguments:
            index_from = 0 means "the index after the last character
            of the string"
            index_to = 0 means "the index before the first character
            of the string"
            In other words, the "rule of maximum restriction" holds.
            There are two related Lua idioms: for any non-negative N,
            the prefix of length N is ":sub(1, N)" and the suffix of
            length N is ":sub(-N)".


        "s:sub(k,0)" and "s:sub(0,k)" are always empty strings for any
        s and k.
        A zero index always "kills" the substring.
        This property is quite understandable for beginners,
        descriptive and memorable.
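
For reference, the quoted proposal can be sketched as a wrapper around the stock string.sub. This is only an illustration: proposed_sub and suffix are hypothetical names, not part of any Lua release, and the sketch assumes the proposal means exactly "a zero in either position yields the empty string".

```lua
-- Hypothetical wrapper illustrating the proposed rule; not a standard function.
local function proposed_sub(s, i, j)
  j = j or -1                 -- same default as string.sub
  if i == 0 or j == 0 then    -- the proposed "zero kills the substring" rule
    return ""
  end
  return s:sub(i, j)
end

-- The suffix idiom from the proposal, written against the wrapper.
local function suffix(s, n)
  return proposed_sub(s, -n)  -- for n == 0, -n is still 0
end

print(("hello"):sub(0))          -- stock behaviour: 0 is treated as 1, so "hello"
print(proposed_sub("hello", 0))  -- empty string under the proposed rule
print(suffix("hello", 2))        -- "lo"
print(suffix("hello", 0))        -- empty string, i.e. a suffix of length 0
```

The last line is the case the proposal seems to target: with the stock function, s:sub(-N) for N = 0 returns the whole string rather than an empty suffix.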


    Which is stupid if you consider all the things from numeric for,
    table.move, table.unpack, table.concat, etc. Remember, just
    because strings don't support index 0, doesn't mean they aren't
    consistent with tables. Even # is consistent between strings and
    tables (however, strings don't have a concept of nil).

What consistency are you talking about?
Unlike the "string" library, the "table" library DOES NOT allow negative indexes to count backwards from the end of a table. For example, table.unpack(t, -2) DOES NOT return the last two elements of a table.
There is no "consistency" that may be broken.
And there is nothing to change in implementation of "table" library.
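
A short check of the table.unpack point above, assuming Lua 5.2 or later (with a fallback to the global unpack for 5.1). The table t is just an illustrative value.

```lua
local unpack = table.unpack or unpack  -- Lua 5.1 keeps unpack as a global

local t = {10, 20, 30}

-- Strings: a negative start counts from the end.
print(("abc"):sub(-2))                 -- "bc"

-- Tables: -2 is taken literally, so this yields t[-2] .. t[3],
-- i.e. three nils followed by 10, 20, 30.
print(select("#", unpack(t, -2)))      -- 6

-- Getting the last two elements needs explicit arithmetic on #t.
print(unpack(t, #t - 1))               -- 20 and 30
```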

There are many ways to look at this problem.

Technically speaking, a purist (e.g. me) might even argue that ("a"):sub(1,3) should result in "aaa". Or that ("abc"):sub(1,4) should result in "a".
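
For comparison, this is what the stock string.sub does with those out-of-range calls, as far as I can tell from the reference manual: out-of-range indices are simply clamped to the string.

```lua
-- Stock string.sub clamps the end index to the string length.
print(("a"):sub(1, 3))     -- "a"
print(("abc"):sub(1, 4))   -- "abc"

-- A start index past the end gives the empty string.
print(("abc"):sub(5, 9))   -- empty string
```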

Also, don't forget select(). Never forget select(). select() even takes the string "#" as a valid index.
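
A quick sketch of how select() treats its first argument, based on my reading of the 5.2/5.3 manual; the argument values are arbitrary.

```lua
-- "#" asks for the number of extra arguments.
print(select("#", "a", "b", "c"))   -- 3

-- A positive index returns that argument and everything after it.
print(select(2, "a", "b", "c"))     -- b and c

-- A negative index counts from the end of the argument list.
print(select(-1, "a", "b", "c"))    -- c

-- Index 0 is rejected with an "index out of range" error.
print(pcall(select, 0, "a", "b"))   -- false, followed by the error message
```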

But hey, select(), strings and files all deviate from tables, which are the standard. Note how select, strings and files have no meaningful concept of negative indices - they're exceptions. Thus, handling negative indices exceptionally makes sense. But you should still aim for consistency wherever possible.

I don't know what the Lua devs think. But it seems they do everything to keep things consistent, only applying exceptions in exceptional circumstances.

You might argue select(0) should return the length, instead of select("#"). Or even select("n") for consistency with table.pack(). You might argue table.pack() should have ["#"] as the length, not ["n"] - for consistency with select().
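
The two length conventions side by side, assuming Lua 5.2 or later for table.pack (the fallback is a rough equivalent for 5.1). The name packed is illustrative.

```lua
-- table.pack exists from Lua 5.2; this fallback approximates it on 5.1.
local pack = table.pack or function(...)
  return { n = select("#", ...), ... }
end

local packed = pack(1, nil, 3)

print(packed.n)                  -- 3: the count is stored under the key "n"
print(select("#", 1, nil, 3))    -- 3: the same count, requested with "#"
print(packed["#"])               -- nil: there is no "#" key in the packed table
```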

I'd even go as far as saying str:sub(0,0) should return the full string, unmodified. But that'd be only to fully oppose your argument.

Also, there's a fine line between a purist and a... someone who preaches consistency. And idk where it is.

    And don't forget, it'd also require a change to string.find,
    string.gsub, string.gmatch, string.match, string.byte, etc
    basically anything that takes an index.

Yes, "string.byte" should be modified the same way as "string.sub", as it uses the same rules for specifying an interval of indexes. But there is absolutely nothing to change in the implementations of string.find, string.gsub, string.gmatch and string.match.


string.find, string.gsub, string.gmatch and string.match all take a starting index, which can be negative.
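
A small sketch of the starting index ("init") in practice. As far as I can tell from the manuals, string.find and string.match accept it in 5.3, string.gmatch only gained it in 5.4, and the optional fourth argument of string.gsub is a substitution count rather than an index, so only find and match are shown. The sample strings are arbitrary.

```lua
local s = "hello world"

-- Plain find from the start: the first "o" is at position 5.
print(s:find("o", 1, true))            -- 5 5

-- A negative init counts from the end: -4 starts at position 8.
print(s:find("o", -4, true))           -- 8 8

-- An init of 0 is currently treated like 1.
print(s:find("h", 0, true))            -- 1 1

-- string.match takes the same kind of init.
print(("key=value"):match("%w+", -5))  -- value
```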

PS: My personal favorite would be to just error() if the starting index was 0. Would help catch tons of off-by-ones.
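
A minimal sketch of what that could look like as a wrapper. strict_sub is a hypothetical name and the error message is made up; the stock string.sub never raises for index 0.

```lua
-- Hypothetical wrapper: raises instead of silently accepting a 0 start index.
local function strict_sub(s, i, j)
  if i == 0 then
    error("string index 0 is invalid (strings are 1-based)", 2)
  end
  return s:sub(i, j)
end

print(strict_sub("hello", 1, 2))       -- "he"
print(strict_sub("hello", -2))         -- "lo"
print(pcall(strict_sub, "hello", 0))   -- false, followed by the error message
```

A loop that accidentally starts at 0 instead of 1 would then fail on its first iteration instead of quietly returning a substring.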

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.