On 2019-03-28 10:21 a.m., Coda Highland wrote:


On Wed, Mar 27, 2019 at 9:06 PM Soni "They/Them" L. <fakedme@gmail.com> wrote:



    On 2019-03-27 10:10 p.m., Coda Highland wrote:
    > On Wed, Mar 27, 2019 at 6:47 PM Soni "They/Them" L. <fakedme@gmail.com> wrote:
    >
    >     I'm rolling with this weird idea. Is it possible to make a simple
    >     parser out of tables? I tried some stuff but ran into a few issues.
    >     I don't wanna use LPeg or anything more complicated than
    >     string.find(foo, bar, 1, true).
    >
    >     e.g.:
    >
    >     local ltk = {} -- lua_tokens
    >     ltk.string = {}
    >     ltk['"'] = "string"
    >     ltk["'"] = "string" -- how to handle end of string correctly?
    >     ltk.string['\\'] = "escape"
    >     ltk.string.escape = {}
    >     ltk.string.escape['z'] = {}
    >     ltk.string.escape['z'].next = ltk.string.escape['z']
    >     for i, v in ipairs({" ", "\t", "\n", etc}) do
    >        ltk.string.escape['z'][v] = "next"
    >     end
    >     ltk.string.escape['z'][''] = ltk.string -- not sure if there'd be a better way to do this
    >     -- etc
    >
    >
    > Not a bad start for someone who's never done this formally before.
    >
    > There are a zillion different ways to define a parser. The Dragon Book
    > recommendation is well-given.
    >
    > I think the insight you're missing is that the table really represents
    > the structure of a state machine. Your top-level table shouldn't
    > contain things like ["'"] = "string" directly; rather, you'd want
    > something like ltk.base["'"]. Then you wouldn't have
    > ltk.string.escape, you'd just have ltk.string and ltk.escape. The
    > nesting doesn't belong in the table itself; instead, your parser will
    > maintain a stack of states, and you push a state when you start into a
    > parsing rule and you pop it off when you've finished that rule.
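
Concretely, I guess that flat-table-plus-stack shape would be something like
this (untested sketch; the action format and the driver loop are just my
guess at what you mean):

local ltk = {}
ltk.base = {}
ltk.string = {}
ltk.escape = {}

-- each state maps a one-character lookahead to an action:
-- { push = <state> } enters a sub-rule, { pop = true } finishes one,
-- and the empty-string key is the state's default action
ltk.base['"'] = { push = ltk.string }
ltk.base["'"] = { push = ltk.string }
ltk.string['\\'] = { push = ltk.escape }
ltk.string['"'] = { pop = true }
ltk.string["'"] = { pop = true }
ltk.escape[''] = { pop = true }  -- any escaped char: back to the string
-- note: this forgets which quote opened the string, so ' would close a
-- "-string too; you'd need two string states (or to push the quote char
-- alongside the state) to get that right

local function scan(input)
  local stack = { ltk.base }         -- stack of active states
  for i = 1, #input do
    local c = input:sub(i, i)
    local state = stack[#stack]
    local action = state[c] or state['']
    if action and action.push then
      stack[#stack + 1] = action.push
    elseif action and action.pop then
      stack[#stack] = nil
    end
    -- no action: stay in the current state and consume the char
  end
  return #stack == 1                 -- true if every rule got closed
end

The stack is what replaces the self-referential nesting: which rule you're
inside lives on the stack, not in the table.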

    Yeah, but then I wouldn't have self-referential tables. Besides, I wasn't
    planning on having a stack. FSMs (same class as regex) don't have a stack.


FSMs can't handle parentheses. String escapes are about the limit of what they can manage.

FSMs can't handle arbitrarily deep nesting. But you can very much create an FSM with a fixed nesting depth limit. It's just gonna be... a *little* big...
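
e.g. you just stamp out one copy of the paren state per depth level, so the
depth lives in which state you're in rather than on a stack (the limit and
the names here are made up):

-- one paren state per depth level; still a plain FSM, no runtime stack
local MAX_DEPTH = 8                  -- made-up limit
local paren = {}
for d = 0, MAX_DEPTH do
  paren[d] = {}
end
for d = 0, MAX_DEPTH do
  if d < MAX_DEPTH then
    paren[d]['('] = paren[d + 1]     -- go one level deeper
  end                                -- at MAX_DEPTH, '(' is just an error
  if d > 0 then
    paren[d][')'] = paren[d - 1]     -- come back up one level
  end
end
-- anything else stays in paren[d]; only paren[0] is a valid place to
-- stop, and the number of states grows with MAX_DEPTH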

    >
    > The place where this is suddenly going to get complicated is if your
    > grammar contains any prefix ambiguities (e.g. "fork" vs "forward"),
    > because then you have to implement backtracking. For most simpler
    > grammars it's best to just break it into two passes -- one that makes
    > simple lexical tokens out of a string of characters, and one that
    > evaluates the syntax out of a list of lexical tokens.
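
In code, that two-pass split is roughly this (sketch only; the token shapes
are arbitrary, and it leans on string.match character classes just to keep
it short):

-- pass 1: characters -> flat list of tokens
local function lex(src)
  local tokens, i = {}, 1
  while i <= #src do
    local c = src:sub(i, i)
    if c:match('%s') then
      i = i + 1                                -- skip whitespace
    elseif c:match('%a') then
      local j = i
      while j <= #src and src:sub(j, j):match('%w') do j = j + 1 end
      tokens[#tokens + 1] = { type = 'name', value = src:sub(i, j - 1) }
      i = j
    else
      tokens[#tokens + 1] = { type = c }       -- single-char token
      i = i + 1
    end
  end
  return tokens
end

-- pass 2 then walks the token list instead of raw characters, so the
-- syntax rules never have to think about where a word starts or ends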

    "fork" vs "forward" is unambiguous and doesn't require backtracking.


Derp. That's my mistake. You're right, that one's good.

/s/ Adam