On 27/07/16 12:34 PM, Patrick Donnelly wrote:
Soni,

On Wed, Jul 27, 2016 at 10:27 AM, Soni L. <fakedme@gmail.com> wrote:
Sean Conner's incomplete solution uses lpeg.Carg(1) which doesn't work
recursively.

This is what I'm trying to parse: https://github.com/SoniEx2/MDXML

This is what I currently have:
[...]
Most people don't have time to grok your code. Can you give a smaller
example/explanation of what you're trying to do? It sounds like you
may want to use group and back captures to carry information in
recursive matching. Not sure though...

Basically (really oversimplifying here) I have:

tag = # <stuff> \n
tagns = #### <stuff> \n
attr = ## <stuff> \n
val = ### <stuff> \n
attrns = ##### <stuff> \n
content = > <stuff> [\n > <stuff>] \n \n

and I want to capture things like:

tag and {
  [0] = tagns and {tag=tag.stuff, ns=tagns.stuff} or tag.stuff,
  [attr.stuff] = attrns and {val=val.stuff, ns=attrns.stuff} or val.stuff,
  RECURSION_POINT
} or content.stuff

Note that RECURSION_POINT repeats 0 or more times.
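
As a rough illustration of the line kinds listed above, here is a minimal plain-Lua sketch (not LPeg) that tells them apart by counting the leading # characters. The names classify and kinds are hypothetical and only for illustration; trimming whitespace around <stuff> is an assumption, and a leading > is simply reported as "content" with one level of prefix removed.

```lua
-- Hypothetical helper, for illustration only: classify one MDXML line.
-- The index into `kinds` is the number of leading '#' characters,
-- following the simplified grammar above.
local kinds = { "tag", "attr", "val", "tagns", "attrns" }

local function classify(line)
  -- Leading run of '#', then the payload with surrounding spaces trimmed
  -- (the trimming is an assumption, not something the grammar states).
  local hashes, stuff = line:match("^(#+)%s*(.-)%s*$")
  if hashes and kinds[#hashes] then
    return kinds[#hashes], stuff
  end
  -- A leading '>' marks content; drop one '>' and one optional space.
  local content = line:match("^>%s?(.*)$")
  if content then
    return "content", content
  end
  return nil  -- blank lines and anything else are left unclassified
end

print(classify("# books"))    --> tag      books
print(classify("###utf-8"))   --> val      utf-8
```

The content returned for a > line may itself start with # or >, which is where the recursion comes from.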

E.g. the following MDXML document:

##version
###1.1
##encoding
###utf-8

`This is a list of programming languages and programming language books, in MDXML`

#programming
#languages
> #language
> > #name
> > > Lua
> >
> > #link
> > > http://www.lua.org/
>
> #language
> > #name
> > > Python
> >
> > #link
> > > https://www.python.org/

# books
> #book
> ##edition
> ###3
> > #name
> > > Programming in Lua
> >
> > #ISBN
> > > 859037985X

Should produce the following table (the odd layout is so that each line matches the equivalent line in the MDXML):

{ [0] = 'programming',
  { [0] = 'languages',
    { [0] = 'language',
      { [0] = 'name',
        'Lua'
      },
      { [0] = 'link',
        'http://www.lua.org/'
      } },
    { [0] = 'language',
      { [0] = 'name',
        'Python'
      },
      { [0] = 'link',
        'https://www.python.org/'
      } } },
  { [0] = 'books',
    { [0] = 'book',
      ['edition'] =
      '3',
      { [0] = 'name',
        'Programming in Lua'
      },
      { [0] = 'ISBN',
        '859037985X'
      } } } }

And the following MDXML document:

##version
###1.1
##encoding
###utf-8

#root
> #a
> #b
> #c
> stuff

Should produce:

{ [0] = 'root',
  {[0] = 'a'},
  {[0] = 'b'},
  {[0] = 'c'},
  'stuff'
}
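
For comparison, a minimal plain-Lua sketch (not LPeg, and not taken from the MDXML spec) that yields the table above for the second document. It works by collecting each run of > lines, stripping one level of prefix, and recursing on the result. The function name parse and all the rules in it are assumptions: tags at the same depth are treated as siblings, attributes seen before any tag (the ##version / ##encoding header) are ignored, the namespace markers (#### and #####) are skipped, and any other non-blank line is kept as text. It does not attempt the rule by which the first document nests #languages and # books under #programming without a > prefix.

```lua
-- Sketch only: parse a list of MDXML-like lines into nested tables.
-- Every rule here is an assumption made for illustration (see above).
local function parse(lines)
  local out = {}    -- nodes and text found at this depth
  local node = nil  -- most recent tag at this depth
  local attr = nil  -- attribute name waiting for its ### value
  local i = 1
  while i <= #lines do
    local line = lines[i]
    if line:sub(1, 1) == ">" then
      -- Collect the whole run of '>' lines, dropping one prefix level.
      local block = {}
      while i <= #lines and lines[i]:sub(1, 1) == ">" do
        block[#block + 1] = (lines[i]:gsub("^>%s?", ""))
        i = i + 1
      end
      -- Children go into the open tag, or to this level if there is none.
      local target = node or out
      for _, child in ipairs(parse(block)) do
        target[#target + 1] = child
      end
    else
      local hashes, stuff = line:match("^(#+)%s*(.-)%s*$")
      if hashes == "#" then
        node = { [0] = stuff }
        out[#out + 1] = node
      elseif hashes == "##" then
        attr = stuff
      elseif hashes == "###" then
        if node and attr then node[attr] = stuff end
        attr = nil
      elseif not hashes and line:match("%S") then
        out[#out + 1] = line  -- plain text content
      end
      -- '####' / '#####' (namespaces) and blank lines fall through.
      i = i + 1
    end
  end
  return out
end

local doc = parse({ "#root", "> #a", "> #b", "> #c", "> stuff" })
print(doc[1][0], doc[1][1][0], doc[1][4])  --> root  a  stuff
```

parse returns the list of top-level nodes, so the table shown above is doc[1].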

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.