[LPeg] How can I parse a subset of markdown?

lua-l archive


Subject: [LPeg] How can I parse a subset of markdown?
From: "Soni L." <fakedme@...>
Date: Fri, 22 Jul 2016 12:36:57 -0300

http://stackoverflow.com/q/38514522/3691554

I'm trying to parse a subset of markdown into a tree with LPeg. The idea is simple but I'm not sure what I'm doing. The whole spec for the thing I'm doing is here[1] and yes, that's a master branch GitHub link; there are still some things I need to work out.

So, the basic idea is that I have: (put in a code block because that's the only thing that does preformatted text/doesn't strip spaces here)


    `> ` blocks, where the space is (greedy, non-backtracking) optional, as
        in `lpeg.P(">") * lpeg.P(" ")^-1`.

    ` ` "blocks", behave like in markdown (i.e. everything until the end of
        the line is not interpreted as markdown).

    #-###### "blocks", behave like in GFM (i.e. what follows is not
        interpreted, except for inline elements). This is easy, with
        something similar to:

        --
        local header = (lpeg.P("#") * lpeg.P("#")^-5 * lpeg.C(non_eol^1)) / process_header_elements
        --

        (it's much easier to use a function capture here than doing it in
        pure LPeg.)

    triple-` blocks, these are trivial. They're inspired by GitHub markdown.

    single-` "blocks", these are supported as inline elements in hash blocks.
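
To make the header item concrete, here is a minimal sketch of how the function capture might be filled in. It assumes `non_eol` means "any character except a line break", and the body of `process_header_elements` is a hypothetical stand-in (the post does not define it) that splits the header text into plain strings and `{ code = ... }` tables for single-backtick spans.

```lua
local lpeg = require("lpeg")

-- Assumption: non_eol is any single character that is not a line break.
local non_eol = 1 - lpeg.S("\r\n")

-- Hypothetical inline grammar: single-backtick spans become { code = ... }
-- tables, everything else is kept as plain strings.
local tick = lpeg.P("`")
local code_span = tick * lpeg.C((1 - tick)^0) * tick /
  function(s) return { code = s } end
local plain = lpeg.C((1 - tick)^1)
-- A lone, unterminated backtick falls through to the last alternative.
local inline = lpeg.Ct((code_span + plain + lpeg.C(tick))^0)

-- Hypothetical stand-in for the function named in the post.
local function process_header_elements(text)
  return inline:match(text)
end

-- The header pattern from the post, unchanged.
local header = (lpeg.P("#") * lpeg.P("#")^-5 * lpeg.C(non_eol^1)) / process_header_elements

local result = header:match("## Use `lpeg.Ct` here")
-- result is a list of three items: a string, a { code = ... } table, a string.
```

Note that this pattern, as written in the post, discards the number of `#` characters; capturing the header level would need an extra capture around the hashes.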

And I think that describes the whole thing really. My main issue is combining all the parts together, not the individual parsing of each part. Then I need to collect it all into a table, which should also be pretty easy.
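
One way the parts might be combined is an ordered choice over the block patterns, wrapped in `lpeg.Ct` so that each block becomes one table in a list. The sketch below is only an illustration under assumptions: the node field names (`type`, `level`, `text`), the `node` helper, and the fallback "text" line are all hypothetical, quote lines are captured flat rather than parsed recursively, and inline elements are left unprocessed.

```lua
local lpeg = require("lpeg")
local P, S, C, Ct, Cg, Cc = lpeg.P, lpeg.S, lpeg.C, lpeg.Ct, lpeg.Cg, lpeg.Cc

local eol = P("\r\n") + S("\r\n")
local non_eol = 1 - S("\r\n")

-- Hypothetical helper: wrap a pattern in a table capture tagged with a type.
local function node(kind, patt)
  return Ct(Cg(Cc(kind), "type") * patt)
end

-- `> ` blocks: the space after ">" is optional, as in the post.
local quote = node("quote",
  P(">") * P(" ")^-1 * Cg(C(non_eol^0), "text"))

-- #-###### blocks: the header level is the number of "#" characters.
local header = node("header",
  Cg(C(P("#") * P("#")^-5) / string.len, "level") *
  Cg(C(non_eol^1), "text"))

-- triple-` blocks: everything up to a line that starts with a closing fence.
-- (A fence with an empty body is not handled by this sketch.)
local fence = P("```")
local fenced = node("fenced",
  fence * non_eol^0 * eol *
  Cg(C((1 - (eol * fence))^0), "text") *
  eol * fence * non_eol^0)

-- Fallback: any other non-empty line.
local text = node("text", Cg(C(non_eol^1), "text"))

-- Ordered choice: the more specific blocks are tried first.
local block = fenced + header + quote + text

-- A document is a list of blocks, each ended by a line break or end of input;
-- blank lines are skipped.
local document = Ct((block * (eol + -1) + eol)^0) * -1

local input = "# Title\n> quoted\n```\ncode here\n```\nplain"
local tree = document:match(input)

for _, n in ipairs(tree) do
  print(n.type, n.text)
end
```

The order of the alternatives matters because LPeg's choice is ordered: the fallback line has to come last, otherwise it would swallow every other kind of block.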

(Now that I look at it I see that MDXML is *so* much simpler than markdown that you can probably parse the whole thing with a single regex. But regex doesn't let me collect into a table like I want.)


[1]: https://github.com/SoniEx2/MDXML/blob/master/README.md

PS: This post may look like shit, I copypasted it from SO.

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.

Follow-Ups:
- Re: [LPeg] How can I parse a subset of markdown?, Dirk Laurie
- Re: [LPeg] How can I parse a subset of markdown?, Sean Conner
