It was thus said that the Great Andrew Starks once stated:
> On Tue, Nov 11, 2014 at 9:58 PM, <meino.cramer@gmx.de> wrote:
> 
> > I watched (and tried to understand ...) Roberto's video about the
> > concepts of LPeg/PEG on YouTube and I read the LPeg tutorial
> > pages.
> >
> > But still I am running against an inner wall...years of using regexps
> > could not be nullified that fast... ;)
> >
> > One sentence of Roberto's video sticks in my head: LPeg does not search.
> >
> > How can I recognize the appearance of a certain pattern in a string
> > then?
> >
> > For example:
> >
> > mystring="CHaskellLispFortranBPCLAssemblerForthLuaSchemePerlCHILLTeXJavaJavascript"
> >
> > Is it possible to check with LPeg, and without misusing it, whether "Lua"
> > is in that string and where? (I know that there are other LPeg-less
> > methods to do that ... that's why this is an example ;)
> >
> > If this is not possible or can only be "tricked"...how can LPeg handle
> > only partly known input formats?
> >
> > Or is that a typical question of someone, who is still under the bad
> > influence of regexps?
>
> Yes you can search. With lpeg, there are at least two methods that I can
> think of. The one that doesn't use grammars can be described thusly:
> 
> zero or more characters that do not equal "Lua" followed by "Lua"
> 
> or
> 
> (untested)
> 
> local lpeg = require "lpeg"
> local P = lpeg.P
> 
> local contains_lua_pat = (P(1) - "Lua")^0 * P("Lua")
> 
> print(contains_lua_pat:match(your_string))
> --> position of the match

  That actually returns the position *just past* the match, so there's
little indication of what you matched, just that you did.
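
  To also learn where the match starts, one option is to put a position
capture in front of the literal. The following is only a minimal sketch
along the lines of the quoted pattern; the names find_lua, start, text and
stop are illustrative and not from the original message:

```lua
local lpeg = require "lpeg"
local P, C, Cp = lpeg.P, lpeg.C, lpeg.Cp

-- Skip every character at which "Lua" does not start, then capture the
-- start position, the matched text, and the position just past the match.
local find_lua = (P(1) - "Lua")^0 * Cp() * C("Lua") * Cp()

local test = "CHaskellLispFortranBPCLAssemblerForthLuaSchemePerlCHILLTeXJavaJavascript"

local start, text, stop = find_lua:match(test)
if start then
  -- stop is one past the last matched character, so the match is
  -- test:sub(start, stop - 1)
  print(start, text, stop)
else
  print("not found")
end
```

  If "Lua" does not occur at all, the match fails and returns nil, which is
why the result is checked before printing.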

> If it succeeds, it returns the position of... I believe the first (and
> maybe also last) position of the match. If you use "lpeg.C", you'll get the
> capture.

  Here's a sample with captures:

-- ----------------------------------------------------------------
-- load LPeg, and grab some local references to two LPeg functions:
--
-- C()  - Return the text matched by the pattern [1]
-- Cp() - Return the current position in the string
--
-- [1] It can return more than just the pattern text, but for now, 
--     this explanation is Good Enough.
-- ----------------------------------------------------------------

local lpeg = require "lpeg"
local Cp = lpeg.Cp
local C  = lpeg.C

-- -------------------------------------------------------------------------
-- Try to match against a list of languages.  Because LPeg's choice operator
-- is ordered, the match will first try "Haskell", then "Lisp", then the next
-- one, and stop at the first alternative that succeeds.  In this example,
-- it's best to list longer terms before shorter ones.  If you had "Java"
-- before "Javascript", then matching against "Javascript" would return
-- "Java", since "Java" would succeed first.  To avoid this, use the order
-- "Javascript", "Java".
--
-- It's for this reason that "C" is tried last in the list.
--
-- This will also return the position just past the match so we can resume
-- searching the string past what we've matched.
-- -------------------------------------------------------------------------

local lang = (
         C("Haskell")
       + C("Lisp")        
       + C("Fortran")
       + C("BPCL")
       + C("Assembler")
       + C("Forth")
       + C("Lua")
       + C("Scheme")
       + C("Perl")
       + C("CHILL")
       + C("TeX")
       + C("Javascript")
       + C("Java")
       + C("C")
     ) * Cp()

local test = "CHaskellLispFortranBPCLAssemblerForthLuaSchemePerlCHILLTeXJavaJavascript"

-- -------------------------------------------------------------------------
-- Start at the first position in the string (remember: Lua is 1-based). Get
-- the language at that position, plus the position past the language name. 
-- Print the language, then resume searching for other language names.
-- -------------------------------------------------------------------------

local pos = 1
while pos <= #test do
  local name,newpos = lang:match(test,pos)
  if not name then break end
  print(name)
  pos = newpos
end
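
  The grammar-based method alluded to in the quoted message is not shown
above.  A rough sketch of it, following the "search anywhere" idiom from
the LPeg manual, might look like this.  The helper name anywhere and the
variable names are illustrative only:

```lua
local lpeg = require "lpeg"
local P, V, Cp = lpeg.P, lpeg.V, lpeg.Cp

-- Build a one-rule grammar: either p matches here (bracketed by two
-- position captures), or skip one character and try the rule again.
local function anywhere(p)
  return P{ Cp() * p * Cp() + 1 * V(1) }
end

local test = "CHaskellLispFortranBPCLAssemblerForthLuaSchemePerlCHILLTeXJavaJavascript"

local first, past = anywhere(P"Lua"):match(test)
if first then
  -- past is one beyond the end of the match
  print(first, past - 1, test:sub(first, past - 1))
else
  print("not found")
end
```

  The search is still written out explicitly as "try here, otherwise advance
one character", which fits the remark that LPeg itself does not search.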

  -spc