I'm working on a grammar in LPeg and need to match whichever of two
patterns gives the longest match.  I suspect there is a problem in my
grammar that I'm not seeing, and I'm not sure how to fix it.  I would
like to match lines of text in the following format:

mydomain.com: some message text here
jnwhiteh@mydomain.com: some message text here

The : is just a leading character and the text following can be either
a hostname or a user@hostname.  I'm using the following definitions:

local lpeg  = require "lpeg"
local P     = lpeg.P
LETTER      = lpeg.R("az", "AZ")
DIGIT       = lpeg.R"09"
SPECIAL     = lpeg.S";[]\\`_^{|}!"
PERIOD      = lpeg.P"."
triple      = DIGIT * DIGIT^-2
hostaddr    = triple * PERIOD * triple * PERIOD * triple * PERIOD * triple
shortname   = (LETTER + DIGIT) * (LETTER + DIGIT + P"-")^0 * (LETTER + DIGIT)^0
hostname    = shortname * (PERIOD * shortname)^0
host        = hostname + hostaddr
user        = LETTER * (LETTER + DIGIT + SPECIAL)^-8
userathost  = user + P"@" + host
source      = host + userathost
params      = P" :" * (LETTER + DIGIT + SPECIAL + P" ")^-1
line        = P":" * source * params


The problems I see are the following:
 * The string "mydomain.com" is matched by userathost as well as host
 * The string "jnwhiteh@mydomain.com" is partially matched by host and
fully matched by userathost
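
A minimal sketch of the first problem, assuming it comes from the
operators rather than from the sub-patterns: in LPeg, `+` is ordered
choice and `*` is concatenation, so `user + P"@" + host` succeeds as soon
as `user` alone matches.  The `host` below is a simplified stand-in used
only for illustration, and `as_choice` / `as_sequence` are hypothetical
names, not part of the grammar above.

```lua
local lpeg = require "lpeg"
local P, R, S = lpeg.P, lpeg.R, lpeg.S

local LETTER  = R("az", "AZ")
local DIGIT   = R"09"
local SPECIAL = S";[]\\`_^{|}!"
local user    = LETTER * (LETTER + DIGIT + SPECIAL)^-8
-- Simplified stand-in for the host pattern (assumption, illustration only).
local host    = (LETTER + DIGIT + S"-.")^1

-- Ordered choice: user OR "@" OR host; the first alternative that matches wins.
local as_choice   = user + P"@" + host
-- Concatenation: user THEN "@" THEN host; all three parts are required.
local as_sequence = user * P"@" * host

-- lpeg.match returns the position just past the match, or nil on failure.
print(lpeg.match(as_choice,   "mydomain.com"))           --> 9   (only "mydomain")
print(lpeg.match(as_sequence, "mydomain.com"))           --> nil (no "@")
print(lpeg.match(as_sequence, "jnwhiteh@mydomain.com"))  --> 22  (whole string)
```

With concatenation, a bare hostname is no longer accepted as a
user@hostname.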

The second problem causes me the most trouble, since the partial match
is selected by the alternation and then the pattern fails as a whole
(if I read it correctly).  Adding an end pattern P(-1) doesn't seem to
help.  The pattern as it stands will catch the following:

mydomain.com: Hello World

but not this:

jnwhiteh@mydomain.com: Hello World
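
The behaviour described above can be reproduced in a small sketch.  PEG
choice is ordered and has no built-in "longest match": once the first
alternative succeeds, LPeg does not go back and try the second one when
a later part of the sequence fails.  One possible workaround, shown here
under assumptions, is to put the more specific alternative first.  The
`params` pattern follows the "<source>: <text>" shape of the sample
lines rather than the grammar above, and `host_first` / `user_first` are
hypothetical names.

```lua
local lpeg = require "lpeg"
local P, R, S = lpeg.P, lpeg.R, lpeg.S

local LETTER     = R("az", "AZ")
local DIGIT      = R"09"
local SPECIAL    = S";[]\\`_^{|}!"
local PERIOD     = P"."
local alnum      = LETTER + DIGIT
local shortname  = alnum * (alnum + P"-")^0
local hostname   = shortname * (PERIOD * shortname)^0
local user       = LETTER * (LETTER + DIGIT + SPECIAL)^-8
local userathost = user * P"@" * hostname
-- Assumed message format, taken from the sample lines: ": " then any text.
local params     = P": " * P(1)^0

-- Shorter alternative first: it matches "jnwhiteh" and the choice is
-- committed, so userathost is never tried when params fails at "@".
local host_first = (hostname + userathost) * params * P(-1)
-- More specific alternative first: userathost fails cleanly on a bare
-- hostname (no "@"), and the choice then falls back to hostname.
local user_first = (userathost + hostname) * params * P(-1)

print(lpeg.match(host_first, "mydomain.com: Hello World"))           --> 26
print(lpeg.match(host_first, "jnwhiteh@mydomain.com: Hello World"))  --> nil
print(lpeg.match(user_first, "mydomain.com: Hello World"))           --> 26
print(lpeg.match(user_first, "jnwhiteh@mydomain.com: Hello World"))  --> 35
```

This also suggests why appending P(-1) alone would not help: it adds one
more way for the sequence to fail, but does not make the choice retry its
second alternative.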

It's entirely possible that I'm just looking at the problem in the
wrong way due to my lack of familiarity with LPeg.  If anyone can shed
some light, I'd appreciate it!

- Jim