It was thus said that the Great joy mondal once stated:
> Hi Sean !
> 
> After reading my mail I think I wasn't clear enough.
> 
> How do you deal with situations where you have a matching end character in
> between your match ?
> 
> It is particularly problematic for characters such as  " '
> 
> Is there a LPEG general best practice for such cases ?

  I don't know about best practices, I tend to use what works for me.

  With that said, your example:

	Hello Hel "" lo

(that's what it would appear like to LPeg---a stream of characters)

I would ask---what's the rule?  What is that situation?  Because that's a
situation I haven't had to deal with.  About the closest situation I can
think of is something like:

	'She said she can't come to the party.'

  Here, whether the ' is a string marker or part of a word is context
dependent.  You can either require an escape character:

	'She said she can\'t come to the party.'

or you can define parsing rules like (in English):

	* If a "space" character precedes a single quote, treat it as the
	  start of a string.

	* If, in a string, you come across a single quote preceded by an
	  alphabetic character (A-Z, a-z) and immediately followed by an
	  alphabetic character, treat it as part of the word (or string).

	* If, in a string, you come across a single quote, followed by a
	  "space" or "punctuation" character, treat it as the end of the
	  string.
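
  As a rough sketch, the three rules above could be coded as a small
hand-written scanner.  This is plain Python rather than LPeg, only to keep
it dependency-free; the same checks could be expressed as LPeg patterns. 
The function name find_quoted and its helpers are made up for illustration,
and two things are assumptions the rules don't state: a quote at the very
start of the input opens a string, and a quote at the very end closes one.

```python
import string


def _is_alpha(ch):
    # ASCII letters only (A-Z, a-z), as in the rules above.
    return len(ch) == 1 and ch in string.ascii_letters


def _is_space_or_punct(ch):
    return len(ch) == 1 and (ch.isspace() or ch in string.punctuation)


def find_quoted(text):
    """Return the contents of the single-quoted strings found in text."""
    results = []
    i, n = 0, len(text)
    while i < n:
        # Rule 1: a quote after a space opens a string.
        # (Assumption: so does a quote at the start of the input.)
        if text[i] == "'" and (i == 0 or text[i - 1].isspace()):
            j = i + 1
            while j < n:
                if text[j] == "'":
                    prev_ch = text[j - 1]
                    next_ch = text[j + 1] if j + 1 < n else ""
                    if _is_alpha(prev_ch) and _is_alpha(next_ch):
                        # Rule 2: letter-quote-letter is part of a word.
                        pass
                    elif next_ch == "" or _is_space_or_punct(next_ch):
                        # Rule 3: quote followed by space or punctuation
                        # closes the string.  (Assumption: so does a quote
                        # at the end of the input.)
                        results.append(text[i + 1:j])
                        break
                j += 1
            # Resume after the closing quote (or stop if unterminated).
            i = j + 1
        else:
            i += 1
    return results


print(find_quoted("'She said she can't come to the party.'"))
```

  With the earlier example as input, the apostrophe in "can't" falls under
the second rule and only the final quote ends the string.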

  There are other cases you might have to consider:

	''Tis but a scratch!'

	'We ain't got nuttin' to say, copper!'

  Now the question becomes, do you want a context-free parse, or a
context-aware parse?  A context-aware parse might have to include quite a
number of exceptions.  There is no right or wrong answer here.
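
  For comparison, the context-free route (requiring an escape character)
needs no exceptions at all.  A minimal sketch, again in plain Python rather
than LPeg; the name read_escaped is made up for illustration, and it assumes
a backslash makes the next character literal, whatever that character is:

```python
def read_escaped(text, start=0):
    """Parse a single-quoted string beginning at text[start].

    Returns (contents, index just past the closing quote), or None if
    there is no opening quote at start or the string is unterminated.
    """
    if start >= len(text) or text[start] != "'":
        return None
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Escape: take the next character literally, drop the backslash.
            out.append(text[i + 1])
            i += 2
        elif ch == "'":
            # First unescaped quote ends the string, regardless of context.
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    return None


sample = r"'She said she can\'t come to the party.'"
print(read_escaped(sample))
```

  The trade-off is that the burden moves to whoever writes the input: every
apostrophe inside a string has to be escaped by hand.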

  -spc