- Subject: Re: minor LPEG doubt
- From: Sean Conner <sean@...>
- Date: Tue, 18 Sep 2018 11:48:33 -0400
It was thus said that the Great joy mondal once stated:
> Hi Sean !
>
> After reading my mail I think I wasn't clear enough.
>
> How do you deal with situations where you have a matching end character in
> between your match ?
>
> It is particularly problematic for characters such as " '
>
> Is there a LPEG general best practice for such cases ?
I don't know about best practices, I tend to use what works for me.
With that said, your example:
Hello Hel "" lo
(that's what it would appear like to LPeg---a stream of characters)
I would ask---what's the rule? What is that situation? Because that's a
situation I haven't had to deal with. About the closest situation I can
think of is something like:
'She said she can't come to the party.'
Here, whether the ' is a string marker or part of a word is context
dependent. You can either require an escape character:
'She said she can\'t come to the party.'
or you can define parsing rules like (in English):
* If a "space" character precedes a single quote, treat it as the start
of a string.
* If, in a string, you come across a single quote preceded by an
alphabetic character (A-Z, a-z) and immediately followed by an
alphabetic character, treat it as part of the word (or string).
* If, in a string, you come across a single quote followed by a
"space" or "punctuation" character, treat it as the end of the
string.
There are other cases you might have to consider:
''Tis but a scratch!'
'We ain't got nuttin' to say, copper!'
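As a rough sketch only, here is one way the three rules above could be
written down. It is plain Python rather than LPeg, just to keep the logic
explicit, and the names (find_strings, is_letter, ENDERS) are made up for
illustration. It also assumes a few things the rules don't say: a quote at
the very start of the input opens a string, the end of the input can close
one, and a quote matching neither rule 2 nor rule 3 is kept as ordinary text.

```python
import string

LETTERS = string.ascii_letters                    # A-Z, a-z, as in rule 2
ENDERS = string.whitespace + string.punctuation   # what may follow a closing quote


def is_letter(text, k):
    """True if index k is inside text and holds an ASCII letter."""
    return 0 <= k < len(text) and text[k] in LETTERS


def find_strings(text):
    """Return the single-quoted strings found in text under the three rules."""
    found = []
    i = 0
    while i < len(text):
        # Rule 1: a quote preceded by a space opens a string.
        # Assumption: a quote at the very start of the input also counts.
        opens = text[i] == "'" and (i == 0 or text[i - 1].isspace())
        if not opens:
            i += 1
            continue
        start = j = i + 1
        while j < len(text):
            if text[j] == "'":
                # Rule 2: a letter on both sides, so it is part of a word.
                if is_letter(text, j - 1) and is_letter(text, j + 1):
                    j += 1
                    continue
                # Rule 3: followed by space or punctuation, so the string ends.
                # Assumption: the end of the input ends it as well.
                if j + 1 == len(text) or text[j + 1] in ENDERS:
                    break
                # Assumption: any other quote is kept as ordinary text.
            j += 1
        if j == len(text):
            break                      # unterminated string: give up
        found.append(text[start:j])
        i = j + 1
    return found


for sample in ("'She said she can't come to the party.'",
               "''Tis but a scratch!'",
               "'We ain't got nuttin' to say, copper!'"):
    print(find_strings(sample))
```

With these assumptions the first two examples come out whole, but the
third is cut short after "nuttin": the quote there is followed by a space,
so rule 3 treats it as the end of the string.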
Now the question becomes, do you want a context-free parse, or a
context-aware parse? A context-aware parse might have to include quite a
number of exceptions. There is no right or wrong answer here.
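For contrast, the escape-character option mentioned earlier needs no
exceptions at all. A minimal sketch, again in plain Python rather than LPeg,
with a made-up function name (read_escaped) and the assumption that a
backslash makes the next character literal, whatever it is:

```python
def read_escaped(text, quote="'", escape="\\"):
    """Read one quoted string from the start of text.

    Returns (contents, index just past the closing quote), or None if
    text does not start with a quote or the string is never closed.
    """
    if not text.startswith(quote):
        return None
    out = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            out.append(text[i + 1])    # keep the escaped character literally
            i += 2
        elif ch == quote:
            return "".join(out), i + 1
        else:
            out.append(ch)
            i += 1
    return None                        # ran out of input before a closing quote


print(read_escaped(r"'She said she can\'t come to the party.'"))
```

The rule is the same everywhere in the input: the first unescaped quote
ends the string, so the unescaped version of that sentence stops at "can".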
-spc