- Subject: Re: Regular expression for matching lines
- From: Eike Decker <eike@...>
- Date: Mon, 30 Jul 2007 00:39:59 +0200
Quoting Shmuel Zeigerman <shmuz@actcom.co.il>:
> Aaron Brown wrote:
> > One problem with Shmuel's solution quoted above is that it
> > treats the first of the above two strings as four lines, the
> > last being empty. If you want to treat both of them as
> > three lines, you need to do something like this:
> >
> > if string.sub(Str, -1) ~= "\n" then
> >   -- The last line doesn't have an EOL; give it one:
> >   Str = Str .. "\n"
> > end
> > for Line in Str:gmatch("([^\n]*)\n") do
> >   print("line: <" .. Line .. ">")
> > end
> >
> > or this (doesn't side-effect Str):
> >
> > local MissingEol = false
> > if string.sub(Str, -1) ~= "\n" then
> >   -- The last line doesn't have an EOL; give it one:
> >   MissingEol = true
> > end
> > for Line in (MissingEol and Str .. "\n" or Str):gmatch("([^\n]*)\n") do
> >   print("line: " .. Line)
> > end
> >
> > This is impossible to do with just a single pattern and no
> > additional checks, but using string.sub avoids a linear scan
> > of the string. (In other words, don't use Str:match("\n$").)
>
> Nice examples and explanation, though the original poster said
> explicitly he wanted "an expression that needs no additional checks".
>
> Regarding the "impossible to do with just a single pattern", yes, with
> Lua regex, but it IS possible with Lrexlib PCRE binding:
>
> for line in rex.gmatch(str, "^.*", "m") do
>   print(line)
> end
>
> (the 3rd argument, "m", stands for the "multiline" PCRE option)
>
>
> --
> Shmuel
>
Yes, thanks for your proposals and explanations. They show that the problem
isn't as simple as it looked at first. I had also implemented versions that
append a line break at the end (which creates a new string, including a new
hash etc.) or that test for the two cases. I really wondered whether there is a
pure Lua regular expression that solves the problem elegantly, without
additional checks, in a single loop.
I intended to write a small tutorial on (Lua) regular expressions, explaining a
few samples, which is why I was looking for an "optimal" solution to this
problem, but I now doubt that one exists. It would be really useful if the C
string API of Lua supported a :split(luaregex) function. I think it would solve
the problem, and there are other situations where a splitting function would be
handy. Of course, one could write a Lua function that uses the find function,
but I believe it would be a nice addition to the Lua string library functions.
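For illustration, here is a minimal sketch of what such a find-based function
might look like. The name split and its exact behavior are assumptions made
for this example; nothing like it exists in the standard Lua string library.

```lua
-- Hypothetical helper, not part of the Lua string library:
-- splits `str` at every match of the Lua pattern `sep`.
local function split(str, sep)
  local parts = {}
  local pos = 1
  while true do
    local first, last = string.find(str, sep, pos)
    -- Stop when no further separator is found. An empty match is
    -- treated like "no match" so that the loop cannot run forever.
    if not first or last < first then break end
    parts[#parts + 1] = string.sub(str, pos, first - 1)
    pos = last + 1
  end
  -- Whatever follows the last separator is the final piece
  -- (possibly the empty string).
  parts[#parts + 1] = string.sub(str, pos)
  return parts
end

for _, line in ipairs(split("one\ntwo\nthree", "\n")) do
  print("line: <" .. line .. ">")
end
```

Note that with this sketch a string that ends in the separator yields a
trailing empty piece, so for the line-matching case the caller would probably
still have to drop that last element.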
Thanks for your effort!
Eike