- Subject: Re: How to extract a word at a given position in a string?
- From: Steve Litt <slitt@...>
- Date: Mon, 14 Mar 2011 14:04:11 -0400
On Monday 14 March 2011 04:14:08 Benoit Germain wrote:
> > -----Original Message-----
> > From: lua-l-bounces@lists.lua.org [mailto:lua-l-bounces@lists.lua.org]
> > On Behalf Of Roberto Ierusalimschy
> > Sent: Thursday, March 10, 2011 3:00 PM
> > To: Lua mailing list
> > Subject: Re: How to extract a word at a given position in a string?
> >
> > What is wrong with reversing the string ('string.reverse')?
> >
> > -- Roberto
>
> [BG] Reversing the string is fine when it is rather small (which is my
> case). However, I suppose one can encounter a situation where a large text
> has to be searched, and there is no easy means of extracting a small
> portion to search in. In that case, duplicating the whole text, and
> searching through it, is expensive. I was just wondering if there was a
> possibility to apply all pattern matching functions in the reverse
> direction.
There are probably a million solutions to every problem -- three obvious ones
pop into my mind reading this thread.
1) Go ahead and use the reverse. It will probably still be "fast enough". To
keep it from appearing to hang when someone passes a gigabyte-long string, you
could put an assert statement before it that checks the string is shorter than
an arbitrary maximum you believe will never be hit, and in assert's error
argument tell the user "contact the programmer and tell him you hit the
reverse string limit".
2) It's very easy to make a C algorithm callable from Lua
(http://www.troubleshooters.com/codecorn/lua/lua_lua_calls_c.htm), so you get
the efficiency of C's "string is an array of characters" right in your Lua
program.
3) If I understand your problem domain correctly, you have a string and a
numeric subscript, and you need to return the word surrounding that subscript.
For instance, if the string is:
"123 567 89a cde"
and the subscript is 6, then it should return "567".
So to do that, you just have two loops, one going forward and one going back,
repeatedly calling string.sub(mystring, tempsubscript, tempsubscript) to get
one character at a time and testing for word boundary conditions. When both
loops hit a word boundary, you have your word. I bet it would be reasonably
fast too.
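Here is a rough sketch of option 3, just to make the two loops concrete. The
name word_at is made up for illustration, and it assumes a "word" is simply a
run of non-whitespace characters; adjust the boundary test if your domain
defines words differently.

```lua
-- Hypothetical helper: return the word containing position pos, or nil
-- if pos is out of range or sits on a boundary character.
-- Assumption: a "word" is a run of non-whitespace characters.
local function word_at(s, pos)
  local function is_word_char(i)
    -- string.sub(s, i, i) yields the single character at index i
    return string.find(string.sub(s, i, i), "^%S$") ~= nil
  end

  if pos < 1 or pos > #s or not is_word_char(pos) then
    return nil
  end

  local first, last = pos, pos
  -- walk backward while the character before `first` is part of the word
  while first > 1 and is_word_char(first - 1) do
    first = first - 1
  end
  -- walk forward while the character after `last` is part of the word
  while last < #s and is_word_char(last + 1) do
    last = last + 1
  end
  return string.sub(s, first, last)
end

print(word_at("123 567 89a cde", 6))  --> 567
```

Only the characters of the word itself (plus one on each side) get examined,
so the cost does not depend on the length of the whole string.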
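For option 1, the assert guard might look something like the following. The
names guarded_reverse and MAX_REVERSE_LEN, and the one-megabyte limit, are
placeholders I picked for the example; choose whatever maximum suits your
data.

```lua
-- Hypothetical limit; pick a value you believe real input will never reach.
local MAX_REVERSE_LEN = 1024 * 1024

-- Reverse s, but fail loudly instead of grinding through a huge string.
local function guarded_reverse(s)
  assert(#s < MAX_REVERSE_LEN,
    "contact the programmer and tell him you hit the reverse string limit")
  return string.reverse(s)
end

print(guarded_reverse("abc def"))  --> fed cba
```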
SteveT
Steve Litt
Recession Relief Package
http://www.recession-relief.US
Twitter: http://www.twitter.com/stevelitt