lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Francisco Olarte <folarte@peoplecall.com> [13-10-27 14:20]:
> Hi:
> 
> > is there anything more sophisticated than linear search, which I can
> > do to match a key (string) in a table from a user input, which is only
> > a part of that key?
> 
> There are lots of things you can do, but you should define it a little better.
> 
> > (the whole thing should run on a embedded system under Gentoo Linux.
> > There are 400MB of mem...not Gigs of it ... ;)))
> 
> That is irrelevant if you do not post how many keys are there in the
> table and also which caractereistics do they have, things like wheter
> the keys share common prefixes ( I suppose they do otherwise you'll be
> limited to a trivial 256 maximum different keys ), and how many keys
> there are.
> 
> You can use a trie, as suggested by a previous post, but this is gonne
> be big if implemented under native lua.
> 
> For the general problem I would use an index table with all the keys
> sorted. As lua shares strings this will only cost you memory for the
> array part of the table, a pointer per key. Then you take the user
> string and do a binary search for it. If binary search finds it, then
> your starting index is the previous slot, if it doesn't then it is the
> insertion point. Then you loop in the table, and while the entry
> starts with the user string you have keys with start with it. An
> example.
> 
> sorted keys = 1:'', 2:a, 3:aa, 4:ab, 5:c, 6:x, 7:z11 8:z12 9:z21
> 
> If user input is a, binary search finds  it at 2, then you start
> looping and get a, aa, ab and stop when reading c.
> 
> If user input is aa, binary search finds  it at 3, then you start
> looping and get aa and stop when reading ab.
> 
> If user input is z, binary search tells you it should be inserted at
> 7, and you extract z11, z12, z21
> 
> If user input is y, binary search tells you it should be inserted at 7
> too, but you will stop at z11 without using any key.
> 
> Be careful when implementing binary search, remember to make it easy
> you tipically start with low=0, high=count+1, not with 1, count as
> some textbooks do.
> 
> Also, if you have few keys you could just loop, but as you want to
> avoid linear search I supose it is not appropiate.
> 
> Francisco Olarte.
> 

Hi Francisco,

Thank you for your help! :)

Parts of the information about the contents of the data are not
public available (like "closed source"), so I do have to be brief.

As I said, the size of the data storage has to be considered, as this 
is an embedded system with 512 MB. Means: If there are two algorith
with the same result: Take the one with a smaller memory footprint.

There are roughly about 175000 keys, which should be searchable with
substrings which not necessarily start at the beginning of the
original key. That is: Key> Abrakadabra Search> kada

I will see, what road to go...

Best regards,
mcc