[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Fuzzy search
- From: Peter Odding <xolox@...>
- Date: Fri, 04 Jan 2008 18:12:46 +0100
I recently needed the Levenshtein algorithm in Lua:
function levenshtein(s, t)
local s, t = tostring(s), tostring(t)
if type(s) == 'string' and type(t) == 'string' then
local m, n, d = #s, #t, {}
for i = 0, m do d[i] = { [0] = i } end
for j = 1, n do d[0][j] = j end
for i = 1, m do
for j = 1, n do
local cost = s:sub(i,i) == t:sub(j,j) and 0 or 1
d[i][j] = math.min(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
end
end
return d[m][n]
end
end
Very inefficient but it does the job:
Lua 5.1.2 Copyright (C) 1994-2007 Lua.org, PUC-Rio
> = levenshtein('referrer', 'referrer') -- zero distance
0
> = levenshtein('referrer', 'referer') -- distance of one character
1
> = levenshtein('random', 'strings') -- random big distance
6
Petite Abeille wrote:
On Jan 4, 2008, at 5:36 PM, Aaron Brown wrote:
Jeff Wise wrote:
Is there a "fuzzy search" function/algorithm available
which would show these strings as "close to being equal"?
There's a technique called stemming that does at least some
of what you want:
http://en.wikipedia.org/wiki/Stemming
Another one of interest:
http://en.wikipedia.org/wiki/Levenshtein_distance