Hi Russ,

> The crux of the question is to leave the commas within quoted items
> and replace all the outer "separator" commas with tilde (~). So parse this:
> 123,"ABC, DEV 23",345,534.202,NAME
> and return this:
> 123~"ABC, DEV 23"~345~534.202~NAME.

Something like the following may work:

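local list = {
  '123,"ABC, DEV 23",345,,202,NAME',
  ',"ABC, DEV 23",345,534,202,NAME',
  '123,"ABC, DEV 23",345,534.202,',
  '123 , "ABC, DEV 23" , 345 , 534 , 202 , NAME',
  [[123,"ABC, \"he,llo\", DEV 23",,,,NAME]],
  [[123,"ABC, \\",llo\", DEV 23,,,,NAME]],
  [[123,'ABC, "hello", DEV 23',342,534,202,NAME]],
}
for _, line in ipairs(list) do
  -- Reset the scanner state for every line so nothing leaks between inputs.
  local escaped, instring, pos = false, false, -1
  -- Visit only quotes, commas and backslashes; the empty capture ()
  -- yields the position p of the matched character c.
  local result = line:gsub([[()(["',\\])]], function(p, c)
      -- A backslash only escapes the character immediately after it.
      if p > pos+1 then escaped = false end
      -- Consecutive backslashes toggle, so \\ is a literal backslash.
      if c == '\\' then escaped, pos = not escaped, p end
      -- An unescaped quote of either kind opens or closes a string.
      if (c == '"' or c == "'") and not escaped then instring = not instring end
      -- A comma outside any string is a separator.
      if c == ',' and not instring then c = '~' end
      return c
    end)
  print(result)
end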

This prints:

123~"ABC, DEV 23"~345~~202~NAME
~"ABC, DEV 23"~345~534~202~NAME
123~"ABC, DEV 23"~345~534.202~
123 ~ "ABC, DEV 23" ~ 345 ~ 534 ~ 202 ~ NAME
123~"ABC, \"he,llo\", DEV 23"~~~~NAME
123~"ABC, \\"~llo\"~ DEV 23~~~~NAME
123~'ABC, "hello", DEV 23'~342~534~202~NAME
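
One limitation worth noting: the callback above toggles a single
instring flag on either quote character, so a double quote inside a
single-quoted item closes the string early. The following is only a
sketch of one way around that, assuming an item is closed by the same
quote character that opened it. The function name replace_outer_commas
and its sep parameter are made up for illustration and are not from
the thread.

```lua
-- Hypothetical helper (name and signature are assumptions for illustration).
-- Replaces commas outside quoted items with sep (default '~').
local function replace_outer_commas(line, sep)
  sep = sep or '~'
  local quote = nil            -- quote character that opened the current item
  local escaped, pos = false, -1
  local result = line:gsub([[()(["',\\])]], function(p, c)
    -- A backslash only escapes the character immediately after it.
    if p > pos + 1 then escaped = false end
    if c == '\\' then
      escaped, pos = not escaped, p
    elseif c == ',' then
      if not quote then return sep end
    elseif not escaped then
      -- c is a quote: open an item, or close it if it matches the opener.
      if quote == nil then
        quote = c
      elseif quote == c then
        quote = nil
      end
    end
    -- Returning nothing keeps the matched character unchanged.
  end)
  return result
end

print(replace_outer_commas([[1,'a "b,c" d',2]]))  --> 1~'a "b,c" d'~2
```

Treating the apostrophe as a quote character is itself an assumption
carried over from the test list; data containing names such as O'Brien
would need only the double quote to be recognized.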

Paul.

On Fri, Feb 9, 2018 at 3:08 PM, Russell Haley <russ.haley@gmail.com> wrote:
> Since my match and capture understanding in Lua is somewhat weak I am
> looking for opportunities to improve my understanding. There is a SO
> question about parsing a file here:
>
> https://unix.stackexchange.com/questions/422526/remove-comma-outside-quotes
>
> The crux of the question is to leave the commas within quoted items
> and replace all the outer "separator" commas with tilde (~). So parse
> this:
>
> 123,"ABC, DEV 23",345,534.202,NAME
>
> and return this:
>
> 123~"ABC, DEV 23"~345~534.202~NAME.
>
> I've put together a couple of pieces but can't find any way to make it
> into an answer without resorting to loops.
>
> s = '123,"ABC, DEV 23",345,534.202,NAME'
>
> print(s:match('".*,.*"'))
> "ABC, DEV 23"
>
> print(s:gsub('".*(,).*"','~'))
> 123,~,345,534.202,NAME  1
>
>
> From what I can tell, there is no way to add exclusions to patterns so
> at this point I'm stumped. Instead of asking for the solution and
> studying it, I'd like to first ask for a hint from the mailing list.
> My questions are:
>
> - Is it possible to do this in a single call to gsub (I'm hoping yes)?
> If not, I will look first at one or two calls (i.e. match and then
> gsub) and using a loop.
> - Is this something that would be better done with LPEG?
>
> I'll likely ask if people can share possible answers later to see if
> my solution is at all reasonable.
>
> Thanks
>
> Russ
>
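
On the fallback of using more than one gsub call without tracking
state: a minimal sketch, assuming only double quotes delimit items,
no escaped quotes occur, and the byte \1 never appears in the data.
It relies on %b"" matching from one double quote to the next. The
function name replace_outer_commas_simple is hypothetical.

```lua
-- Hypothetical helper (the name is an assumption for illustration).
local function replace_outer_commas_simple(line)
  -- Step 1: hide commas inside "..." spans behind a placeholder byte.
  local hidden = line:gsub('%b""', function(q)
    return (q:gsub(',', '\1'))
  end)
  -- Step 2: every comma still visible is a separator.
  local swapped = hidden:gsub(',', '~')
  -- Step 3: restore the hidden commas.
  return (swapped:gsub('\1', ','))
end

print(replace_outer_commas_simple('123,"ABC, DEV 23",345,534.202,NAME'))
--> 123~"ABC, DEV 23"~345~534.202~NAME
```

This trades robustness for brevity: it does not handle backslash
escapes or single-quoted items, which the callback version does.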