On Fri, Feb 9, 2018 at 5:18 PM, Paul K <paul@zerobrane.com> wrote:
>> Indeed, the trivial little state machine has ALWAYS been my preferred
>> way to solve this class of problem.
>
> One case it doesn't handle is the mix of ' and " in the string (as the
> string should only be closed with the same quote it was opened with,
> assuming both " and ' are allowed).
>
> This code should handle this case (also added examples):
>
> local list = {
>   '123,"ABC, DEV 23",345,,202,NAME',
>   ',"ABC, DEV 23",345,534,202,NAME',
>   '123,"ABC, DEV 23",345,534.202,',
>   '123 , "ABC, DEV 23" , 345 , 534 , 202 , NAME',
>   [[123,"ABC, \"he,llo\", DEV 23",,,,NAME]],
>   [[123,"ABC, \\",llo\", DEV 23,,,,NAME]],
>   [[123,'ABC, "hel,lo", DEV 23',342,534,202,NAME]],
>   [[123,"ABC, 'hel,lo, DEV 23",342,534,202,NAME]],
> }
> for _,s in ipairs(list) do
>   -- reset the parser state for each input line so nothing leaks between lines
>   local escaped, instring, pos, start = false, false, -1, ""
>   s = s:gsub([[()(["',\\])]], function(p, s)
>       if p > pos+1 then escaped = false end
>       if s == '\\' then escaped, pos = not escaped, p end
>       if (s == '"' or s == "'") and not escaped
>       and (not instring or s == start) then
>         instring, start = not instring, s
>       end
>       if s == ',' and not instring then s = '~' end
>       return s
>     end)
>   print(s)
> end
>
> Paul.
This is great, thanks Paul. I'll have to study the match pattern.
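
As far as I can tell, the pattern has two captures: the empty "()" is a position capture that yields the index of the match, and (["',\\]) captures a single quote, comma, or backslash. gsub passes both to the callback and replaces the matched character with whatever the callback returns. A small sketch to check that reading (the input string and the names "input" and "seen" are made up for this test, not from Paul's code):

```lua
-- Hypothetical demo input, not taken from the thread.
local input = [[a,"b,c"]]
local seen = {}

-- "()" captures the match position; (["',\\]) captures one special character.
local result = input:gsub([[()(["',\\])]], function(p, c)
  seen[#seen + 1] = p .. ":" .. c
  return c  -- returning the character unchanged leaves the string as it was
end)

print(table.concat(seen, " "))  --> 2:, 3:" 5:, 7:"
print(result)                   --> a,"b,c"
```

The position is what lets the callback tell whether a backslash immediately precedes the current character (p == pos+1) or sits somewhere earlier in the line.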

> On Fri, Feb 9, 2018 at 5:02 PM, Coda Highland <chighland@gmail.com> wrote:
>> On Fri, Feb 9, 2018 at 5:40 PM, Paul K <paul@zerobrane.com> wrote:
>>> Hi Russ,
>>>
>>>> The crux of the question is to leave the commas within quoted items
>>>> and replace all the outer "separator" commas with tilde (~). So parse this:
>>>> 123,"ABC, DEV 23",345,534.202,NAME
>>>> and return this:
>>>> 123~"ABC, DEV 23"~345~534.202~NAME.
>>>
>>> Something like the following may work:
>>>
>>> local list = {
>>>   '123,"ABC, DEV 23",345,,202,NAME',
>>>   ',"ABC, DEV 23",345,534,202,NAME',
>>>   '123,"ABC, DEV 23",345,534.202,',
>>>   '123 , "ABC, DEV 23" , 345 , 534 , 202 , NAME',
>>>   [[123,"ABC, \"he,llo\", DEV 23",,,,NAME]],
>>>   [[123,"ABC, \\",llo\", DEV 23,,,,NAME]],
>>>   [[123,'ABC, "hello", DEV 23',342,534,202,NAME]],
>>> }
>>> for _,s in ipairs(list) do
>>>   -- reset the parser state for each input line
>>>   local escaped, instring, pos = false, false, -1
>>>   s = s:gsub([[()(["',\\])]], function(p, s)
>>>       if p > pos+1 then escaped = false end
>>>       if s == '\\' then escaped, pos = not escaped, p end
>>>       if (s == '"' or s == "'") and not escaped then instring = not instring end
>>>       if s == ',' and not instring then s = '~' end
>>>       return s
>>>     end)
>>>   print(s)
>>> end
>>>
>>> This prints:
>>>
>>> 123~"ABC, DEV 23"~345~~202~NAME
>>> ~"ABC, DEV 23"~345~534~202~NAME
>>> 123~"ABC, DEV 23"~345~534.202~
>>> 123 ~ "ABC, DEV 23" ~ 345 ~ 534 ~ 202 ~ NAME
>>> 123~"ABC, \"he,llo\", DEV 23"~~~~NAME
>>> 123~"ABC, \\"~llo\"~ DEV 23~~~~NAME
>>> 123~'ABC, "hello", DEV 23'~342~534~202~NAME
>>>
>>> Paul.
>>
>> Indeed, the trivial little state machine has ALWAYS been my preferred
>> way to solve this class of problem.
>>
>> /s/ Adam
>>
>
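
For comparison, the same little state machine can be written as an explicit character loop instead of a gsub callback. This is only a sketch under a few assumptions: the function name replace_separators is made up here, a backslash is taken to escape whatever character follows it (including a comma, which differs slightly from the gsub version), and a quoted item is only closed by the same quote character that opened it.

```lua
-- Hypothetical helper (not from the thread): replace separator commas with "~".
local function replace_separators(line)
  local out = {}
  local quote = nil      -- quote character that opened the current item, if any
  local escaped = false  -- true when the previous character was an unescaped backslash
  for i = 1, #line do
    local c = line:sub(i, i)
    if escaped then
      escaped = false    -- this character is escaped; keep it literally
    elseif c == "\\" then
      escaped = true
    elseif c == '"' or c == "'" then
      if quote == nil then
        quote = c        -- opening quote
      elseif quote == c then
        quote = nil      -- matching closing quote
      end                -- the other quote type inside an item is ordinary text
    elseif c == "," and quote == nil then
      c = "~"            -- separator comma outside any quoted item
    end
    out[#out + 1] = c
  end
  return table.concat(out)
end

print(replace_separators([[123,"ABC, DEV 23",345,534.202,NAME]]))
--> 123~"ABC, DEV 23"~345~534.202~NAME
print(replace_separators([[123,'ABC, "hel,lo", DEV 23',342,534,202,NAME]]))
--> 123~'ABC, "hel,lo", DEV 23'~342~534~202~NAME
```

Because all state lives in locals inside the function, nothing carries over from one input line to the next.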