• Subject: Re: Finding intermediate token in a string
• From: Sean Conner <sean@...>
• Date: Mon, 7 Jul 2014 04:06:50 -0400

```
It was thus said that the Great Austin Einter once stated:
> Hi All
> I can have an input string with any one of below formats
>
> \r\nContent-Length: 100 \r\n
> \r\nContent-Length    :      100 \r\n
> \r\nContent-Length:100    \r\n
>
> Input string can be of any one value from above examples
>
> My aim is to extract 100 as an integer.
>
> What is the best way to do it.

In my opinion, use LPeg for parsing.  Yes, there is some learning curve,
but for what you are trying to do, there isn't anything better than LPeg.  I
use it at work all the time for parsing issues.

Here's an example that applies to your case:

local lpeg = require "lpeg"

local P  = lpeg.P
local S  = lpeg.S
local C  = lpeg.C

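-- End of line: LF, optionally preceded by CR.
local crlf    = P"\r"^-1 * P"\n"
-- Linear whitespace: a space or a tab.
local lwsp    = S" \t"
-- End of header: a line break followed by a blank line, or a line break
-- that is not followed by whitespace (i.e. not a folded continuation).
local eoh     = (crlf * #crlf) + (crlf - (crlf^-1 * lwsp))
-- Optional whitespace, possibly folded across lines.
local lws     = (crlf^-1 * lwsp)^0
-- Everything up to the end of the header, with runs of whitespace and
-- control characters collapsed to a single space.  The parentheses drop
-- the substitution count that gsub() returns as a second value.
local value   = (P(1) - eoh)^0
              / function(v)
                  return (v:gsub("[%s%c]+"," "))
                end
local name    = C((P(1) - (P":" + crlf + lwsp))^1)
local header  = name * lws * ":" * lws * value * eoh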

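local hdr,val

hdr,val = header:match "Content-Length: 100 \r\n"               print(hdr,val)
hdr,val = header:match "Content-Length    :      100 \r\n"      print(hdr,val)
hdr,val = header:match "Content-Length:100    \r\n"             print(hdr,val)

-- A folded (multi-line) header value; the header name is a placeholder.
hdr,val = header:match [[
X-Example:
  some
  value here

]] print(hdr,val)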

You might also want to look at RFC 5322, which defines the Internet Message
Format (which is what you're parsing) and covers the general syntax of
header fields, including folding long values across lines.

-spc

```
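The original question asked for the value as an integer and showed inputs that begin with `\r\n`. As a rough sketch of how that narrower case might look in LPeg (not the grammar from the reply above), the pattern below matches only the `Content-Length` header and converts the digits with `tonumber`. It assumes LPeg is installed and that the header name is spelled exactly as shown (the match is case-sensitive). `content_length_of` is a hypothetical helper name used only for illustration.

```lua
-- Sketch only: assumes LPeg is installed and that the input looks like the
-- examples in the question, e.g. "\r\nContent-Length : 100 \r\n".
local lpeg = require "lpeg"

local P, S, R, C = lpeg.P, lpeg.S, lpeg.R, lpeg.C

-- End of line: LF, optionally preceded by CR.
local crlf   = P"\r"^-1 * P"\n"
-- Zero or more spaces or tabs.
local ws     = S" \t"^0
-- One or more decimal digits, captured and converted to a number.
local number = C(R"09"^1) / tonumber

-- Optional leading line break, the field name, a colon, the digits,
-- optional trailing whitespace, and the closing line break.
local content_length = crlf^-1 * P"Content-Length" * ws * P":" * ws
                     * number * ws * crlf

-- Hypothetical helper: returns the length as a number, or nil when the
-- input is not a well-formed Content-Length header.
local function content_length_of(s)
  return content_length:match(s)
end

print(content_length_of "\r\nContent-Length: 100 \r\n")
print(content_length_of "\r\nContent-Length    :      100 \r\n")
print(content_length_of "\r\nContent-Length:100    \r\n")
```

With the general `header` pattern from the reply, the same result can be had by calling `tonumber(val)` on the captured value, since `tonumber` ignores surrounding whitespace.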