- Subject: Re: String pattern for chinese word
- From: Rena <hyperhacker@...>
- Date: Mon, 16 Apr 2012 07:21:32 -0600
On Mon, Apr 16, 2012 at 07:10, Choonster TheMage
<choonster.2010@gmail.com> wrote:
> On Mon, Apr 16, 2012 at 10:55 PM, jiang yu <yu.jiang.163@gmail.com> wrote:
>> Hi all!
>> s = [[ '\xe8\x8b\xb1\xe9\x9b\x84\xe6\x95\x91\xe7\xbe\x8e','id','哈哈','6' ]]
>> for i in string.gmatch(s,"'(%w+)'") do
>> print(i)
>> end
>> output is:
>> id
>> 6
>> How can I get the all four words?
>>
>>
>> mos
>>
>
> I believe %w only matches letters of the Latin alphabet and numbers.
> One way to capture all four words would be to use the %b token:
>
> s = [[ '\xe8\x8b\xb1\xe9\x9b\x84\xe6\x95\x91\xe7\xbe\x8e','id','哈哈','6' ]]
> for i in string.gmatch(s,"%b''") do
> i = i:match("'(.+)'")
> print(i)
> end
>
The manual specifies that %b requires two *distinct* characters. In
this case you'd want something like: '(.-)'
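
For illustration, here is a minimal sketch of that lazy pattern applied to the string from the original post. The table name `words` is only an assumption for the example, not something from the thread. Note that long brackets [[...]] do not interpret escape sequences, so the first item is the literal text \xe8\x8b... with its backslashes, which is one reason %w+ skips it.

```lua
-- Sketch: capture everything between each pair of single quotes.
-- Inside [[...]] the \x sequences are NOT decoded; they stay as literal
-- backslash-x text, so that item contains non-alphanumeric characters.
local s = [[ '\xe8\x8b\xb1\xe9\x9b\x84\xe6\x95\x91\xe7\xbe\x8e','id','哈哈','6' ]]

local words = {}  -- hypothetical name, used only in this example

-- '(.-)' matches a quote, then as few characters as possible, then the
-- next quote, so each quoted item is captured separately.
for w in string.gmatch(s, "'(.-)'") do
  words[#words + 1] = w
end

for _, w in ipairs(words) do
  print(w)
end
```

This should print all four items. Because .- can match nothing, an empty pair of quotes '' would also yield an empty capture; '([^']+)' is an alternative if empty items should be skipped. As for %w, it follows the C locale's notion of alphanumeric, so the bytes of UTF-8 encoded Chinese text are not matched under the default locale.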
--
Sent from my toaster.