Re: String pattern for chinese word

lua-l archive

Subject: Re: String pattern for chinese word
From: Rena <hyperhacker@...>
Date: Mon, 16 Apr 2012 07:21:32 -0600

On Mon, Apr 16, 2012 at 07:10, Choonster TheMage
<choonster.2010@gmail.com> wrote:
> On Mon, Apr 16, 2012 at 10:55 PM, jiang yu <yu.jiang.163@gmail.com> wrote:
>> Hi all!
>>  s = [[ '\xe8\x8b\xb1\xe9\x9b\x84\xe6\x95\x91\xe7\xbe\x8e','id','哈哈','6' ]]
>>  for i in string.gmatch(s,"'(%w+)'") do
>>    print(i)
>>  end
>> output is:
>> id
>> 6
>>  How can I get the all four words?
>>
>>
>> mos
>>
>
> I believe %w only matches letters of the Latin alphabet and numbers.
> One way to capture all four words would be to use the %b token:
>
>  s = [[ '\xe8\x8b\xb1\xe9\x9b\x84\xe6\x95\x91\xe7\xbe\x8e','id','哈哈','6' ]]
>  for i in string.gmatch(s,"%b''") do
>   i = i:match("'(.+)'")
>   print(i)
>  end
>

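The manual specifies that %b requires two *distinct* characters, so
%b'' is outside what it documents. In this case you'd want something
like '(.-)', which captures the shortest run of characters between a
pair of single quotes.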
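
For what it's worth, here is a minimal sketch of that pattern applied to the string from the original post. The `words` table is only a name chosen for illustration. Note that inside long brackets the \x escapes are not interpreted, so the first item comes out as the literal backslash text rather than as Chinese characters.

```lua
-- Long brackets [[...]] do not process escapes, so \xe8... stays literal text.
local s = [[ '\xe8\x8b\xb1\xe9\x9b\x84\xe6\x95\x91\xe7\xbe\x8e','id','哈哈','6' ]]

-- Illustrative table for collecting the captures.
local words = {}

-- '(.-)' captures the shortest run of any bytes between two single quotes,
-- so it does not care whether the content is ASCII or multi-byte UTF-8.
for w in string.gmatch(s, "'(.-)'") do
  words[#words + 1] = w
  print(w)
end
```

Since `.-` can also match nothing, an empty item such as '' would come back as an empty string; use '(.+)' style variants with care, because a greedy `.+` would run to the last quote in the string.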

-- 
Sent from my toaster.

References:
- String pattern for chinese word, jiang yu
- Re: String pattern for chinese word, Choonster TheMage
