- Subject: Re: strip_tags - HTML tag stripper
- From: "Jim Whitehead II" <jnwhiteh@...>
- Date: Tue, 22 Apr 2008 00:52:34 -0700
On Mon, Apr 21, 2008 at 3:05 AM, Bertrand Mansion <email@example.com> wrote:
> On 21 Apr 08 at 01:06, Jim Whitehead II wrote:
> > I currently have the need to strip HTML tags from a given Lua string,
> > ideally allowing a specific subset (such as <p>, <b>, etc.). There
> > are a number of implementations of this, a PHP version in particular:
> > http://uk2.php.net/strip_tags
> > Does anyone have something like this in Lua, or some example LPEG code
> > for a specific tag that I could use? A naive solution is relatively
> > simple using pattern matching, but I'd like to be able to handle odd
> > cases like this:
> > <a href="blah" onClick="<script src='foo'></script>">Link</a>
> > I'd like to avoid stripping the <script> tag in this case, since it
> > occurs as an attribute of another tag.
> Either you strip tags or you don't. Since <script> is inside <a>, if you
> strip <a>, you strip <script> at the same time.
Actually, that isn't the case. With an XML parser you can strip one
and not the other, because the "tag" inside the attribute isn't a tag
at all; it is just text in the value of the onClick attribute. With a
proper ruleset you can reduce the input to exactly the tags you want
to keep.
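
To illustrate the point, here is a minimal sketch in plain Lua (no LPEG,
no XML parser) of a scanner that treats quoted attribute values as opaque
text. The function name strip_tags and the "allowed" set are hypothetical
names chosen for this example, not an existing library API. It is not a
full HTML parser: comments, CDATA sections and unquoted attribute values
containing ">" are not handled, and, like PHP's strip_tags, it removes
only the tags and leaves the text between them in place.

```lua
-- Hypothetical helper, not part of any library.
-- html:    input string
-- allowed: optional set of lowercase tag names to keep, e.g. { p = true }
local function strip_tags(html, allowed)
  allowed = allowed or {}
  local out, pos, len = {}, 1, #html
  while pos <= len do
    local lt = html:find("<", pos, true)
    if not lt then
      -- No more "<": the rest is plain text.
      out[#out + 1] = html:sub(pos)
      break
    end
    out[#out + 1] = html:sub(pos, lt - 1)
    -- A tag starts with "<" or "</" followed by a letter.
    local name = html:match("^</?(%a%w*)", lt)
    if not name then
      -- A stray "<" (as in "1 < 2") is ordinary text.
      out[#out + 1] = "<"
      pos = lt + 1
    else
      -- Scan to the closing ">", skipping anything inside quotes, so a
      -- "<script>" inside an attribute value never ends the tag early.
      local i, quote = lt + 1, nil
      while i <= len do
        local c = html:sub(i, i)
        if quote then
          if c == quote then quote = nil end
        elseif c == '"' or c == "'" then
          quote = c
        elseif c == ">" then
          break
        end
        i = i + 1
      end
      -- Keep the whole tag, attributes included, only if it is allowed.
      if allowed[name:lower()] then
        out[#out + 1] = html:sub(lt, i)
      end
      pos = i + 1
    end
  end
  return table.concat(out)
end

local s = [[<a href="blah" onClick="<script src='foo'></script>">Link</a>]]
print(strip_tags(s))                -- strips the <a> tag as one unit
print(strip_tags(s, { a = true }))  -- keeps <a> and its attribute intact
```

Because the inner scan tracks the current quote character, the example
from the original post is seen as a single <a> tag whose onClick value
happens to contain angle brackets. Note that an allowed tag is kept with
all of its attributes, so a real filter would also need to vet attributes
such as onClick.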