lua-users home
lua-l archive

On Thu, Apr 20, 2023, 06:17 Rana Ayaz <rana.ayaz1@gmail.com> wrote:
> I am a new learner of this language
> And I want to know how to get specific text from any website without opening it.
> without installing any new libraries
> I want to be able to get specific data text from a website using the library that already has it.

Hello! You mention AndroLua. If you mean the project that I created, please be aware that it's a proof of concept that I haven't updated in years (and don't plan to), just to set expectations straight.

There's no built-in way to retrieve the content of web pages in Lua. For that, you usually need a library with HTTP support [1], such as LuaSocket, or something based on curl [2]. You mention that you are new to Lua, but in case you're curious: with AndroLua you can use the LuaJava interop to gain access to the Java HTTP client [3].

[1] https://luarocks.org/search?q=http
[2] https://luarocks.org/search?q=curl
[3] https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/HttpClient.html
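
To make the library route concrete, here is a minimal sketch using LuaSocket, assuming it is installed (for example via LuaRocks) and that the page is served over plain http://. The helper name `fetch` and its optional `request` parameter are made up for this illustration; they are not part of LuaSocket. As far as I know, LuaSocket alone does not handle https://, so you would need an additional library such as LuaSec for that.

```lua
-- Sketch only: assumes LuaSocket is available as "socket.http".
-- fetch(url [, request]) returns the page body on success,
-- or nil plus an error message on failure.
-- `request` is a hypothetical hook so the function can be tried
-- without a network; by default it is LuaSocket's http.request.
local function fetch(url, request)
  request = request or require("socket.http").request

  -- In its simple form, http.request(url) returns the body and the
  -- numeric status code, or nil and an error message.
  local body, code = request(url)
  if not body then
    return nil, "request failed: " .. tostring(code)
  end
  if code ~= 200 then
    return nil, "unexpected HTTP status: " .. tostring(code)
  end
  return body
end
```

With LuaSocket installed, `local html, err = fetch("http://example.com/")` should give you the raw HTML of the page in `html`, or `nil` and a message in `err`.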

> So what to do to get such text from website

If you are scraping a website, be aware that what you download is the raw HTML, so you still need a way to extract the content you are looking for.

Lua has support for patterns (not full regex) [4], but parsing HTML gets "interesting" really fast [5], so you'd want to make use of a specialized HTML parsing library (e.g. [6]). Adding extra libraries to AndroLua is left as an exercise to the reader.

[4] https://www.lua.org/manual/5.4/manual.html#6.4.1
[5] https://stackoverflow.com/a/1732454/221509
[6] https://github.com/msva/lua-htmlparser/blob/master/doc/sample.lua
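
For very simple, predictable pages, Lua patterns alone can get you started. The sketch below is only an illustration of the pattern approach from [4], not a real HTML parser: the function names are made up, it assumes lowercase tags and double-quoted attributes, and it will break on anything less tidy, which is exactly why a proper parsing library is the better choice.

```lua
-- Sketch only: pattern-based extraction from an HTML string.
-- Assumes lowercase tag names and double-quoted attribute values.

-- Returns the text between <title> and </title>, or nil if absent.
local function extract_title(html)
  -- ".-" matches as few characters as possible (lazy match).
  return html:match("<title>(.-)</title>")
end

-- Returns a list of href values found in <a ...> tags.
local function extract_links(html)
  local links = {}
  -- "[^>]-" skips other attributes inside the same tag, lazily.
  for href in html:gmatch('<a%s[^>]-href="([^"]*)"') do
    links[#links + 1] = href
  end
  return links
end
```

Calling `extract_title(html)` on a downloaded page returns the title text or `nil`, and `extract_links(html)` returns a table of the link targets it could find.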