

There is no standard requiring download URLs to end in .mp3. HTTP identifies content by its MIME type (the Content-Type response header), not by the URL's suffix.

You can query the MIME type of an http(s) URL without downloading the content by sending a HEAD request rather than a GET.
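A minimal sketch of such a HEAD query, assuming LuaSocket is installed (plain http only; https would need an additional library such as LuaSec). The helper names is_mp3_type and head_content_type are made up for illustration, and treating "audio/mp3" as equivalent to "audio/mpeg" is an assumption about what servers send in practice.

```lua
-- Return true when a Content-Type header value names MPEG audio.
-- "audio/mpeg" is the registered type; "audio/mp3" is a non-standard
-- variant accepted here as an assumption.
local function is_mp3_type(content_type)
  if not content_type then return false end
  -- Drop parameters such as "; charset=..." and normalise case.
  local mime = content_type:match("^%s*([^;%s]+)")
  if not mime then return false end
  mime = mime:lower()
  return mime == "audio/mpeg" or mime == "audio/mp3"
end

-- Ask the server for the headers only (HEAD), without fetching the body.
-- Returns the Content-Type value, or nil plus an error message.
local function head_content_type(url)
  local http = require("socket.http") -- LuaSocket, loaded on first use
  local ok, code, headers = http.request{ url = url, method = "HEAD" }
  if not ok then return nil, code end -- on failure, code holds the message
  if code ~= 200 then return nil, "HTTP status " .. tostring(code) end
  return headers["content-type"] -- LuaSocket lowercases header names
end
```

With these, something like is_mp3_type(head_content_type(url)) would tell you whether a candidate link is worth downloading, whatever its path looks like.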

URLs have a standard syntax that distinguishes the protocol (scheme), the host name or address, an optional port number, a path, and an optional query string. None of these components indicates a MIME type. Although the path is frequently used to hint at the MIME type, this is not required: the actual mp3 you request may be selected by the query string, and both the path and the query string may use a randomized encoding defined by the server, possibly depending on the user's session, i.e. cookies, additional query-string parameters, or form data submitted outside the URL, such as authentication parameters or user preferences set by form input fields (possibly hidden). Each web site defines its own encodings and API for paths and query strings, as well as for form data.
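To illustrate the components, here is a simplified splitter written with a plain Lua pattern. It is only a sketch: it ignores user info, IPv6 literals and fragments, the name split_url is made up, and the example.com URL in the usage note is hypothetical.

```lua
-- Split a URL into scheme, host, port, path and query.
-- Simplified on purpose: no user info, no IPv6 literals, fragment dropped.
local function split_url(url)
  local scheme, host, port, path, query =
    url:match("^(%a[%w+.%-]*)://([^/:?#]+):?(%d*)([^?#]*)%??([^#]*)")
  if not scheme then return nil end
  return {
    scheme = scheme:lower(),
    host   = host,
    port   = port ~= "" and tonumber(port) or nil, -- nil when not given
    path   = path ~= "" and path or "/",
    query  = query ~= "" and query or nil,         -- nil when not given
  }
end
```

For a URL such as http://example.com:8080/stream?id=42&fmt=mp3 this yields the path "/stream" and the query "id=42&fmt=mp3": nothing in the path ends in .mp3, yet the server would be free to answer with MPEG audio.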

When your request succeeds, the response contains the binary mp3 data, its MIME type, and possibly a suggested name for storing the file on your local file system. The response may instead return an error status or an HTML page (such as a log-on form, or an explanation of why the server denied your request).
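The cases above could be told apart roughly as follows. This is a sketch under assumptions: the function names are made up, the headers table is assumed to use lowercase keys (as LuaSocket returns them), and only the simple filename= form of Content-Disposition is handled.

```lua
-- Extract a suggested file name from a Content-Disposition value, if any.
-- Handles filename="x" and filename=x only (not the filename*= form).
local function suggested_name(disposition)
  if not disposition then return nil end
  return disposition:match('filename="([^"]+)"')
      or disposition:match("filename=([^;%s]+)")
end

-- Classify a response from its status code and lowercase-keyed headers.
-- Returns "error", "html", "other", or "audio" plus a suggested name (or nil).
local function classify(code, headers)
  if code ~= 200 then return "error" end
  local ctype = (headers["content-type"] or ""):lower()
  if ctype:find("^audio/") then
    return "audio", suggested_name(headers["content-disposition"])
  elseif ctype:find("^text/html") then
    return "html" -- e.g. a log-on form or a refusal page
  end
  return "other"
end
```

Only the "audio" case should be saved as an mp3 file; an "html" answer with status 200 is usually a page for a human to read, not the recording.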


On Wed, 25 Dec 2019 at 18:02, luciano de souza <luchyanus@gmail.com> wrote:
Hello all,
Cultura FM radio has some interesting audio recordings about classical music.
I'd like to download them automatically.
The steps are:
1. Get the page with http.request;
2. Match URLs starting with 'http://' and ending with '.mp3';
3. Record the URLs in a file.
My problem is step 2. I could not find a pattern to match URLs like:
http://midia.cmais.com.br/assets/audio/default/CENA_00087___P___24_12_10_1293450448.mp3

Let me show you my attempt:

local http = require('socket.http')

local target = 'http://culturafm.cmais.com.br/cena-brasileira/cena-brasileira'

local content, status = http.request(target)

if status == 200 then
        local file = io.open('url.txt', 'w')
        local pattern = '(http://[a-zA-Z0-9_/]-%.mp3)'
        for url in content:gmatch(pattern) do
                file:write(url, '\n')  -- one URL per line
        end
        file:close()
end

Would someone know a Lua pattern to match URLs starting with "http://"
and ending with '.mp3'?
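A possible answer to the pattern question, as a sketch: the character class in the attempt contains no '.', so the match can never get past the first dot of the host name (midia.cmais.com.br). Adding '.' and '-' to the class is enough for the sample URL. The function name find_mp3_urls is made up, and the exact set of allowed characters is an assumption; links that carry a query string or percent-escapes would need a wider class.

```lua
-- Collect every http(s) URL ending in ".mp3" found in a string.
local function find_mp3_urls(html)
  local urls = {}
  -- %w is letters and digits; the class also allows '.', '_', '/', '~', '-'.
  -- The lazy '-' repetition stops at the first ".mp3" it can reach.
  for url in html:gmatch("https?://[%w%._/~%-]-%.mp3") do
    urls[#urls + 1] = url
  end
  return urls
end

-- Usage: write one URL per line.
-- for _, url in ipairs(find_mp3_urls(content)) do file:write(url, "\n") end
```

Because quotes and spaces are not in the class, a link that does not end in .mp3 fails to match at its closing quote, and the search moves on to the next candidate.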
Best regards,

--
Luciano de Souza