lua-l archive



Actually, the problem is not how to download the file. This can be
done as follows:

local http = require('socket.http')

local url = 'http://midia.cmais.com.br/assets/audio/default/CENA_00087___P___24_12_10_1293450448.mp3'

local content, status = http.request(url)
if status == 200 then
    -- Binary mode keeps the MP3 bytes unchanged on every platform.
    local file = assert(io.open('file.mp3', 'wb'))
    file:write(content)
    file:close()
end

The problem is that the URLs have to be extracted from an HTML file. In my
example I used a known URL, but to find it a web page needs to be
scanned.
I have this URL: http://culturafm.cmais.com.br/cena-brasileira/cena-brasileira.
Inside the HTML there are lots of URLs. Some of them are direct links to
audio files. I don't need MIME types because I know the links end
with ".mp3".
So my problem is which pattern matches URLs that start with "http" and
end with ".mp3". After obtaining the list of URLs pointing to audio
files, I can download each of them as I did in the example
above.
Simplifying: if I have the URL 'http://www.a.com/b/c/d.mp3' or
something like that, which Lua pattern matches it?
Perhaps, by explaining more than necessary, I was confusing in my first
message. Sorry!
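
For reference, a minimal sketch of one pattern that seems to fit this case.
The attempt quoted below fails because its character class [a-zA-Z0-9_/]
has no '.', so the match cannot get past the dots in the host name. The
function name find_mp3_urls and the set of allowed characters are my own
assumptions, not something taken from a library. The class leaves out
quotes, spaces and '?' on purpose, so a match cannot run past the end of an
href attribute.

```lua
-- Sketch: collect URLs that start with http:// or https:// and end in .mp3.
-- find_mp3_urls is a hypothetical helper name used only for illustration.
local function find_mp3_urls(html)
  -- https?     : "http" optionally followed by "s"
  -- [...]-     : shortest run of letters, digits, _ - . ~ / and %
  -- %.mp3      : a literal ".mp3"
  local pattern = 'https?://[%w_%-%.~/%%]-%.mp3'
  local urls, seen = {}, {}
  for url in html:gmatch(pattern) do
    if not seen[url] then          -- skip links that appear more than once
      seen[url] = true
      urls[#urls + 1] = url
    end
  end
  return urls
end

local sample = '<a href="http://www.a.com/b/c/d.mp3">audio</a>'
for _, url in ipairs(find_mp3_urls(sample)) do
  print(url)
end
```

Because the repetition is the lazy '-', the match stops at the first
".mp3", so a link such as ".../d.mp3.bak" would be reported as ".../d.mp3".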

2019-12-25 15:12 GMT-03:00, Philippe Verdy <verdy_p@wanadoo.fr>:
> There's no standard for download urls to terminate in .mp3. The standard
> uses mime types when querying urls.
>
> You can query mime types of http(s) urls without downloading them using
> HEAD requests rather than GET.
>
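
A rough sketch of such a HEAD check with LuaSocket, in case the extension
alone turns out not to be reliable. It assumes that the table form of
http.request returns 1, the status code and a headers table with lower-case
keys, and that audio files are served with a Content-Type starting with
"audio/". The name is_audio and the optional request parameter (there only
so a stub can be substituted) are hypothetical.

```lua
-- Sketch: ask the server for the headers only and inspect Content-Type.
-- is_audio and its second parameter are illustrative, not a library API.
local function is_audio(url, request)
  -- Load LuaSocket lazily so a stub can be passed in instead.
  request = request or require('socket.http').request
  local ok, code, headers = request{ url = url, method = 'HEAD' }
  if not ok or code ~= 200 or not headers then
    return false
  end
  local ctype = headers['content-type'] or ''
  return ctype:find('^audio/') ~= nil
end
```
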
> URLs have a standard for parsing them, which allows distinguishing the
> protocol, the host name or address, a possible port number, a path and a
> query string. None of these parts indicates a MIME type. And while
> the path part is frequently used to hint at the MIME type, this is not
> required, as the actual MP3 you request may be selected by the query
> string, and both may use a randomized encoding defined by the server and
> possibly depending on the user's session, i.e. cookies or additional parameters
> in query strings or in encoded form data submitted outside the URL, such as
> authentication parameters or user preferences set by form input variables
> (possibly hidden). Each web site then defines its own encodings and API for
> path and query strings as well as form data.
>
> When your request succeeds, the download will send the binary MP3, its
> MIME type, and possibly a suggested name for storing the file in your local file
> system. The response may also return an error status or an HTML page (such
> as a log-on form, or the reason why your request was denied by the server).
>
>
> On Wed, 25 Dec 2019 at 18:02, luciano de souza <luchyanus@gmail.com>
> wrote:
>
>> Hello all,
>> Cultura FM radio has some interesting audio programs about classical music.
>> I'd like to download them automatically.
>> The steps are:
>> 1. Get the page with http.request;
>> 2. Match URLs that start with 'http://' and end with '.mp3';
>> 3. Record the URLs in a file.
>> My problem is step 2. I could not find a pattern to match URLs like:
>>
>> http://midia.cmais.com.br/assets/audio/default/CENA_00087___P___24_12_10_1293450448.mp3
>>
>> Let me show you my attempt:
>>
>> local http = require('socket.http')
>>
>> local target = 'http://culturafm.cmais.com.br/cena-brasileira/cena-brasileira'
>>
>> local content, status = http.request(target)
>>
>> if status == 200 then
>>         local file = io.open('url.txt', 'w')
>>         local pattern = '(http://[a-zA-Z0-9_/]-%.mp3)'
>>         for url in content:gmatch(pattern) do
>>                 file:write(url, '\n')  -- one URL per line
>>         end
>>         file:close()
>> end
>>
>> Would someone know a Lua pattern to match URLs that start with "http://"
>> and end with '.mp3'?
>> Best regards,
>>
>> --
>> Luciano de Souza
>>
>>
>


-- 
Luciano de Souza