Say you have a text file filled with links to PDF files you want to download. The non-geeky way would be to copy each link into your browser and save the file one at a time, but what if the text file contains 100,000+ links? That sounds rather time-consuming. Even at a more modest count, say 50 links, a mid-level Linux user will tell you to fetch each PDF with the following command:
curl --remote-name URL
But that is still a pretty LAME way to do it, so here is a Python script that downloads the PDFs using curl. Python and curl in the same script? Yes, it is possible, thanks to Python's 'os' module. Recently Satyakaam Goswami posted a link on IRC to a Google Docs sheet containing the URLs of the PDFs to be downloaded, with strict instructions to write a script for the job. I chose Python: I copied the links into a file saved in my working directory, then wrote the following program to fetch the PDF from each link. All that remains is to execute this code:
import os

fo = open("data.txt", "r")
print("Name of the file:", fo.name)
arp = fo.readline()  # skip the header row
for x in range(1, 67):
    line = fo.readline().strip()  # strip the trailing newline
    os.system('curl --remote-name ' + line)  # note the space before the URL
fo.close()
In place of 67 you could also read until EOF; since I copied the links from a spreadsheet, I could see there were 67 of them, one per row.
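As a sketch of that EOF-based variant: instead of hardcoding the row count, you can iterate over the file line by line until it runs out. The file name data.txt and the header-skipping line mirror the script above; the download_all helper name is just illustrative.

```python
import os

def download_all(path):
    """Fetch every URL listed in `path` with curl, skipping the header row."""
    with open(path, "r") as fo:
        next(fo)              # skip the header row, like the first readline() above
        for line in fo:       # iterates until EOF, no hardcoded count needed
            url = line.strip()
            if url:           # ignore blank lines
                os.system('curl --remote-name ' + url)
```

Then download_all("data.txt") fetches everything in one go, however many rows the spreadsheet had.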
And when I typed those magic words
python *filename*.py
VOILA!!!
All files downloaded! BINGO!!
Feel free to comment, criticize, or offer suggestions on this problem, solved with a few lines of Python code!