Friday, 14 June 2013

Python Script for Automated Download!

Let's say you have a text file filled with links to PDFs you want to download. The non-geeky way would be to copy each link into your browser and download the file, but what if the text file has 100000+ links? That sounds time consuming. Even if we settle for a modest number, say 50 links, a mid-level Linux user will tell you to fetch each PDF with the following command:

     curl --remote-name URL

But that is still a very LAME way to solve things, since you have to run it once per link. So here's a Python script that downloads the PDFs using curl only! You may be confused to see Python and curl in the same place, but it is possible thanks to Python's 'os' library. Recently Satyakaam Goswami posted a link on IRC to a Google Docs file containing the URLs of the PDFs to be downloaded, and gave us strict instructions to write a script for the download. So I chose Python and wrote the following program to fetch the PDFs from the given set of links. I also copied the links into a file (data.txt) and saved it in the directory I was working in. Now all I need to do is execute the following code:

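import os

fo = open("data.txt", "r")
print("Name of the file: " + fo.name)
# the first line is read separately and is not downloaded
arp = fo.readline()
for x in range(1, 67):
    # strip() drops the trailing newline that readline() keeps
    line = fo.readline().strip()
    if line:
        # note the space after --remote-name; the quotes protect
        # special characters such as & in the URL
        os.system('curl --remote-name "' + line + '"')
fo.close()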
   


Instead of hard-coding 67 you can also loop until the end of the file (EOF). I used 67 because I copied the links from an Excel file, where the row numbers showed me there were 67 links.
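The EOF idea can be sketched as follows. This is only a rough sketch, not the script I actually ran: the function names read_links, build_command and download_all are made up for illustration, and it uses the standard subprocess module instead of os.system so that each URL is passed to curl as a single argument. Unlike the script above, it does not skip the first line of the file.

```python
import subprocess


def read_links(path):
    """Return every non-empty line of the file, stripped of whitespace."""
    with open(path, "r") as handle:
        # Iterating over the file object stops by itself at EOF,
        # so no link count has to be hard-coded.
        return [line.strip() for line in handle if line.strip()]


def build_command(url):
    """Build the curl call as a list, so the URL stays one argument."""
    return ["curl", "--remote-name", url]


def download_all(path, runner=subprocess.call):
    """Run curl once per link and return how many links were attempted.

    `runner` is a hypothetical hook: pass a different callable to do a
    dry run without touching the network.
    """
    links = read_links(path)
    for url in links:
        runner(build_command(url))
    return len(links)
```

Calling download_all("data.txt") would then try every link in the file, however many there are. Passing runner=print instead shows the commands without downloading anything.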


And when I typed those magic words

     python FILENAME.py

VOILA!!!
All files downloaded! BINGO!!

Feel free to comment, criticize, or give suggestions on this problem, which was solved with a few lines of Python code!
                                                    


 
