Doing HTTP requests? Always use the 'requests' library, not 'urllib'
I wish someone had told me that when I started web scraping with Python.
A few days ago, I decided to read up on proxies in more detail: what kinds of proxies there are, what the difference is between a residential proxy and a datacenter one, and so on.
I found an article called “My Problems (and Solutions to) Scraping LOTS of Data” by Zach Burchill.
At some point, Zach says the following:
“For anything using HTTP requests, always use the requests Python package instead of urllib or any of its descendants.” — Zach
Right after reading that sentence, I had flashbacks to all the times I tried to debug simple GET requests made with ‘urllib’.
Web scraping was my first interaction with Python, and I never paid attention to which package I was using. When you are just starting out, ‘urllib’ and ‘requests’ can look practically the same. In reality, ‘urllib’ has a unique talent for breaking out of the blue.
Every time, the solution was simply to make the same request with the ‘requests’ package.
So, let it be a no-brainer for you: ditch ‘urllib’ when you need to make HTTP requests.
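To show what that looks like in practice, here is a minimal sketch of a typical scraping request with ‘requests’ (the URL and query parameters are hypothetical, just for illustration):

```python
import requests

# Hypothetical target URL for illustration
url = "https://example.com/search"

# requests builds and percent-encodes the query string for you,
# so this fetches https://example.com/search?q=web+scraping
resp = requests.get(url, params={"q": "web scraping"}, timeout=10)

# Turn 4xx/5xx responses into an exception instead of a silent failure
resp.raise_for_status()

print(resp.status_code)   # HTTP status code, e.g. 200
print(resp.text[:100])    # first 100 characters of the response body
```

Note the explicit `timeout`: unlike the defaults you often end up with in ‘urllib’ code, it keeps a hung server from stalling your scraper forever.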
P.S. ‘urllib’ is still quite decent for other tasks, such as encoding query strings and breaking a URL down into its components.