
Thursday, January 16, 2014

urllib3, the library used by the Python requests library


By Vasudev Ram



While checking out a tool that uses the requests HTTP library for Python, I happened to see that requests itself uses a library called urllib3 internally. (Here is urllib3 on PyPI.)

Since I had requests installed, I searched for filenames matching urllib3* under Python's lib/site-packages and found the package there, in the directory:

requests/packages/urllib3
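
The same lookup can be done from Python instead of searching site-packages by hand. Below is a minimal sketch, written for Python 3, that asks the standard import machinery where a module would be loaded from. The helper name locate_package is made up for this illustration, and what it reports for requests.packages.urllib3 will depend on which version of requests (if any) is installed, so it returns None when nothing is found.

```python
import importlib.util


def locate_package(name):
    """Return the directory (or file) a module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no spec.
        return None
    if spec is None:
        return None
    if spec.submodule_search_locations:
        # A package: report the first directory it is loaded from.
        return list(spec.submodule_search_locations)[0]
    # A plain module: report its file (or a marker such as "built-in").
    return spec.origin


if __name__ == "__main__":
    for name in ("requests.packages.urllib3", "urllib3", "json"):
        print(name, "->", locate_package(name))
```

Using find_spec avoids hard-coding the site-packages path, which differs between operating systems and virtual environments.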

I also searched the Net and found this article by Kenneth Reitz, creator of the requests library:

Major Progress for Requests

in which he mentions collaborating with the creator of urllib3 to make use of it in requests.

urllib3 seems to have a good set of features, some of which are:

[
Re-use the same socket connection for multiple requests (HTTPConnectionPool and HTTPSConnectionPool) (with optional client-side certificate verification).

File posting (encode_multipart_formdata).

Built-in redirection and retries (optional).

Supports gzip and deflate decoding.

Thread-safe and sanity-safe.

Tested on Python 2.6+ and Python 3.2+, 100% unit test coverage.

Small and easy to understand codebase perfect for extending and building upon. For a more comprehensive solution, have a look at Requests which is also powered by urllib3.
]
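
To make the "gzip and deflate decoding" item concrete: a server may compress the response body and say so in the Content-Encoding header, and the client then has to undo that compression. The following is a rough sketch in Python 3, using only the standard library, of what such decoding involves. It is not urllib3's own code, and the function name decode_body is invented here.

```python
import gzip
import zlib


def decode_body(body, content_encoding):
    """Undo gzip or deflate compression of an HTTP response body (bytes)."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS)
    if encoding == "deflate":
        try:
            # The standard form: a zlib-wrapped deflate stream.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib wrapper.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # No (or unrecognised) encoding: hand the bytes back unchanged.
    return body


if __name__ == "__main__":
    page = b"<html><body>Hello</body></html>" * 10
    print(decode_body(gzip.compress(page), "gzip") == page)
    print(decode_body(zlib.compress(page), "deflate") == page)
```

A library that does this for you means the calling code can treat the response data as the original page, whatever the server chose to send on the wire.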

So after checking the urllib3 docs a bit, I wrote a small program to test urllib3 by using it to download the home page of my web site, dancingbison.com:

# try_urllib3.py
# A program to try basic usage of the urllib3 Python library.

from requests.packages import urllib3

http = urllib3.PoolManager()
r = http.request('GET', 'http://dancingbison.com/index.html')

# Parenthesised single-argument form works on both Python 2 and Python 3.
print("r.status: %s" % r.status)
print("r.data: %s" % r.data)

# r.data is a byte string, so write the file in binary mode.
with open("dancingbison_index.html", "wb") as out_fil:
    out_fil.write(r.data)

It worked: the program downloaded index.html and saved it as dancingbison_index.html.

Interestingly, urllib3 itself uses httplib under the hood. So it's turtles at least 3 levels down ... :-)
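
To get a feel for that bottom layer, here is a small sketch that talks to httplib directly. It is written for Python 3, where httplib is named http.client, and it fetches from a throwaway local server instead of a real site so that it runs offline. The handler class, the X-Client-Port header and the helper name fetch_twice are all invented for this example. The server echoes back the client's port number, so seeing the same port twice suggests both requests travelled over one socket, which is the kind of re-use urllib3's connection pools take care of.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for a web site; serves the same page for every GET."""

    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        # Hypothetical header: report which client port this request came from.
        self.send_header("X-Client-Port", str(self.client_address[1]))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_twice():
    """Send two GETs over one connection; return (status, client_port, body) tuples."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    results = []
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        for _ in range(2):
            conn.request("GET", "/index.html")
            resp = conn.getresponse()
            port = resp.getheader("X-Client-Port")
            # The body must be read fully before the connection can be reused.
            results.append((resp.status, port, resp.read()))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for status, port, data in fetch_twice():
        print("status:", status, "client port:", port, "bytes:", len(data))
```

At this level the caller has to open, reuse and close the connection by hand; the layers above it exist largely to hide that bookkeeping.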

