司开星's Blog

A Comparison of Asynchronous Network Request Libraries for Python 2

Wrapper Libraries

grequests

Introduction

An asynchronous request library built on gevent, written by the author of requests.

URL

https://github.com/kennethreitz/grequests

Example

import grequests

urls = [
    'http://www.heroku.com',
    'http://python-tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://fakedomain/',
    'http://kennethreitz.com'
]
rs = (grequests.get(u) for u in urls)
grequests.map(rs)

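grequests.get takes the same parameters as requests.get(), and post, put, delete and the other HTTP methods are supported as well.
A failed request produces None in the result list by default. Pass the exception_handler parameter to map to specify a callback for failed requests.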
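
As an illustration of the exception_handler parameter, here is a minimal sketch rather than a definitive recipe. The names record_failure, fetch_all, failures and pool_size are hypothetical and do not come from grequests. The sketch assumes the request object passed to the callback exposes a url attribute. grequests is imported inside the function only so that the callback can be tried on its own; a script would normally import it at the top.

```python
# Collected (url, exception) pairs for requests that failed.
failures = []


def record_failure(request, exception):
    """Callback for grequests.map: remember which URL failed and why."""
    failures.append((request.url, exception))


def fetch_all(urls, pool_size=3):
    """Fetch all urls, at most pool_size at a time (hypothetical helper)."""
    import grequests  # third-party; imported lazily for this sketch only

    pending = (grequests.get(u, timeout=5) for u in urls)
    # size throttles concurrency; exception_handler is called per failure.
    return grequests.map(pending, size=pool_size,
                         exception_handler=record_failure)
```

Calling fetch_all with the urls list from the example above should leave an entry for 'http://fakedomain/' in failures, assuming that host does not resolve.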

map accepts the following parameters:

:param requests: a collection of Request objects.
:param stream: If True, the content will not be downloaded immediately.
:param size: Specifies the number of requests to make at a time. If None, no throttling occurs.
:param exception_handler: Callback function, called when exception occurred. Params: Request, Exception
:param gtimeout: Gevent joinall timeout in seconds. (Note: unrelated to requests timeout)

To get a generator back instead of a list, use imap. Note that imap's size defaults to 2 and that it has no gtimeout parameter.
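
The generator behaviour of imap can be sketched as follows. The helper names summarize and fetch_lazily and the concurrency parameter are hypothetical. The None check is a defensive assumption rather than documented behaviour. Responses are consumed as the generator produces them, so nothing is held in a list.

```python
def summarize(responses):
    """Lazily yield (url, status_code) for each response that is not None."""
    for response in responses:
        if response is not None:
            yield response.url, response.status_code


def fetch_lazily(urls, concurrency=2):
    """Return a generator of (url, status_code) pairs (hypothetical helper)."""
    import grequests  # third-party; imported lazily for this sketch only

    pending = (grequests.get(u, timeout=5) for u in urls)
    # imap returns a generator; size caps the number of requests in flight.
    return summarize(grequests.imap(pending, size=concurrency))
```

Iterating over fetch_lazily(urls) in a for loop then handles each result as soon as it is available.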

requests-threads

Introduction

Another asynchronous request module by the author of requests, built on the Twisted framework.

URL

https://github.com/requests/requests-threads

Example

from twisted.internet.defer import inlineCallbacks
from twisted.internet.task import react
from requests_threads import AsyncSession

session = AsyncSession(n=100)

@inlineCallbacks
def main(reactor):
    responses = []
    for i in range(100):
        responses.append(session.get('http://httpbin.org/get'))

    for response in responses:
        r = yield response
        print(r)

if __name__ == '__main__':
    react(main)

txrequests

Introduction

Another asynchronous requests module built on the Twisted framework.

URL

https://github.com/tardyp/txrequests

Example

from txrequests import Session
from twisted.internet import defer

@defer.inlineCallbacks
def main():
    # use a with statement to clean up the session's thread pool and connection pool after use
    # you can also call session.close() yourself if you want to keep the session for long-term use
    with Session() as session:
        # first request is started in the background
        d1 = session.get('http://httpbin.org/get')
        # second request is started immediately
        d2 = session.get('http://httpbin.org/get?foo=bar')
        # wait for the first request to complete, if it hasn't already
        response_one = yield d1
        print('response one status: {0}'.format(response_one.status_code))
        print(response_one.content)
        # wait for the second request to complete, if it hasn't already
        response_two = yield d2
        print('response two status: {0}'.format(response_two.status_code))
        print(response_two.content)

treq

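Introduction

An asynchronous network request library built on Twisted that aims to provide an interface similar to requests. Its documentation is relatively thorough.

URL

https://github.com/twisted/treq

Example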

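import treq
from twisted.internet.task import react

def print_response(response):
    print(response.code)

def main(reactor, *args):
    d = treq.get('http://httpbin.org/get')
    d.addCallback(print_response)
    return d

react(main)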

trip

Introduction

An asynchronous network request library built on Tornado.

URL

https://github.com/littlecodersh/trip

Example

import trip

@trip.coroutine
def main():
    r = yield trip.get('https://httpbin.org/get', auth=('user', 'pass'))
    print(r.content)

trip.run(main)

Low-Level Libraries

gevent

Introduction

The most commonly used asynchronous library in Python 2.

URL

https://github.com/gevent/gevent

Example

import gevent
from gevent import socket
urls = ['www.google.com', 'www.example.com', 'www.python.org']
jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
gevent.joinall(jobs, timeout=2)
# job.value is None for any lookup that failed or did not finish in time
print([job.value for job in jobs])

Monkey patching makes existing blocking network code asynchronous automatically:

from gevent import monkey
monkey.patch_socket()
import urllib2
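
To show what the patch buys you, the following sketch fetches several pages concurrently with otherwise ordinary urllib2 code. The helper names collect and fetch_sizes are hypothetical, and the Python 3 fallback import is an addition for portability, not part of the original snippet. The patch is applied inside the function only to keep the sketch self-contained; in a real program it should run at the very top of the entry-point module, as in the snippet above.

```python
def collect(jobs):
    """Return the values of the jobs that finished without raising."""
    return [job.value for job in jobs if job.successful()]


def fetch_sizes(urls, timeout=5):
    """Fetch urls concurrently; return (url, body_length) pairs."""
    from gevent import monkey
    monkey.patch_socket()  # socket calls now yield to other greenlets
    import gevent
    try:
        import urllib2 as urlrequest  # Python 2
    except ImportError:
        import urllib.request as urlrequest  # Python 3 fallback (assumption)

    def fetch(url):
        body = urlrequest.urlopen(url, timeout=timeout).read()
        return url, len(body)

    jobs = [gevent.spawn(fetch, url) for url in urls]
    # Wait for all greenlets, but give up after the overall deadline.
    gevent.joinall(jobs, timeout=timeout * 2)
    return collect(jobs)
```

Only patch_socket is applied here, so the sketch is meant for plain http URLs; https would also need the ssl module patched, for example with monkey.patch_all().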

twisted

pass

tornado

pass

To be continued