python - Throttling requests with multiple proxies


I'm assigning random proxies to requests via a custom middleware. I'd like to key download throttling off the specific proxy a request is using but, as far as I can tell, out of the box this is only possible when tied to domains or IPs. I'm worried that implementing pooling logic in the proxy middleware will cause thread safety issues. Has anyone done this before? Any pointers would be appreciated.

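As recommended on the Scrapy mailing list, there is a special request meta key, called download_slot, that Scrapy's downloader and the AutoThrottle extension obey. It allows programmatic grouping and throttling of requests: requests that share a download_slot value are throttled together.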

In the custom proxy middleware:

    self.proxies = get_proxies()  # list of proxies
    proxy_address = random.choice(self.proxies)
    request.meta['proxy'] = proxy_address
    request.meta['download_slot'] = hash(proxy_address) % max_concurrent_requests

I use the hash function as a cheap way to bucket requests into an externally defined limit on the number of slots.
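
To make the bucketing concrete, here is a minimal sketch of how the snippet above might sit inside a complete middleware class. The class name `RandomProxySlotMiddleware`, the `PROXIES` list, and `MAX_SLOTS` are all hypothetical names chosen for illustration. The sketch also swaps the built-in `hash()` for `zlib.crc32`, an assumption made so that slot names stay the same between runs (Python randomizes `hash()` for strings per process by default). A plain object with a `meta` dict stands in for a Scrapy request so the example runs without Scrapy installed.

```python
import random
import zlib
from types import SimpleNamespace

# Hypothetical values for illustration; neither appears in the original post.
PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"]
MAX_SLOTS = 2


class RandomProxySlotMiddleware:
    """Shaped like a Scrapy downloader middleware: sets a proxy and a slot."""

    def __init__(self, proxies, max_slots):
        self.proxies = list(proxies)
        self.max_slots = max_slots

    def slot_for(self, proxy_address):
        # crc32 is stable across runs, unlike hash() on str, which is
        # randomized per process unless PYTHONHASHSEED is fixed.
        bucket = zlib.crc32(proxy_address.encode("utf-8")) % self.max_slots
        return f"proxy-slot-{bucket}"

    def process_request(self, request, spider):
        proxy_address = random.choice(self.proxies)
        request.meta["proxy"] = proxy_address
        # Requests sharing a download_slot value are throttled as one group.
        request.meta["download_slot"] = self.slot_for(proxy_address)
        return None  # let the request continue through the middleware chain


# Stand-in for a Scrapy Request: only the .meta dict matters here.
middleware = RandomProxySlotMiddleware(PROXIES, MAX_SLOTS)
request = SimpleNamespace(meta={})
middleware.process_request(request, spider=None)
print(request.meta)
```

Because the slot is derived only from the proxy address, every request sent through a given proxy lands in the same slot. With fewer slots than proxies, several proxies share a slot and its throttling budget, which is the trade-off of bucketing with a modulus.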

