Multiprocessing in Python (multiprocessing for Python crawlers)

User submission, 2022-08-24



1. The usage I have come to prefer recently:

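import time
from concurrent.futures import ProcessPoolExecutor, as_completed

# process_function and file_list are expected to be defined elsewhere.
ex = ProcessPoolExecutor(max_workers=50)
# Submit one task per file; each submit() returns a Future immediately.
tasks = [ex.submit(process_function, file) for file in file_list]
results = []
finish_num = 0
start_time = time.time()
# as_completed yields futures in the order they finish, not the order submitted.
for future in as_completed(tasks):
    results.append(future.result())
    used_time = time.time() - start_time
    finish_num += 1
    # Report progress every 100 finished tasks.
    if finish_num % 100 == 0:
        print('{}/{}, used {:.3}h, {:.3}s/line, ETS:{:.3}h'.format(
            finish_num, len(file_list), used_time / 3600,
            used_time / finish_num,
            used_time / finish_num * (len(file_list) - finish_num) / 3600))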
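The progress line in the loop above packs three calculations into one format call: elapsed hours, average seconds per finished task, and estimated remaining hours (average cost times the number of unfinished tasks). A minimal sketch that pulls this arithmetic into a helper may make it easier to check; `progress_stats` and `progress_line` are hypothetical names introduced only for illustration, and the numbers in the usage line are made up.

```python
def progress_stats(finished, total, elapsed_s):
    """Return (elapsed hours, seconds per task, estimated remaining hours)."""
    # Average wall-clock cost of one finished task so far.
    per_task_s = elapsed_s / finished
    # Assume the unfinished tasks cost the same on average.
    remaining_h = per_task_s * (total - finished) / 3600
    return elapsed_s / 3600, per_task_s, remaining_h


def progress_line(finished, total, elapsed_s):
    """Format the statistics in the same style as the loop above."""
    used_h, per_task_s, remaining_h = progress_stats(finished, total, elapsed_s)
    return '{}/{}, used {:.3}h, {:.3}s/task, ETA {:.3}h'.format(
        finished, total, used_h, per_task_s, remaining_h)


if __name__ == '__main__':
    # Illustrative numbers: 100 of 400 tasks done after 7200 seconds.
    print(progress_line(100, 400, 7200.0))
```

Inside the `as_completed` loop, the helper would be called with `finish_num`, `len(file_list)`, and `used_time`. The estimate is only as good as the assumption that remaining tasks cost about the same as finished ones.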

2. Using Pool

#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
@author: Shiyu Huang
@contact: huangsy13@gmail.com
@file: asy_test.py
'''
import multiprocessing
import time
from multiprocessing import Pool


def test_func(x):
    x = x + 1
    print(x)
    time.sleep(3)  # simulate slow work
    print('end')
    return x


if __name__ == '__main__':
    test_size = 20
    number = list(range(test_size))
    # One worker process per CPU core.
    pool = Pool(processes=multiprocessing.cpu_count())
    # Alternative: submit tasks one at a time (args must be a tuple).
    # for i in range(test_size):
    #     pool.apply_async(test_func, (number[i],))
    # map blocks until every task is done and returns results in input order.
    resultList = pool.map(test_func, number)
    pool.close()
    pool.join()
    print()
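The commented-out lines in the script above point at `apply_async`, which submits one call at a time and returns a handle whose `get()` yields the result. Its positional arguments are passed as a tuple, so a single argument needs a trailing comma: `(x,)`, not `(x)`. A minimal sketch under those assumptions follows; `increment` and `run_with_apply_async` are hypothetical names (the former stands in for `test_func` without the sleep), and the pool size of 2 is arbitrary.

```python
from multiprocessing import Pool


def increment(x):
    # Hypothetical stand-in for test_func, without the sleep or prints.
    return x + 1


def run_with_apply_async(values, processes=2):
    """Submit one task per value and collect results in submission order."""
    with Pool(processes=processes) as pool:
        # args must be a tuple: (v,) is a one-element tuple, (v) is just v.
        handles = [pool.apply_async(increment, (v,)) for v in values]
        # get() blocks until that particular task has finished.
        return [h.get() for h in handles]


if __name__ == '__main__':
    print(run_with_apply_async(list(range(5))))
```

Compared with `pool.map`, this form lets each task take different arguments or a different function, at the cost of managing the result handles yourself.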

Shiyu Huang (黄世宇), personal page: https://huangshiyu13.github.io/

