Web Scraping Tutorial (9): Scraping JK Girl Avatars with Regular Expressions, Don't Scroll Past!

User submission 294 2022-08-30



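import requests
import re
import urllib.request
import time
import os

header = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36'
}

# The target URL is missing from this copy of the article.
url = ''

# Reconstructed step (lost together with the URL): fetch the page HTML into c.
c = requests.get(url, headers=header).text

# The HTML tags that originally surrounded this pattern were stripped from this copy;
# what remains captures the value of every src="..." attribute.
pattern = re.compile(r'.*?src="(.*?)".*?', re.S)
items = re.findall(pattern, c)
# print(items)

os.makedirs('E://photo/', exist_ok=True)

# List every image URL that was found.
for a in items:
    print(a)

for a in items:
    print("Downloading image: " + a)
    b = a.split('/')[-1]  # last path segment of the URL (not used below)
    # Name each file after the current Unix timestamp; the 2-second pause keeps names unique.
    urllib.request.urlretrieve(a, 'E://photo/' + str(int(time.time())) + '.jpg')
    print(a + '.jpg')
    time.sleep(2)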
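To see what the regular expression in the script above does without touching the network, here is a minimal sketch run against a made-up HTML fragment. The sample markup, the `<div class="item">` wrapper, and the helper name `extract_image_urls` are all assumptions for illustration; the tags used on the original page are not preserved in this text.

```python
import re

# Hypothetical sample markup; the real page structure is not shown in the article.
SAMPLE_HTML = '''
<div class="item">
  <img alt="a" src="https://example.com/img/one.jpg">
</div>
<div class="item">
  <img alt="b"
       src="https://example.com/img/two.jpg">
</div>
'''

# re.S lets "." match newlines, so one match can span several lines of HTML.
# The lazy ".*?" stops at the first src="..." inside each wrapper block.
PATTERN = re.compile(r'<div class="item">.*?src="(.*?)".*?</div>', re.S)


def extract_image_urls(html):
    """Return the src value captured from each matching block."""
    return PATTERN.findall(html)


if __name__ == '__main__':
    for url in extract_image_urls(SAMPLE_HTML):
        print(url)
```

Because the pattern has exactly one capture group, `findall` returns a plain list of the captured `src` strings rather than whole matches.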

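import os
import re
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0'
                  '; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/70.'
                  '0.3538.25 Safari/537.36 Core/1.'
                  '70.3732.400 QQBrowser/10.5.3819.400'
}


def url_reques(url):
    # Return the page body as text (HTML).
    return requests.get(url, headers=headers).text


def img_reques(url):
    # Return the raw bytes of an image.
    return requests.get(url, headers=headers).content


def main():
    # Make sure the output directory exists before writing into it.
    os.makedirs('./image', exist_ok=True)
    for i in range(1, 17):
        print(f'Scraping page {i}')
        # The beginning of the URL below (everything before "+ 37 * i}") and the
        # HTML tags around the regex are missing from this copy of the article,
        # so the f-string is left exactly as it survived.
        for img_url in re.findall(r'.*?src="(.*?)".*?', url_reques(f'+ 37 * i}&count=35&relp=35&cw=1177&ch=705&tsc=ImageBasicHover&datsrc=I&layout=RowBased&mmasync=1&dgState=x*0_y*0_h*0_c*5_i*{1 + 35 * i}_r*{6 * i}&IG=9BB720932F484381A6E28F2ECA3791C6&SFX={i}&iid=images.5530')):
            # Use a fixed slice of the image URL as the file name.
            with open('./image/' + img_url[38:64] + '.jpg', 'wb') as f:
                f.write(img_reques(img_url))


if __name__ == '__main__':
    main()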
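The script above names each file with a fixed slice of the URL (`img_url[38:64]`), which only works while every image URL has the same layout. One possible alternative, sketched here on the assumption that any unique, filesystem-safe name is acceptable, is to hash the URL. The helper name `filename_for` and its `default_ext` parameter are hypothetical and not part of the original script.

```python
import hashlib
import os
from urllib.parse import urlparse


def filename_for(img_url, default_ext='.jpg'):
    """Build a stable, filesystem-safe file name from an image URL."""
    # Hash the full URL so that distinct URLs get distinct names;
    # 16 hex characters are kept to keep the name short.
    digest = hashlib.sha256(img_url.encode('utf-8')).hexdigest()[:16]
    # Reuse the extension from the URL path when it has one,
    # otherwise fall back to the default.
    ext = os.path.splitext(urlparse(img_url).path)[1] or default_ext
    return digest + ext


if __name__ == '__main__':
    print(filename_for('https://example.com/a/b.png?x=1'))
```

In the loop, `'./image/' + filename_for(img_url)` could then stand in for the sliced name; the same URL always maps to the same file, so re-running the script overwrites rather than duplicates.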


