Python Crawler: Scraping iQiyi VIP Videos (Is Python or Java More Worth Learning?)

Reader submission 582 2022-08-22



I. Third-party libraries

requests >>> pip install requests   sends HTTP requests to access websites

tqdm >>> pip install tqdm    progress-bar module

II. Development environment

Version: Python 3.8

Editor: PyCharm 2021.2

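III. Module installation problems

Press Win + R, type cmd, then run the install command pip install <module name> (if the download is slow, you can switch to a mirror in China)

Module installation problems:

- How to install third-party Python modules:

- Reasons an installation fails:

- Failure 1: pip is not recognized as an internal command

Solution: add Python to the PATH environment variable

- Failure 2: a wall of red error text (read timed out)

Solution: the network connection timed out, so switch to a mirror

Tsinghua mirror: pip install -i <mirror index URL> <module name>

- Failure 3: cmd reports the module is already installed, or was installed successfully, but PyCharm still cannot import it

Solution: you may have several Python versions installed (keep either Anaconda or plain Python, not both); uninstall one of them

Alternatively, the Python interpreter in PyCharm is not configured correctly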
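Failures 2 and 3 can both be sidestepped by calling pip through the interpreter you actually run, with the mirror passed explicitly. The sketch below is only an illustration: the helper names are hypothetical, and the mirror address is an assumption (the publicly documented Tsinghua TUNA index), since the article's own URL did not survive.

```python
import subprocess
import sys

# Assumption: the publicly documented Tsinghua (TUNA) PyPI index.
# The article's own mirror URL was lost, so this value is not from the source.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def build_pip_command(package, index_url=None):
    """Return the pip command as a list, bound to the running interpreter."""
    # sys.executable is the Python that runs this script, so the module is
    # installed where this interpreter will look for it.
    command = [sys.executable, "-m", "pip", "install", package]
    if index_url:
        command += ["-i", index_url]
    return command


def install(package, index_url=None):
    """Run pip for the current interpreter and raise if it fails."""
    subprocess.run(build_pip_command(package, index_url), check=True)
```

Calling install("requests", TSINGHUA_INDEX) from a script run inside PyCharm installs the module into the interpreter PyCharm is configured to use, which is the situation Failure 3 describes.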

IV. Configuring the Python interpreter in PyCharm

1. Choose File >>> Settings >>> Project >>> Python Interpreter

3. Add the path of your Python installation

V. Installing plugins in PyCharm

1. Choose File >>> Settings >>> Plugins

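VI. Basic approach to the crawler

Scraping video

m3u8: a playlist format for streaming video

The video is cut into ts segments, each with its own URL; the m3u8 URL returns the list of all ts segment URLs

This saves bandwidth

mp4: the whole video is served from a single URL on the video site

Segmented streaming takes load off the server

Implementing a video crawler

Analyze the data source (find the m3u8 URL)

1. Send the request (access the URL)

2. Get the data

3. Parse the data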
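To make the relationship between an m3u8 playlist and its ts segments concrete, here is a minimal sketch. The playlist text and the example.com addresses are made up for illustration, and segment_urls is a hypothetical helper: it keeps every line that is not a # tag and resolves relative entries against the playlist's own URL.

```python
from urllib.parse import urljoin

# Made-up playlist for illustration; a real one is returned by the site.
SAMPLE_M3U8 = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
https://example.com/video/seg-0.ts
#EXTINF:10.0,
seg-1.ts
#EXT-X-ENDLIST
"""


def segment_urls(playlist_text, base_url=""):
    """Return the ts segment URLs listed in an m3u8 playlist, in order."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are playlist tags or comments, not segments.
        if not line or line.startswith("#"):
            continue
        # Relative entries are resolved against the playlist URL;
        # absolute entries pass through unchanged.
        urls.append(urljoin(base_url, line))
    return urls
```

For example, segment_urls(SAMPLE_M3U8, "https://example.com/video/index.m3u8") yields the two segment URLs under https://example.com/video/.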
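The parse step can be sketched on its own. The complete code in this article reads json_data['data']['program']['video'][1]['m3u8'], which assumes the playlist always sits in the second entry of the video list. The hypothetical helper below assumes the same data > program > video layout but scans for the first entry that actually carries an m3u8 field; the sample response and its bid field are invented for illustration.

```python
# Invented response with the same nesting as the article's code expects.
SAMPLE_RESPONSE = {
    "data": {
        "program": {
            "video": [
                {"bid": 100},
                {"bid": 300, "m3u8": "#EXTM3U\nhttps://example.com/seg-0.ts\n"},
            ]
        }
    }
}


def find_m3u8(json_data):
    """Return the first non-empty 'm3u8' value under data.program.video, or None."""
    videos = json_data.get("data", {}).get("program", {}).get("video", [])
    for entry in videos:
        playlist = entry.get("m3u8")
        if playlist:
            return playlist
    return None
```

Returning None instead of raising lets the caller decide what to do when the response has no playlist, for example because the login cookie has expired.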

VII. Complete code

import re

import requests
from tqdm import tqdm

headers = {
    # Session cookie of a logged-in account; copy your own from the browser.
    'cookie': '<your iQiyi session cookie>',
    # The origin and referer values were lost from this post; copy them from
    # the same request in your browser's developer tools.
    'origin': '',
    'referer': '',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36',
}

# The request URL was also lost from this post; use the address of the
# request whose JSON response contains the m3u8 data.
url = ''

# 1. Send the request and 2. get the data as JSON
response = requests.get(url=url, headers=headers)
json_data = response.json()

# 3. Parse the data: index 1 selects the second entry of the video list
m3u8 = json_data['data']['program']['video'][1]['m3u8']

# Drop the playlist tags (#EXTM3U, #EXTINF, ...) so only segment URLs remain
ts_list = re.sub('#E.*', '', m3u8)
ts_list = ts_list.split()

# Download every ts segment and append it to one output file
with open('远山淡影.mp4', mode='ab') as f:
    for ts in tqdm(ts_list):
        ts_data = requests.get(ts).content
        f.write(ts_data)
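Because the code above opens the output file in append mode ('ab'), running it a second time adds another copy of the video to the same file. The sketch below is one possible variation, not the author's method: merge_segments and http_fetch are hypothetical names, the file is opened once in 'wb' mode, and the download function is passed in so the merge logic can be exercised without network access.

```python
from typing import Callable, Iterable


def merge_segments(urls: Iterable[str], out_path: str,
                   fetch: Callable[[str], bytes]) -> int:
    """Fetch each segment with `fetch` and write them, in order, to one file.

    Returns the total number of bytes written.
    """
    written = 0
    # 'wb' truncates the file first, so a re-run replaces the old download
    # instead of appending to it.
    with open(out_path, "wb") as out_file:
        for url in urls:
            chunk = fetch(url)
            out_file.write(chunk)
            written += len(chunk)
    return written


def http_fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download one segment with requests, raising on HTTP error statuses."""
    import requests  # imported lazily so merge_segments works offline

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content
```

With the ts_list from the code above, merge_segments(tqdm(ts_list), '远山淡影.mp4', http_fetch) keeps the progress bar while failing early on a bad response rather than writing an error page into the video file.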

