Python, Scrapy, Pipeline: function "process_item" is not being called (Scrapy pipeline does not run)

User submission | 542 | 2022-08-30



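I have a very simple piece of code, shown below. The scraping itself works: I can see all the print statements producing the correct data. In the pipeline, initialization also works. However, the process_item function is never called, because the print statement at the start of that function never executes.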

Spider: comosham.py

    import scrapy
    from scrapy.spider import Spider
    from scrapy.selector import Selector
    from scrapy.http import Request
    from activityadvisor.items import ComoShamLocation
    from activityadvisor.items import ComoShamActivity
    from activityadvisor.items import ComoShamRates
    import re


    class ComoSham(Spider):
        name = "comosham"
        allowed_domains = ["comoshambhala.com"]
        # The URLs are elided in the source.
        start_urls = [ " " " " ]

        def parse(self, response):
            category = (response.url)[39:44]
            print 'in parse'
            if category == 'class':
                pass
                """self.gen_req_class(response)"""
            elif category == 'about':
                print 'about to call parse_location'
                self.parse_location(response)
            elif category == 'rates':
                pass
                """self.parse_rates(response)"""
            else:
                print 'Cant find appropriate category! check check check!! Am raising Level 5 ALARM - You are a MORON :D'

        def parse_location(self, response):
            print 'in parse_location'
            item = ComoShamLocation()
            item['category'] = 'location'
            loc = Selector(response).xpath('((//div[@id = "node-2266"]/div/div/div)[1]/div/div/p//text())').extract()
            item['address'] = loc[2]+loc[3]+loc[4]+(loc[5])[1:11]
            item['pin'] = (loc[5])[11:18]
            item['phone'] = (loc[9])[6:20]
            item['fax'] = (loc[10])[6:20]
            item['email'] = loc[12]
            print item['address'],item['pin'],item['phone'],item['fax'],item['email']
            return item

Items file: [code not preserved]

Pipeline file:

    import csv


    class ComoShamPipeline(object):
        def __init__(self):
            # Python 2: csv files are opened in binary mode.
            self.locationdump = csv.writer(open('./scraped data/ComoSham/ComoshamLocation.csv','wb'))
            self.locationdump.writerow(['Address','Pin','Phone','Fax','Email'])

        def process_item(self,item,spider):
            print 'processing item now'
            if item['category'] == 'location':
                print item['address'],item['pin'],item['phone'],item['fax'],item['email']
                self.locationdump.writerow([item['address'],item['pin'],item['phone'],item['fax'],item['email']])
            else:
                pass
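Separately from the question being asked, the pipeline above never returns the item and never closes its file. A minimal Python 3 sketch of the same pipeline follows, using Scrapy's `open_spider`, `process_item` and `close_spider` hooks. The `path` constructor argument and its default value are assumptions added for illustration and are not part of the original code.

```python
import csv


class ComoShamPipeline:
    """Writes 'location' items to a CSV file (Python 3 sketch)."""

    # Hypothetical default path; the original writes under './scraped data/ComoSham/'.
    def __init__(self, path="ComoshamLocation.csv"):
        self.path = path
        self.file = None
        self.locationdump = None

    def open_spider(self, spider):
        # On Python 3 the csv module expects a text-mode file opened with newline="".
        self.file = open(self.path, "w", newline="", encoding="utf-8")
        self.locationdump = csv.writer(self.file)
        self.locationdump.writerow(["Address", "Pin", "Phone", "Fax", "Email"])

    def process_item(self, item, spider):
        if item["category"] == "location":
            self.locationdump.writerow(
                [item[key] for key in ("address", "pin", "phone", "fax", "email")]
            )
        # Return the item so that any later pipeline still receives it.
        return item

    def close_spider(self, spider):
        # Flush and release the file when the crawl ends.
        self.file.close()
```

Opening the file in `open_spider` rather than `__init__` keeps the open and close calls paired, so the CSV is flushed when the crawl finishes.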

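In the end, there turned out to be two main causes:

1. The pipeline is not enabled. Add the following line to settings.py:

    ITEM_PIPELINES = {'[YOUR_PROJECT_NAME].pipelines.[YOUR_PIPELINE_CLASS]': 300}

2. The spider does not hand the item back to Scrapy. Yield the item from the callback while the spider runs:

    yield my_item

Mine was the second cause. In the spider above, parse calls self.parse_location(response) but neither returns nor yields the result, so the item never reaches the pipeline. Once that was fixed, it worked immediately.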
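The second cause can be illustrated without Scrapy installed. The sketch below uses a hypothetical `run_spider` helper as a stand-in for the engine: it passes whatever the callback returns or yields on to a pipeline function. This is not Scrapy's actual implementation, only the same contract in miniature, and all function names here are made up for the illustration.

```python
def parse_location(response):
    # Stand-in for the item that the real callback builds.
    return {"category": "location", "source": response}


def parse_buggy(response):
    # Mirrors the question: the helper's return value is thrown away,
    # so this callback returns None and nothing reaches the pipeline.
    parse_location(response)


def parse_fixed(response):
    # Handing the item back (yield or return) lets the engine see it.
    yield parse_location(response)


def run_spider(callback, response, process_item):
    """Hypothetical mini-engine: pass each callback result to the pipeline."""
    result = callback(response)
    if result is None:
        return 0
    if isinstance(result, dict):
        result = [result]
    count = 0
    for item in result:
        process_item(item)
        count += 1
    return count


seen = []
print(run_spider(parse_buggy, "about", seen.append))  # 0
print(run_spider(parse_fixed, "about", seen.append))  # 1
```

In the question's spider, the corresponding change would be to write `return self.parse_location(response)` (or `yield` it) in the 'about' branch of parse. For the first cause, the settings entry would presumably name 'activityadvisor.pipelines.ComoShamPipeline', assuming the pipeline class lives in pipelines.py of the activityadvisor project.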

