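python_pickle (serialization and deserialization example): define a function named "wordcount" that counts how many times a keyword appears in a Chinese text, and use the pickle module to save the function wordcount to a file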

Reader submission 360 2022-08-30




problem:

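First, define a function named "wordcount" that counts how many times a given keyword appears in a Chinese text. The function prototype takes two parameters, w and txtfile, both of which are strings.

Next, in the folder that holds the materials for this experiment, use os.mkdir() to create a new folder named "mydir". At the same time, automatically identify every text file whose name starts with "news_" and move it into the new directory "mydir" (note: the move must be done programmatically).

Then, use the pickle module to combine the function wordcount and the names of all identified text files starting with "news_" into a list, and save that list permanently to the file "wc.pkl", stored inside the folder "mydir".

Finally, use the pickle module again to load the list saved in "wc.pkl", obtain the function wordcount from it, and call wordcount to count how many times the four keywords "中国" (China), "美国" (United States), "科技" (technology) and "芯片" (chip) appear in each text file starting with "news_". Print the output, using the reference format below.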
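The folder step can be tried on its own before touching the real materials. The following is a minimal sketch under stated assumptions: the helper name move_news_files and the demo file names are hypothetical, and a temporary folder stands in for the experiment folder. It uses os.path.join instead of string concatenation so that a missing trailing slash cannot break the paths.

```python
import os
import shutil
import tempfile


def move_news_files(src_dir, dest_name="mydir"):
    """Create src_dir/dest_name and move every file starting with news_ into it.

    Returns the sorted names of the files that were moved.
    (Hypothetical helper, not part of the original post.)
    """
    dest_dir = os.path.join(src_dir, dest_name)
    if not os.path.exists(dest_dir):
        os.mkdir(dest_dir)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for name in sorted(os.listdir(src_dir)):
        full_path = os.path.join(src_dir, name)
        if name.startswith("news_") and os.path.isfile(full_path):
            shutil.move(full_path, os.path.join(dest_dir, name))
            moved.append(name)
    return moved


# Demo in a throwaway folder; the file names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("news_1.txt", "news_2.txt", "log.txt"):
        with open(os.path.join(tmp, name), "w", encoding="gbk") as f:
            f.write("demo")
    moved_names = move_news_files(tmp)
    remaining = sorted(
        n for n in os.listdir(tmp) if os.path.isfile(os.path.join(tmp, n))
    )
    in_mydir = sorted(os.listdir(os.path.join(tmp, "mydir")))
```

Returning the moved names is a design choice: the same list can then be handed straight to pickle.dump, so the saved file names always match what was actually moved.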

result:

code:

import os
import pickle
import shutil

# Assumption: the original never defines path_string_fix. Here it is the folder
# holding the experiment materials (with a trailing slash); adjust as needed.
path_string_fix = "./"
path_src = path_string_fix
path_des = path_string_fix + "mydir/"

# Create the folder mydir inside the source folder.
if not os.path.exists(path_des):
    os.mkdir(path_src + "mydir")


def move_news():
    """Move every file whose name starts with news_ from path_src to path_des."""
    for file_name in os.listdir(path_src):
        if file_name.startswith("news_") and os.path.isfile(path_src + file_name):
            shutil.move(path_src + file_name, path_des)


def wordcount(w, txt_file):
    """Return how many times the word w appears in the file txt_file.

    The news files are encoded in GBK, so open() uses encoding="gbk"
    (gb18030 works too) to read them correctly.

    Args:
        w (str): the keyword to count.
        txt_file (str): path of the text file.
    """
    with open(txt_file, "r", encoding="gbk") as file_input_stream:
        string = file_input_stream.read()
    return string.count(w)


def pickle_deal(file_news_list):
    """Dump [wordcount, file_news_list] to mydir/wc.pkl, then load it back."""
    with open(path_des + "wc.pkl", "wb") as file_output_stream:
        pickle.dump([wordcount, file_news_list], file_output_stream)
    with open(path_des + "wc.pkl", "rb") as file_input_stream:
        return pickle.load(file_input_stream)


def print_head(word_list):
    """Print the header row: an empty corner cell, then one column per keyword."""
    for i in [""] + word_list:
        print(i.center(20), end="")
    print()


def print_result(word_list):
    """Print one row per news file with the count of each keyword."""
    print_head(word_list)
    for file in obj_list[1]:
        print(file.center(20), end="")
        file_full_path = path_des + file
        for word in word_list:
            frequency = wordcount(word, file_full_path)
            print(str(frequency).center(20), end="")
        print()


word_list = ["中国", "美国", "科技", "芯片"]
move_news()
# List the news_ files only after the move, so that the list is complete.
file_news_list = sorted(f for f in os.listdir(path_des) if f.startswith("news_"))
obj_list = pickle_deal(file_news_list)
# Get the function back from the pickled file.
wordcount = obj_list[0]
print_result(word_list)
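One caveat is worth keeping in mind when saving wordcount to "wc.pkl": pickle stores a function by reference (its module name plus its function name), not its code. The sketch below illustrates this under stated assumptions: the module name wc_tools is hypothetical and is written to a temporary folder only for the demonstration, and its wordcount counts inside a string rather than a file to keep the example short.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical helper module; the name wc_tools is an assumption for this demo.
MODULE_SOURCE = '''
def wordcount(w, text):
    """Count non-overlapping occurrences of w in text."""
    return text.count(w)
'''

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "wc_tools.py"), "w", encoding="utf-8") as f:
        f.write(MODULE_SOURCE)

    # Make the temporary module importable.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    wc_tools = importlib.import_module("wc_tools")

    payload = pickle.dumps([wc_tools.wordcount, ["news_1.txt", "news_2.txt"]])

    # The payload names the module and the function but holds none of the body.
    stores_name = b"wc_tools" in payload and b"wordcount" in payload
    stores_body = b"non-overlapping" in payload

    # While wc_tools is importable, loading gives back a working function.
    func, names = pickle.loads(payload)
    count = func("中国", "中国科技, 中国芯片")

    # Once the module can no longer be imported, the same bytes fail to load.
    sys.path.remove(tmp)
    del sys.modules["wc_tools"]
    try:
        pickle.loads(payload)
        load_failed = False
    except ImportError:
        load_failed = True
```

In practice this means "wc.pkl" is only useful to a script that can still import or define wordcount under the same module and name, as the script above does by loading the file in the same run that defines the function.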

