Using LightGBM and Other Models in Python (LightGBM Model)

User submission 788 2022-09-03



The packages you are likely to need while working through this exercise are:

import pandas as pd
import numpy
import warnings
from sklearn.preprocessing import scale
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from xgboost.sklearn import XGBClassifier
import lightgbm as lgb

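1. Separate X and Y

X holds the data features, and Y is the target: whether the loan ended up overdue. Overdue is 1, not overdue is 0.

2. Separate the feature values from the label values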

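# df stands for the name of your own table (DataFrame)
wxl_y = df['target']                 # labels
wxl_X = df.drop(columns=['target'])  # features
wxl_X = scale(wxl_X, axis=0)         # standardize each feature column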
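To make concrete what `scale(wxl_X, axis=0)` does to the features, here is a minimal pure-Python sketch of column-wise standardization: each column is shifted to mean 0 and divided by its population standard deviation. The helper name `standardize_columns` and the sample rows are made up for illustration, and this is not scikit-learn's own implementation.

```python
from statistics import fmean, pstdev

def standardize_columns(rows):
    """Standardize each column of a list of rows to mean 0 and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation (ddof=0); a constant column falls back
    # to 1.0 so that we never divide by zero (an assumption of this sketch).
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical rows: (age, monthly income)
rows = [[25, 3000.0], [35, 5000.0], [45, 7000.0]]
scaled = standardize_columns(rows)
for row in scaled:
    print(row)
```

After scaling, both columns are on the same footing even though income was originally about a hundred times larger than age. That matters for scale-sensitive models such as logistic regression and SVM.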

3. Split a large dataset into a training set and a test set

First import the pandas, re and numpy packages:
import pandas
import numpy
import re

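# Required import
from sklearn.model_selection import train_test_split

# Split into a training set and a test set
# (feature and target are the X and Y described in step 1)
X_train, X_test, y_train, y_test = train_test_split(feature, target, test_size=0.2)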
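A plain random split can leave the test set with a noticeably different share of overdue cases than the training set, especially when overdue cases are rare. `train_test_split` accepts `stratify=` and `random_state=` arguments to deal with this. The sketch below shows the underlying idea in plain Python; the function name `stratified_split` and the toy labels are hypothetical, and the rounding rule is a simplification rather than scikit-learn's exact behavior.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size=0.2, seed=2018):
    """Return (train_idx, test_idx) with each class split in the same ratio."""
    rng = random.Random(seed)          # fixed seed makes the split repeatable
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)              # shuffle within each class
        n_test = round(len(idxs) * test_size)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical target column: 10 overdue (1) and 40 not overdue (0)
labels = [1] * 10 + [0] * 40
train_idx, test_idx = stratified_split(labels)

test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), test_rate)
```

Both subsets keep the 20% overdue rate of the full data, so scores measured on the test set are not distorted by an unlucky draw.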

4. Predict with the LightGBM model and evaluate the result

import lightgbm as lgb

lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)

# LightGBM parameters; tune them to your own needs
params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'binary',
    'metric': ['l2', 'auc', 'binary_logloss'],  # 'l2' (letter l), not '12'
    'num_leaves': 40,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'verbose': 0,
    'is_unbalance': True,
}

# Training settings
# Early stopping is passed as a callback; LightGBM 4.0 removed the
# early_stopping_rounds argument of lgb.train used in older code.
gbm = lgb.train(
    params,
    lgb_train,
    num_boost_round=1000,
    valid_sets=[lgb_eval],
    callbacks=[lgb.early_stopping(stopping_rounds=100)],
)

# Prediction (the input must have the same format as the training data)
lgb_pre = gbm.predict(X_test)

# Evaluation
from sklearn.metrics import roc_auc_score
auc_score = roc_auc_score(y_test, lgb_pre)

# Save the model
gbm.save_model('whx19961212.txt')

# Load the model
import lightgbm as lgb
gbm = lgb.Booster(model_file='whx19961212.txt')
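With `'objective': 'binary'`, `gbm.predict` returns a probability for each row, and `roc_auc_score` measures how well those probabilities rank overdue cases above non-overdue ones. As a rough illustration of what the number means, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counting half. The function name `pairwise_auc` and the toy data are assumptions for this example; it is a quadratic-time teaching version, not the routine scikit-learn uses.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs that are ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = pairwise_auc(y_true, y_score)
print(auc)  # 0.75
```

Three of the four positive/negative pairs are ordered correctly, hence 0.75. A value of 0.5 corresponds to random ranking and 1.0 to a perfect one.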

5. Building the other models

lr = LogisticRegression(random_state=2018, tol=1e-6)        # logistic regression
tree = DecisionTreeClassifier(random_state=2018)            # decision tree
svm = SVC(probability=True, random_state=2018, tol=1e-6)    # SVM
forest = RandomForestClassifier(n_estimators=100, random_state=2018)  # random forest
Gbdt = GradientBoostingClassifier(random_state=2018)        # GBDT
Xgbc = XGBClassifier(random_state=2018)                     # XGBoost
gbm = lgb.LGBMClassifier(random_state=2018)                 # LightGBM

6. Building the scoring function

def muti_score(model):
    # Score one model with 5-fold cross-validation on several metrics
    warnings.filterwarnings('ignore')
    accuracy = cross_val_score(model, wxl_X, wxl_y, scoring='accuracy', cv=5)
    precision = cross_val_score(model, wxl_X, wxl_y, scoring='precision', cv=5)
    recall = cross_val_score(model, wxl_X, wxl_y, scoring='recall', cv=5)
    f1_score = cross_val_score(model, wxl_X, wxl_y, scoring='f1', cv=5)
    auc = cross_val_score(model, wxl_X, wxl_y, scoring='roc_auc', cv=5)
    print("Accuracy:", accuracy.mean())
    print("Precision:", precision.mean())
    print("Recall:", recall.mean())
    print("F1_score:", f1_score.mean())
    print("AUC:", auc.mean())

model_name = ["lr", "tree", "svm", "forest", "Gbdt", "Xgbc", "gbm"]
for name in model_name:
    model = eval(name)  # look up the model object by its variable name
    print(name)
    muti_score(model)
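The `'accuracy'`, `'precision'`, `'recall'` and `'f1'` scores requested above are all derived from the same four confusion-matrix counts. The sketch below shows how, using plain Python on a small set of hypothetical labels. The function name `binary_metrics` is made up for this example, and returning 0.0 when a denominator is zero is a choice made for this sketch.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # overdue, predicted overdue
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not overdue, predicted overdue
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # overdue, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not overdue, predicted so

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical true labels and predicted labels (1 = overdue)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

When overdue cases are rare, accuracy alone can look high even if the model misses most of them, which is why precision, recall and F1 are reported alongside it.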

