# Based on "魔镜杯风控算法实战(三)调参篇" (Mojing Cup Risk-Control Algorithms in Practice, Part 3: Hyperparameter Tuning)
https://zhuanlan.zhihu.com/p/104376738
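However, some of the parameters used in the original article have been deprecated in recent versions of lightgbm, and the `auc-mean` key has been renamed to `valid auc-mean`.
The code therefore needs a few small adjustments before it will run, so I wrote up my experience here for reference.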
1) First, tune:
num_boost_round (n_estimators)
Use lgb's built-in cv to compute it. Fix learning_rate at 0.1 first, so that iterations run faster while the other parameters are being tuned.
import time
import lightgbm as lgb
from lightgbm import log_evaluation, early_stopping

# X_train, y_train, X_test, y_test are assumed to be defined already
start = time.time()
lgb_train = lgb.Dataset(X_train, y_train)
lgb_test = lgb.Dataset(X_test, y_test, reference=lgb_train)  # not used by lgb.cv below

base_params = {'boosting_type': 'gbdt',
               'learning_rate': 0.1,
               'num_leaves': 31,
               'max_depth': -1,
               'bagging_fraction': 0.7,  # only takes effect when bagging_freq > 0
               'feature_fraction': 0.7,
               'lambda_l1': 0,
               'lambda_l2': 0,
               'min_data_in_leaf': 20,
               'min_sum_hessian_in_leaf': 0.001,  # was misspelled as 'min_sum_hessian_inleaf'
               'metric': 'auc'}

# Logging and early stopping are passed as callbacks
callbacks = [log_evaluation(period=100), early_stopping(stopping_rounds=30)]
cv_result = lgb.cv(train_set=lgb_train,
                   num_boost_round=1000,
                   nfold=5,
                   stratified=True,
                   shuffle=True,
                   params=base_params,
                   metrics='auc',
                   callbacks=callbacks,
                   seed=0)
end = time.time()
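print('Number of iterations: {}'.format(len(cv_result['valid auc-mean'])))
print('Cross-validated AUC: {}'.format(max(cv_result['valid auc-mean'])))
print('Run time: {} seconds'.format(round(end-start,0)))
Output:
Number of iterations: 444
Cross-validated AUC: 0.7941171177211479
Run time: 3.0 seconds
Note that I changed the lookup to cv_result['valid auc-mean'], because the key 'auc-mean' has been renamed to 'valid auc-mean'.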
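Because the key name depends on the installed lightgbm version, one option is to look up whichever key is present instead of hard-coding it. The following is only a minimal sketch: `get_cv_metric` is a hypothetical helper name (not part of lightgbm), and the two dictionaries are made-up stand-ins for real `lgb.cv` output.

```python
def get_cv_metric(cv_result, metric="auc", stat="mean"):
    """Return the per-iteration values for a metric from an lgb.cv() result dict.

    Tries the newer 'valid <metric>-<stat>' key first, then falls back to
    the older '<metric>-<stat>' key.
    """
    for key in (f"valid {metric}-{stat}", f"{metric}-{stat}"):
        if key in cv_result:
            return cv_result[key]
    raise KeyError(
        f"no '{metric}-{stat}' entry found; available keys: {list(cv_result)}"
    )


# Toy dictionaries standing in for real lgb.cv() output (made-up numbers).
new_style = {"valid auc-mean": [0.70, 0.75, 0.79], "valid auc-stdv": [0.01, 0.01, 0.01]}
old_style = {"auc-mean": [0.70, 0.75, 0.79], "auc-stdv": [0.01, 0.01, 0.01]}

for result in (new_style, old_style):
    scores = get_cv_metric(result)
    print("iterations:", len(scores), "best AUC:", max(scores))
```

With a real result, `len(get_cv_metric(cv_result))` would give the number of boosting rounds and `max(get_cv_metric(cv_result))` the cross-validated AUC, whichever key name the installed version uses.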
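2) Next:
To combine lightgbm with GridSearchCV below, you must use lightgbm's sklearn interface.
- num_leaves
The strategy here is a coarse search followed by a fine search. If you want to save effort you can go straight to the fine search in a single step, but the code will take longer to run.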
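The coarse-then-fine idea can be sketched as two small grid builders. This is an illustration only: `coarse_grid` and `fine_grid` are hypothetical helper names, the best coarse value of 80 is made up, and the lower bound of 2 reflects the assumption that num_leaves must be greater than 1.

```python
def coarse_grid(low, high, step):
    """Candidate values from low (inclusive) to high (exclusive), `step` apart."""
    return list(range(low, high, step))


def fine_grid(best, coarse_step, lower_bound=2):
    """Every integer strictly within one coarse step of the best coarse value."""
    start = max(lower_bound, best - coarse_step + 1)
    stop = best + coarse_step  # exclusive upper end for range()
    return list(range(start, stop))


# Coarse pass: the same candidates as range(50, 150, 5).
coarse = coarse_grid(50, 150, 5)

# Suppose the coarse search picked 80 (a made-up value for illustration).
best_coarse = 80
fine = fine_grid(best_coarse, coarse_step=5)

print("coarse candidates:", coarse)
print("fine candidates:", fine)
```

Each list can then be passed to GridSearchCV as `param_grid={'num_leaves': ...}`: first the coarse list, then the fine list built around the best coarse result.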
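from sklearn.model_selection import GridSearchCV

start = time.time()
# Coarse search over num_leaves: 50, 55, ..., 145
params1 = {'num_leaves': list(range(50, 150, 5))}
model_lgb1 = lgb.LGBMClassifier(
    learning_rate=0.1,
    n_estimators=444,  # number of iterations found by lgb.cv in step 1
    max_depth=-1,
    bagging_fraction=0.7,
    feature_fraction=0.7,
    lambda_l1=0,
    lambda_l2=0,
    min_data_in_leaf=20,
    min_sum_hessian_in_leaf=0.001)  # was misspelled as min_sum_hessian_inleaf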
grid_search1=GridSearchCV(estimator=model_lgb1,cv=5,param_grid=params1,n_jobs=-1,scori