ML with LoR: binary fraud classification on a credit card dataset using undersampling {NearMiss/KMeans/TomekLinks/ENN} and oversampling {SMOTE/ADASYN} with the LoR algorithm (PR and ROC evaluation)

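ML with LoR: resample the Brussels creditcard dataset (undersampling {NearMiss/KMeans/TomekLinks/ENN}, oversampling {SMOTE/ADASYN}) and apply the LoR (logistic regression) algorithm, evaluated with PR and ROC curves, to classify each transaction as fraudulent or not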


Resampling the Brussels creditcard dataset (undersampling {NearMiss/KMeans/TomekLinks/ENN}, oversampling {SMOTE/ADASYN}) and applying the LoR algorithm (PR and ROC evaluation) for binary fraud classification

Design approach

Output

Implementation code

To be updated……

F:\Program Files\Python\Python36\lib\site-packages\matplotlib\axes\_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.
  warnings.warn("The 'normed' kwarg is deprecated, and has been "
0    284315
1       492
Name: Class, dtype: int64
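The class counts printed above show how skewed the data is. The following is a minimal sketch of that arithmetic, using only the two counts from the log; the variable names are illustrative and do not come from the original code.

```python
# Class counts copied from the log output above.
n_legit = 284315   # Class 0 (not fraud)
n_fraud = 492      # Class 1 (fraud)

total = n_legit + n_fraud
fraud_rate = n_fraud / total

# A model that always predicts "not fraud" reaches this accuracy while
# catching no fraud at all, which is why PR and ROC curves are more
# informative than plain accuracy on this dataset.
baseline_accuracy = n_legit / total

print(f"total samples:     {total}")
print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
```

With fewer than two fraud cases per thousand transactions, resampling the training set is a natural thing to try before fitting the classifier.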
Default method
Undersampling RandomUnderSampler method
F:\Program Files\Python\Python36\lib\site-packages\imblearn\under_sampling\_prototype_selection\_nearmiss.py:178: UserWarning: The number of the samples to be selected is larger than the number of samples available. The balancing ratio cannot be ensure and all samples will be returned.
  "The number of the samples to be selected is larger"
Undersampling NearMissV1 method
F:\Program Files\Python\Python36\lib\site-packages\sklearn\svm\_base.py:977: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  "the number of iterations.", ConvergenceWarning)
Undersampling NearMissV2 method
Undersampling NearMissV3 method
Undersampling ClusterCentroids method
Undersampling TomekLinks method
Undersampling EditedNearestNeighbours method
Majority-class sample count after data cleaning
Original:  227451
After Tomek Link:  227429
After ENN:  227326
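Relative to the original count, the Tomek Link figure is 22 samples lower (227451 - 227429) and the ENN figure is 125 lower (227451 - 227326), so both cleaning methods trim only a thin layer of majority samples. To illustrate why, here is a small pure-Python sketch of the Tomek link idea: two points of opposite classes that are each other's nearest neighbour form a link, and the majority-class member is dropped. The function name and the toy data are hypothetical; this is not the imblearn implementation and not the author's code.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def tomek_link_majority_indices(points, labels, majority_label=0):
    """Return sorted indices of majority-class points that belong to a Tomek link.

    Assumes at least two points and no exact distance ties.
    """
    def nearest(i):
        # Index of the closest point to points[i], excluding i itself.
        candidates = (j for j in range(len(points)) if j != i)
        return min(candidates, key=lambda j: squared_distance(points[i], points[j]))

    nn = [nearest(i) for i in range(len(points))]
    to_remove = set()
    for i, j in enumerate(nn):
        # Tomek link: opposite labels and mutual nearest neighbours.
        if labels[i] != labels[j] and nn[j] == i:
            to_remove.add(i if labels[i] == majority_label else j)
    return sorted(to_remove)


# Toy data: five majority points (label 0) and one minority point (label 1).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.2, 1.0), (3.0, 3.0), (3.05, 3.0)]
labels = [0, 0, 0, 0, 0, 1]

removed = tomek_link_majority_indices(points, labels)
print("majority indices to remove:", removed)
```

Only the majority point sitting right next to the minority point is flagged; majority points far from the class boundary are untouched, which is consistent with the small reductions in the log.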
Oversampling RandomOverSampler method
Oversampling SMOTE method
Oversampling ADASYN method
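The three oversampling methods in the log differ in how they create extra minority samples: random oversampling duplicates existing points, while SMOTE and ADASYN synthesise new ones by interpolating between a minority point and one of its minority-class neighbours (ADASYN generates more of them around minority points that have many majority neighbours). The sketch below shows only the interpolation step in pure Python. The function name, parameters, and toy data are hypothetical, and the real imblearn classes do considerably more than this.

```python
import random


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    Each new point lies on the segment between a randomly chosen minority
    point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority points, closest to base first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class with four 2-D points.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.9)]
new_points = smote_like_samples(minority, n_new=5)
for p in new_points:
    print(p)
```

Because every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region the minority class already occupies rather than being exact copies.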