LogitBoost algorithm considering the cost of misclassification and its application in the classification of mobile user value

WANG Chaofa, SUN Jingchun

Systems Engineering - Theory & Practice ›› 2019, Vol. 39 ›› Issue (10) : 2702-2712.



Abstract

The traditional LogitBoost algorithm treats correct and incorrect classifications equally, and its loss function does not converge to the cost-sensitive Bayesian decision rule. Building on the traditional algorithm, we propose a LogitBoost algorithm that incorporates a penalty for misclassification costs. We verify the algorithm on mobile phone user data from a mobile communication company. The results show that: (1) compared with other similar algorithms, the LogitBoost algorithm that considers misclassification costs clearly improves classification performance; (2) at a fixed threshold, the expected risk increases as the misclassification cost ratio increases; (3) at a fixed misclassification cost ratio, the expected risk tends to first increase and then decrease as the threshold increases. These findings show that introducing misclassification costs can effectively reduce the expected risk of the model, because the new algorithm shifts the focus from minimizing the misclassification rate to minimizing the expected risk. They also provide an analysis framework and decision reference for communications companies.
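The two quantities the abstract turns on, the cost-sensitive Bayesian decision and the expected risk at a given threshold and cost ratio, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the toy probabilities and labels, and the convention that the cost ratio is the false-negative cost divided by the false-positive cost are all assumptions made for illustration.

```python
def bayes_threshold(cost_fn, cost_fp):
    """Cost-sensitive Bayes rule: predict positive when P(y=1|x) exceeds this value."""
    return cost_fp / (cost_fp + cost_fn)


def expected_risk(probs, labels, threshold, cost_fn, cost_fp):
    """Average misclassification cost when `probs` are cut at `threshold`."""
    total = 0.0
    for p, y in zip(probs, labels):
        pred = 1 if p > threshold else 0
        if pred == 0 and y == 1:
            total += cost_fn  # missed positive (false negative)
        elif pred == 1 and y == 0:
            total += cost_fp  # false alarm (false positive)
    return total / len(labels)


# Hypothetical predicted probabilities and true labels (illustrative only).
probs = [0.9, 0.6, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0]

for ratio in (1.0, 3.0):  # assumed cost ratio = cost_fn / cost_fp
    t = bayes_threshold(cost_fn=ratio, cost_fp=1.0)
    fixed = expected_risk(probs, labels, 0.5, ratio, 1.0)
    tuned = expected_risk(probs, labels, t, ratio, 1.0)
    print(f"ratio={ratio} bayes_threshold={t} risk@0.5={fixed} risk@bayes={tuned}")
```

On this toy data, the risk at the fixed 0.5 threshold grows as the cost ratio grows, while moving the cut to the cost-sensitive Bayes threshold lowers it. That is the kind of behaviour the abstract describes, although the paper obtains it by changing the boosting loss rather than by post-hoc thresholding.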

Key words

algorithm / misclassification cost / user value / management advice

Cite this article

WANG Chaofa, SUN Jingchun. LogitBoost algorithm considering the cost of misclassification and its application in the classification of mobile user value. Systems Engineering - Theory & Practice, 2019, 39(10): 2702-2712. https://doi.org/10.12011/1000-6788-2018-0194-11


Funding

National Natural Science Foundation of China (71372164)