The construction of inflation public opinion dictionary and index based on trend and sentiment mapping

XU Peng, SHANG Wei

Systems Engineering - Theory & Practice, 2022, Vol. 42, Issue (12): 3381-3400. DOI: 10.12011/SETP2021-0423

Abstract

The market opinions and sentiments reflected in internet news provide timely and effective references for economic monitoring and early warning. To better identify and quantify the opinions and sentiments that texts express about economic trends, this paper proposes a method for constructing a public opinion dictionary based on trend-sentiment mapping. The method identifies the core words that describe economic trends to form a trend seed word set, integrates the results of sentiment word correlation calculations, and uses a redesigned label propagation algorithm to obtain mapping coefficients, from which the opinion value of each sentiment word is derived. The result is an inflation public opinion dictionary that can quantify news. The paper also proposes an inflation public opinion index model that takes syntactic structure into account. Through topic matching, degree quantification, and negation recognition, the model measures opinions and sentiments in domain-specific economic news more accurately. In the empirical analysis, an inflation public opinion dictionary is constructed and public opinion indexes are generated for 11 themes, such as prices, food, and their sub-items. Classification tests and a comparison with the CPI show that the index built with this method leads the CPI by about 1.25 months in the long-term trend. The proposed dictionary construction method and index model are extensible and are expected to be applicable to other macroeconomic or industry market climate analyses. They are an important improvement on existing forecasting and early-warning modeling methods based on economic texts.
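The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the redesigned label propagation algorithm or the syntactic rules, so the code below uses plain clamped label propagation over a word similarity matrix, and a simple left-to-right rule for degree words and negations. All function names, the similarity matrix, and the word lists are hypothetical and chosen only for illustration.

```python
import numpy as np


def propagate_opinion_values(similarity, seed_values, n_iter=50):
    """Plain label propagation (assumed stand-in for the paper's redesigned variant).

    similarity  : square matrix of non-negative word-to-word similarities;
                  every row is assumed to have a positive sum.
    seed_values : dict mapping the index of a trend seed word to its fixed value.
    Returns one opinion value per word.
    """
    sim = np.asarray(similarity, dtype=float)
    # Row-normalise so each word averages over its neighbours.
    transition = sim / sim.sum(axis=1, keepdims=True)
    values = np.zeros(sim.shape[0])
    for idx, val in seed_values.items():
        values[idx] = val
    for _ in range(n_iter):
        values = transition @ values
        # Clamp the seed words back to their known values after each step.
        for idx, val in seed_values.items():
            values[idx] = val
    return values


def score_sentence(tokens, opinion, degree, negations):
    """Score one tokenised sentence (assumed rule, not the paper's exact model).

    A negation flips the sign and a degree word scales the weight of the
    next opinion word; both reset once that opinion word is consumed.
    """
    total, weight, sign = 0.0, 1.0, 1.0
    for tok in tokens:
        if tok in negations:
            sign = -sign
        elif tok in degree:
            weight *= degree[tok]
        elif tok in opinion:
            total += sign * weight * opinion[tok]
            weight, sign = 1.0, 1.0
    return total


def topic_index(sentences, topic_words, opinion, degree, negations):
    """Average the scores of sentences that mention at least one topic word."""
    scores = [
        score_sentence(s, opinion, degree, negations)
        for s in sentences
        if topic_words & set(s)
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

In this sketch, seed words such as "rise" and "fall" would be clamped to +1 and -1, and the propagated values of the remaining words would play the role of the dictionary's opinion values. Averaging the sentence scores per period and per topic would then give a simple index series that could be compared with the CPI.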

Key words

economic public opinion dictionary / sentiment analysis / public opinion index / internet news / inflation

Cite this article

XU Peng, SHANG Wei. The construction of inflation public opinion dictionary and index based on trend and sentiment mapping. Systems Engineering - Theory & Practice, 2022, 42(12): 3381-3400 https://doi.org/10.12011/SETP2021-0423
CLC number: F224

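Funding

National Natural Science Foundation of China (72271229, 71988101, 71571180); Fund of the Ministry of Education Philosophy and Social Sciences Laboratory (Cultivation) for Digital Economy Monitoring, Forecasting, Early Warning and Policy Simulation, University of Chinese Academy of Sciences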