SNSF Narrative Digital Finance

From EU COST Fin-AI

Abstract

Large fluctuations, instabilities, trends and uncertainty in financial markets constitute a substantial challenge for asset management companies, pension funds and regulators. Nowadays, most asset management companies and financial institutions follow a so-called systematic trading approach in their investment decisions. Systematic trading refers to applying predefined, rule-based trading strategies to buy and sell orders. However, automated or rule-based trading brings certain risks for market participants and for the financial market as a whole. In times of increased market volatility, market turmoil or so-called market sell-offs, investors applying similar trading rules might take the same actions, thereby increasing systemic market risk. Such situations have frequently been observed in financial markets, for instance in March 2020 (the sell-off related to the Covid pandemic), during the European sovereign debt crisis and during the global financial crisis of 2007-08.

Research in economics and management has begun to embrace the role that narratives play in guiding individual and collective decision-making. McCloskey (2011) describes unforeseen growth in economic development, yet goes on to explain that no economic theory is able to account for its extent. She argues that a change in rhetoric essentially freed a social class (the bourgeoisie) and gave it a sense of dignity and liberty. Economic change, she argues, therefore depends to a great extent on the social narratives that shape people's ideas and beliefs. Yet, despite the notion that narratives, individual and collective actions, and market outcomes are inextricably linked, our knowledge of the mechanisms or processes through which they interact, and of how narratives can inform opinions or sway current thinking, is still evolving.
Entrepreneurs, for example, may use verbal communication to achieve plausibility (i.e., generate the sense that a given interpretation of events appears acceptable) or resonance (i.e., obtain alignment with the beliefs of the target audience; see van Werven et al., 2019). They may do so through rhetoric such as storytelling (Navis & Glynn, 2011) or crafting compelling arguments (van Werven et al., 2015), as well as by combining figurative language and gesturing (Clarke et al., 2021), as they manage and conform to the expectations of their audience.

Outcomes of invoking narratives are consequential. The literature has documented various forms of verbal communication, including written texts such as social media posts, blogs and business plans as well as spoken text (Garud et al., 2014; Clarke et al., 2019; Clarke et al., 2021), as a crucial means to secure support and investment. The narratives or rhetoric employed in these stories are used as vehicles for assembling and communicating details about ideas and future possibilities (Garud et al., 2014). In summary, narratives help audiences make sense of situations and situate the description within the audience's social and cultural framework (Lounsbury and Glynn, 2001).

We therefore explore computational techniques to predict financial market outcomes using text, speech, and video/picture data. Advances in data processing and machine learning allow new ways of analysing data and may have profound implications for the empirical testing of lightly studied, yet complex, financial relationships. This project integrates various forms of narratives into financial market analysis, leverages machine learning techniques, and aims to show how narratives are interwoven with continuously unfolding financial market developments. We will extend quantitative research through novel measurement techniques, the creation of new data sets, new solutions to prediction problems, and the induction of new theories (Obschonka & Audretsch, 2020). We will also contribute to recent work demonstrating the potential for theoretical and methodological advances through the application of machine learning in research practice (Mullainathan & Spiess, 2017; von Krogh, 2018). In pursuit of both the practical 'relevance' of our research (Wiklund et al., 2019) and a contribution to "AI-integrated" research (Levesque et al., 2020), our approach will provide actionable insights.

Grant Link


Approach

In a first step, we will design a tool that allows us to collect all relevant data from various data sources. Purely financial data, such as stock prices or macroeconomic indicators, can easily be collected from subscription-based platforms such as Bloomberg, Reuters or Investing.com. Textual data, however, pose a substantial challenge in terms of (i) collection from the web, (ii) formatting, and (iii) pre-processing, including dating and categorising. For this purpose, we will develop an automated tool that collects textual data, categorises and dates them, and stores them in an easy-to-analyse format. We will manage our database with SQL solutions. The second step will focus on our research questions and the four building blocks listed below.
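
The storage part of such a tool could look roughly like the following minimal sketch. It is an illustration only, not the project's implementation: SQLite stands in for the unspecified "SQL solutions", and the table layout, the keyword rules in CATEGORY_KEYWORDS and all function names are hypothetical. Web collection itself is left out; the sketch starts from text that has already been downloaded.

```python
import sqlite3
from datetime import date

# Hypothetical schema: one row per collected text, with its date and category.
SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    id        INTEGER PRIMARY KEY,
    source    TEXT NOT NULL,
    published TEXT NOT NULL,  -- ISO 8601 date, so string order equals date order
    category  TEXT NOT NULL,
    body      TEXT NOT NULL
)
"""

# Hypothetical keyword rules; a real tool would likely use a trained classifier.
CATEGORY_KEYWORDS = {
    "monetary_policy": ("central bank", "interest rate", "inflation"),
    "equities": ("stock", "shares", "earnings"),
}


def open_database(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def normalise(text):
    """Formatting step: collapse all runs of whitespace into single spaces."""
    return " ".join(text.split())


def categorise(text):
    """Pre-processing step: return the first category whose keyword occurs."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "other"


def store_document(conn, source, published, raw_text):
    """Format, categorise, date and store one text; return its category."""
    body = normalise(raw_text)
    category = categorise(body)
    conn.execute(
        "INSERT INTO documents (source, published, category, body) "
        "VALUES (?, ?, ?, ?)",
        (source, published.isoformat(), category, body),
    )
    conn.commit()
    return category


def documents_between(conn, start, end):
    """Return (published, category, body) rows within an inclusive date range."""
    cursor = conn.execute(
        "SELECT published, category, body FROM documents "
        "WHERE published BETWEEN ? AND ? ORDER BY published",
        (start.isoformat(), end.isoformat()),
    )
    return cursor.fetchall()
```

Storing dates as ISO 8601 strings keeps range queries simple, which matters when texts are later aligned with price series by date.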

Within our hypothesis-driven project, we will formulate data-driven research questions, both general and block-specific. The main research questions will be:

  1. In what sense are financial markets (ex-ante) predictable?
  2. Is ex-ante forecastability persistent, and to what extent can it be applied to real use cases?
  3. How can structural break detection and changes in financial time series improve and complement modern portfolio theory?
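
To make the notion of structural break detection in the third question concrete, here is a minimal sketch of one simple textbook approach, a CUSUM-type estimate of a single shift in the mean of a series. It is an assumption chosen for illustration, not the method the project commits to; the function name cusum_break_point is hypothetical, and no significance test is included.

```python
from statistics import fmean


def cusum_break_point(series):
    """Estimate the location of a single mean shift in a series.

    Returns (index, statistic), where index is the position of the last
    observation before the estimated break and statistic is the largest
    absolute cumulative sum of deviations from the full-sample mean.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    mean = fmean(series)
    running = 0.0
    best_index, best_stat = 0, 0.0
    # The cumulative sum over the whole sample is zero by construction,
    # so the last observation is skipped.
    for i, value in enumerate(series[:-1]):
        running += value - mean
        if abs(running) > best_stat:
            best_index, best_stat = i, abs(running)
    return best_index, best_stat
```

For a series that sits at one level for its first half and at another for its second half, the cumulative deviations peak at the last observation of the first half, which is the index the function returns. In practice the statistic would be compared against a critical value before a break is declared.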
