Difference between revisions of "SNF P2P"

From EU COST Fin-AI
Revision as of 16:30, 30 October 2023

Abstract

P2P (peer-to-peer) lending today refers to the lending of money to individuals and businesses through online services without bank intermediation (Thakor, 2020). P2P platforms offer a secure cyberspace (Niu et al., 2020) where borrowers are linked to investors who engage (usually) in a buyout auction, where the bidding process ends when the loan has been fully funded (Xia et al., 2017). Bank lending is backed by deposits, uninsured debt and equity; thus, banks have skin in the game, unlike P2P lending platforms, where loans are funded by investors directly, i.e., through investors’ equity. Higher interest rates and diversification potential incentivize lenders, represented by individuals and recently also by banks, hedge funds, venture capital firms and private equity firms (Giudici et al., 2019a), to participate in P2P lending. Traditional banks receive loan repayments that are used to pay out depositors, subordinated debt holders and potentially shareholders, while P2P platforms receive fees from loan origination (paid by the borrower) and transaction fees. Administration of lending tends to be cheaper for P2P platforms, which provide an online marketplace and initial risk classification, while banks are subject to much tighter regulation and thus have higher costs (Thakor, 2020). However, banks have much richer data at their disposal (e.g., through long-term relational banking), which makes their task of identifying potential nonperforming loans easier. One would therefore expect P2P platforms to attract borrowers who would otherwise not be eligible for bank loans. This effect is amplified during recessions, as reduced access to bank credit directs riskier borrowers towards the P2P markets. This phenomenon has been observed empirically, as several studies have found that after the 2008 recession, the growth of P2P markets accelerated (e.g., Jin and Zhu, 2015). Similar growth is likely to unfold during and after the current worldwide economic crisis induced by the COVID-19 pandemic.

By their nature, P2P markets are an immature industry with loose regulation, greater information asymmetry and increased credit risk, all of which lead to higher default rates. This leaves the door open to considerable risks. To mitigate adverse selection and moral hazard problems, one needs to build trust. In traditional bank-lending markets, trust is constructed via relational banking, using collateral, certified accounts, risk monitoring, the presence of a board of directors, tighter regulation, etc. (Emekter, 2015). Voluntary implementation of these mechanisms would incur significant costs and thus erode the competitive edge of P2P lending markets. Several recent studies have found that the failure of P2P platforms in China is related to general market conditions (bond yields), ownership, information disclosure, and popularity, while political ties were also found to play an important role (e.g., Gao et al., 2021; He and Li, 2021). A hands-on approach to establishing trust between investors and P2P markets is to use accurate credit risk models. The main objective of the proposed research project is to design state-of-the-art, interpretable credit risk models for P2P lending markets.

Grant link


Contribution

The main objective of the research project is to advance our understanding of credit risk modeling in P2P lending markets by designing and empirically verifying new network-based credit risk models. Our task is to design methods suited to P2P lending market data, which are typically correlated and noisy.

The research project will lead to the following contributions:

Methodological – With respect to the existing research (Ahelegbey et al., 2019a,b; Giudici et al., 2019; 2020), we will contribute in several respects. First, previous methods have ignored loan status, leading to unsupervised network-based learning. However, supervised learning algorithms generally outperform unsupervised learning algorithms (Liu et al., 2020). Hence, our approach to creating the networks will utilize class information, thus leading to supervised networks. Second, in contrast to previous studies, our work acknowledges that network creation and feature extraction depend on a set of hyperparameters; which hyperparameters exist depends in turn on the specific way that networks and features are created. Thus, we will apply cross-validation to tune the network’s hyperparameters and the resulting features. Third, previous studies have created only one network. We will also design methods to create multiple networks that will differ in two respects:
  1. They will utilize different (random) sets of variables, and
  2. They will rely on bootstrap aggregation (bagging). We will therefore directly address data noisiness, which is ignored in the existing literature.
Empirical – Almost all studies have used at most two P2P market datasets, with three datasets used only occasionally (Ha et al., 2019; Niu et al., 2020). The fourth contribution is to enrich the empirical literature, as we will be able to observe credit drivers and the usefulness of methods across different market platforms. We will use not only benchmark datasets (Lending Club and Prosper) but also new datasets from European (Zopa, Mintos, Bondora) and global (Mintos, Home Credit, Kiva) P2P markets. With this, we will be able to validate models across very different datasets capturing the unique properties of different P2P markets around the world.
Practical – Except for a few studies (Bussmann et al., 2019; Srinivasan et al., 2019; Ariza-Garzon et al., 2020; Hadji Misheva et al., 2021; Moscato et al., 2021), the applicability and robustness of the explanations provided by existing XAI methods for credit risk models have rarely been studied. However, the future trend put forward by policymakers and regulators (through existing and planned legislation) concerning automated solutions and P2P lending markets emphasizes the necessity of interpretable credit risk models. Thus, the fifth contribution of the proposed research project is an investigation into the applicability of existing XAI methods to credit scoring models, which will ultimately enable the development of interpretable credit risk models that can estimate not only the expected effect of each variable on the outcome but also the expected effect of each variable for a specific individual loan.
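
The multiple-network idea from the methodological contribution (random sets of variables combined with bootstrap aggregation) can be illustrated with a minimal Python sketch. It covers only the sampling and averaging steps; how a network is built from each draw is not specified here and is left out. The function names `draw_network_inputs` and `average_scores`, their parameters, and the variable names in the usage lines are hypothetical and chosen purely for illustration.

```python
import random


def draw_network_inputs(n_loans, variables, n_networks, n_vars, seed=0):
    """Return one (loan indices, variable subset) pair per network.

    Hypothetical helper: each pair would feed one network-building step,
    which is not shown because the project text does not specify it.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_networks):
        # Bagging: resample loan indices with replacement.
        rows = [rng.randrange(n_loans) for _ in range(n_loans)]
        # Random variable set: sample variable names without replacement.
        cols = rng.sample(variables, n_vars)
        draws.append((rows, cols))
    return draws


def average_scores(scores_per_network):
    """Aggregate per-loan scores by averaging over all networks."""
    n_networks = len(scores_per_network)
    return [sum(loan_scores) / n_networks
            for loan_scores in zip(*scores_per_network)]


# Usage with made-up variable names: three networks, two variables each.
draws = draw_network_inputs(
    n_loans=10,
    variables=["amount", "term", "income", "rate"],
    n_networks=3,
    n_vars=2,
)
for rows, cols in draws:
    print(cols, rows)
```

Averaging is only one possible aggregation rule; it is used here because it is the simplest way to combine scores from models fitted on different bootstrap samples.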