Overcoming Information Asymmetries in Credit Markets: Machine Learning at LendingClub
Machine learning techniques are starting to break down inefficiencies in lending markets, enabling wider access to credit and putting downward pressure on borrowing rates. This post explores how LendingClub uses machine learning to reach its strategic goals, and where the potential for process improvement still exists.
The age-old issue faced by the lending industry is information asymmetry: will a borrower have the ability and the willingness to repay a loan? Traditional credit scoring methods are imperfect: good potential borrowers are shut out of the credit market because they do not “tick the box”, and approved borrowers continue to default. Today, fintechs such as LendingClub are seeking new ways to reduce the information asymmetries that cause these frictions in the market, and ultimately lead to inefficient capital allocation and higher borrowing costs.
LendingClub is the largest online lending marketplace in the US, having issued $41.6bn in loans since its inception in 2007[1]. The platform is growing quickly, with loan origination up 18% year-over-year in the quarter ending September 2018[2]. LendingClub aims to increase efficiency and affordability in the lending market by using large datasets and machine-learning techniques that go beyond traditional credit scoring to reach new borrowers and improve risk detection. LendingClub’s strategy is to pass these improvements on to consumers in the form of lower borrowing rates.
Built into LendingClub’s model is a dataset based on over 10 years of underwriting and more than 2.5 million customers, with information such as transactional data, behavioral data and employment information. Techniques used by the underlying model include analyzing trended data (how a borrower’s credit behavior changes over time, rather than a single snapshot) and taking a granular view of the individual credit balances across a borrower’s portfolio. With a large number of datapoints per borrower and an ever-increasing borrower group, the company has been able to refine its model over time.[3]
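To make the idea of trended data concrete, here is a minimal sketch in Python. It is not LendingClub’s model: the function names, the feature names and the borrower figures are all hypothetical, and the least-squares slope is just one simple way to summarize how a balance changes over time. It shows how two borrowers who look identical in a snapshot can look very different once their history is taken into account.

```python
from statistics import mean


def balance_trend(monthly_balances):
    """Least-squares slope of balance against month index (dollars per month)."""
    n = len(monthly_balances)
    if n < 2:
        raise ValueError("need at least two monthly observations")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(monthly_balances)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, monthly_balances))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def trended_features(balances_by_account):
    """Per-account (granular) and portfolio-level features from balance histories.

    `balances_by_account` maps a hypothetical account name to a list of
    month-end balances, oldest first.
    """
    per_account = {
        name: {"latest": history[-1], "trend": balance_trend(history)}
        for name, history in balances_by_account.items()
    }
    return {
        "per_account": per_account,
        "total_latest": sum(f["latest"] for f in per_account.values()),
        "total_trend": sum(f["trend"] for f in per_account.values()),
    }


# Hypothetical borrowers: same balance today, opposite trajectories.
paying_down = {"card": [5000, 4500, 4000, 3500, 3000]}
running_up = {"card": [1000, 1500, 2000, 2500, 3000]}

for label, borrower in (("paying down", paying_down), ("running up", running_up)):
    features = trended_features(borrower)
    print(label, features["total_latest"], features["total_trend"])
```

A snapshot-based score would see the same $3,000 balance for both borrowers; the trend feature separates the one reducing debt from the one accumulating it. In a real model, features like these would be inputs alongside many others, not a decision rule on their own.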
Research undertaken by the Federal Reserve Bank of Philadelphia found that LendingClub has been able to undercut interest rate spreads offered by traditional lenders as a result of its methodologies, and that previously underserved borrowers are being given access to credit. Furthermore, the research found that LendingClub’s credit scoring algorithm has shown increasing divergence from traditional credit scoring metrics over time. Despite the low correlation with traditional credit scores, LendingClub ratings proved a good predictor of loan delinquency, indicating that the learning model is working.[4]
Despite these positive signals, certain parts of LendingClub’s process are not yet automated. One key area is honesty: the company’s algorithms cannot assess whether potential borrowers are being truthful in their loan applications. LendingClub therefore verifies information manually, for example reaching out in person to HR departments in order to verify a borrower’s employment status. According to a 2010 Harvard Case Study, 50% of pre-approved borrowers requiring verification did not make it past the manual review phase at LendingClub, because they did not respond to the verification request or provided insufficient information[5].
A potential solution to this remaining manual process lies with Ping An, the Chinese financial services conglomerate. Ping An has created a system that uses machine learning techniques to detect whether potential borrowers are being truthful in their credit applications, by analyzing facial movements made during an interview over a smartphone. Ping An claims the technology has reduced credit losses by 60%[6]. Such a system could potentially remove the need for manual verification of information, providing a significant further reduction in informational asymmetries and corporate expenses for LendingClub. The question arises: would US consumers accept such a method on ethical grounds?
Credit scoring algorithms and machine learning could also have huge potential outside the consumer lending space. 93% of LendingClub’s loans as of September 2018 were personal loans, with a tiny percentage being small business loans[7]. In the medium term, if success in the business lending space can be proven, there is potential for machine-learning techniques to be scaled up for use in the wider corporate space. There are inherent inefficiencies in banks’ largely manual corporate credit underwriting and loan origination processes, which feed into substantial underwriting fees. Are machine learning technologies transferable from the consumer to the corporate lending space? This remains to be proven on a large scale, but LendingClub could benefit hugely if it can seize the opportunity.
[1] Lending Club, “Lending Club Statistics,” https://www.lendingclub.com/info/statistics.action, accessed November 2018.
[2] Lending Club, “Press Release: LendingClub Reports Third Quarter 2018 Results”, https://ir.lendingclub.com/file/Index?KeyFile=395660550, accessed November 2018.
[3] Lending Club, “Blog: The Power of Data and the Next Generation Credit Model” https://blog.lendingclub.com/lendingclubs-next-generation-credit-model, accessed November 2018.
[4] Julapa Jagtiani and Catharine Lemieux, “The Roles of Alternative Data and Machine Learning in Fintech Lending: Evidence from the LendingClub Consumer Platform”, Federal Reserve Bank of Philadelphia Working Paper WP 18-15, April 2018, https://www.philadelphiafed.org/-/media/research-and-data/publications/working-papers/2018/wp18-15.pdf, accessed November 2018.
[5] Tufano P, Jackson H, Ryan A. “Lending Club”. HBS No. 9-210-052. Boston: Harvard Business School Publishing, 2010.
[6] Oliver Ralph, Don Weinland and Martin Arnold, “Chinese banks start scanning borrowers’ facial movements”. Financial Times, 28 October 2018, https://www.ft.com/content/4c3ac2d4-d865-11e8-ab8e-6be0dcf18713, accessed November 2018.
[7] Lending Club, “Press Release: LendingClub Reports Third Quarter 2018 Results”, https://ir.lendingclub.com/file/Index?KeyFile=395660550, accessed November 2018.
Your point about honesty being detected in the application by a machine learning algorithm is very interesting. I wonder if they could use data from previous applications of users who paid back their loans as input to a deep learning algorithm that helps detect application patterns associated with a lower likelihood of default. On the flip side, I wonder whether, if this market becomes big enough, someone could create an algorithm that tries to “game” the system.
Great post Emma. While detecting honesty is not going to be part of a loan application, LendingClub’s powerful ability bridges a key gap in the traditional loan system. Synthesizing large data to detect patterns gives LendingClub a great advantage, but I am not sure if large computing power is needed to warrant a full-scale machine learning tool. One may argue that the algorithm is already being implemented in-house by traditional banks using a mix of human and less sophisticated computer analysis. Nevertheless, I think it is a matter of time before a machine learning scheme of some form becomes mainstream in loan application analysis.
Fascinating post. Like Kombucha said, I think the honesty side of the equation is an incredibly interesting problem to solve. As we have discussed in class – when it comes to machine learning in general, the output is only as good as the data that is input. I wonder if facial recognition techniques are warranted, or if they could simply apply the types of algorithms they are already using, to filter the applications into low and high risk profiles. Then, LendingClub could use their own assets more efficiently by focusing resources on high risk files.
Additionally, I completely agree with your insight into the other lending spaces (beyond consumer loans) that this technology could be applied to.
Great article Emma! Peer-to-peer lending not only opens up access to credit for borrowers, but also provides debt investing opportunities for the common man and woman. Apart from the authentication issues you mention, one concern I have is the hesitation of LC and other fintech firms to be forthcoming about the actual NPL performance of their books, especially for loans on the bigger end. Sustained performance improvements will be critical to widespread adoption.
I found the article very interesting, considering the restrictions and flaws that the credit industry has historically faced. Predicting borrowers’ behavior is per se an extremely challenging task, not only because it involves many external factors outside the lender’s control (such as economic conditions, or the unexpected behavior of regulators, competitors and borrowers), but also many internal ones that depend on the effectiveness of the credit approval processes and the objectivity of the analysts involved, among others. On the one hand, introducing machine learning to the credit assessment process can effectively address, at least, the flaws and variability of the internal factors, by reducing the dependence on the judgment of the analysts and therefore eliminating possible biases. On the other hand, machine learning can also be extremely useful for better predicting the external factors, mainly those related to borrower behavior during financial stress, but the effectiveness of the LendingClub algorithms will not, for me, be fully proven until we can see their accuracy in times of crisis. Additionally, other challenges that the system may face if the company plans to scale and expand relate to the replicability of the model in markets with significant cultural differences, a lack of reliable historical data, or volatile economies.