Study finds large language models produce racial disparities in mortgage lending, but the disparities can be eliminated

Donald E. Bowen III of Lehigh University, S. McKay Price of Lehigh University – Perella Department of Finance, Luke C.D. Stein of Babson College, and Ke Yang of Lehigh University have written Measuring and Mitigating Racial Disparities in Large Language Model Mortgage Underwriting. Here’s the abstract:

We conduct the first study exploring the application of large language models (LLMs) to mortgage underwriting, using an audit study design that combines real loan application data with experimentally manipulated race and credit scores. First, we find that LLMs systematically recommend more denials and higher interest rates for Black applicants than otherwise-identical white applicants. These racial disparities are largest for lower-credit-score applicants and riskier loans, and exist across multiple generations of LLMs developed by three leading firms. Second, we identify a straightforward and effective mitigation strategy: Simply instructing the LLM to make unbiased decisions. Doing so eliminates the racial approval gap and significantly reduces interest rate disparities. Finally, we show LLM recommendations correlate strongly with real-world lender decisions, even without fine-tuning, specialized training, macroeconomic context, or extensive application data. Our findings have important implications for financial firms exploring LLM applications and regulators overseeing AI’s rapidly expanding role in finance.
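The audit-study design the abstract describes, matched loan applications that differ only in the experimentally manipulated race field, with an optional debiasing instruction, can be sketched as follows. This is a hypothetical illustration: the field names, prompt wording, and `DEBIAS_INSTRUCTION` text are assumptions, not the authors' actual prompts.

```python
# Hypothetical sketch of an audit-study prompt design: identical loan
# applications differing only in applicant race, optionally prefixed with a
# debiasing instruction of the kind the paper finds effective. All wording
# here is illustrative, not taken from the study.

DEBIAS_INSTRUCTION = (
    "You are an impartial underwriter. Base your decision only on "
    "financial risk factors; do not let the applicant's race influence it."
)

def build_prompt(application: dict, race: str, debias: bool = False) -> str:
    """Render one experimental condition as an underwriting prompt."""
    lines = []
    if debias:
        lines.append(DEBIAS_INSTRUCTION)
    lines.append("Loan application:")
    lines.append(f"  Applicant race: {race}")
    for field, value in application.items():
        lines.append(f"  {field}: {value}")
    lines.append("Should this mortgage be approved, and at what interest rate?")
    return "\n".join(lines)

def audit_pair(application: dict, debias: bool = False) -> dict:
    """Return the matched Black/white prompt pair for one application."""
    return {race: build_prompt(application, race, debias)
            for race in ("Black", "white")}

app = {"credit score": 640, "loan amount": 250_000, "income": 85_000}
pair = audit_pair(app)
# The two prompts are identical except for the race line, so any difference
# in the model's recommendations is attributable to race.
```

Sending each prompt in the pair to the same model and comparing the recommendations isolates the effect of race; rerunning with `debias=True` tests the mitigation the authors report.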
