The future of financial analysis: How GPT-4 is disrupting the industry, according to new research

Researchers from the University of Chicago have demonstrated that large language models (LLMs) can conduct financial statement analysis with accuracy rivaling and even surpassing that of professional analysts. The findings, published in a working paper titled “Financial Statement Analysis with Large Language Models,” could have major implications for the future of financial analysis and decision-making.

The researchers tested the performance of GPT-4, a state-of-the-art LLM developed by OpenAI, on the task of analyzing corporate financial statements to predict the direction of future earnings. Remarkably, even when provided only with standardized, anonymized balance sheets and income statements devoid of any textual context, GPT-4 was able to outperform human analysts.

“We find that the prediction accuracy of the LLM is on par with the performance of a narrowly trained state-of-the-art ML model,” the authors write. “LLM prediction does not stem from its training memory. Instead, we find that the LLM generates useful narrative insights about a company’s future performance.”

A study by researchers at the University of Chicago found that OpenAI’s GPT-4 model outperformed human analysts in predicting corporate earnings, achieving an accuracy score of 0.604 and an F1 score of 0.609. The researchers used a novel approach of providing structured financial data and “chain-of-thought” prompts to guide the AI’s reasoning. (Source: University of Chicago)
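For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how accuracy and an F1 score can be computed for up-or-down earnings predictions. It assumes binary labels (1 for an earnings increase, 0 for a decrease) and an F1 score measured on the "increase" class; the function name and the sample data are illustrative and are not taken from the paper, which may define its metrics differently.

```python
def accuracy_and_f1(actual, predicted):
    """Return (accuracy, F1) for binary labels, where 1 means earnings increased."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")

    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    correct = sum(1 for a, p in pairs if a == p)

    accuracy = correct / len(pairs)
    # Precision: share of predicted increases that were real increases.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of real increases that the model predicted.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Made-up example: five firms, actual vs. predicted direction of earnings.
actual = [1, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0]
accuracy, f1 = accuracy_and_f1(actual, predicted)
```

Accuracy counts every correct call equally, while F1 balances missed increases against false alarms, which is why studies often report both.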

Chain-of-thought prompts emulate human analyst reasoning

A key innovation was the use of “chain-of-thought” prompts that guided GPT-4 to emulate the analytical process of a financial analyst: identifying trends, computing ratios, and synthesizing the information to form a prediction. With these prompts, GPT-4 achieved roughly 60% accuracy in predicting the direction of future earnings, notably higher than the 53-57% range of human analyst forecasts.
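To make the three steps concrete, the following is a rough sketch of what a chain-of-thought prompt built from anonymized statements might look like. Everything here is an assumption for illustration: the field names, the choice of ratios, the helper functions, and the prompt wording are hypothetical and are not the prompts used by the researchers. The sketch only builds the prompt text and does not call any model.

```python
def compute_ratios(stmt):
    """Compute a few common ratios from one year's statement (hypothetical fields)."""
    return {
        "operating_margin": stmt["operating_income"] / stmt["revenue"],
        "current_ratio": stmt["current_assets"] / stmt["current_liabilities"],
        "asset_turnover": stmt["revenue"] / stmt["total_assets"],
    }


def format_statement(label, stmt):
    """Render a statement as labeled lines, with no company name or dates."""
    lines = [f"{label}:"]
    lines += [f"  {name}: {value}" for name, value in stmt.items()]
    return "\n".join(lines)


def build_cot_prompt(current, prior):
    """Assemble a prompt that walks the model through analyst-style steps."""
    steps = [
        "1. Identify notable changes in the line items between the two years.",
        "2. Compute key financial ratios and interpret them.",
        "3. Synthesize the above and predict whether earnings will increase "
        "or decrease next year.",
    ]
    return "\n\n".join([
        "You are a financial analyst. The statements below are anonymized "
        "and standardized.",
        format_statement("Year t-1", prior),
        format_statement("Year t", current),
        "Work through the following steps in order:",
        "\n".join(steps),
    ])


# Made-up figures for a fictional firm.
prior = {"revenue": 900.0, "operating_income": 90.0, "current_assets": 380.0,
         "current_liabilities": 200.0, "total_assets": 1900.0}
current = {"revenue": 1000.0, "operating_income": 120.0, "current_assets": 400.0,
           "current_liabilities": 200.0, "total_assets": 2000.0}

ratios = compute_ratios(current)
prompt = build_cot_prompt(current, prior)
```

In this sketch the ratios are computed in Python only to show what the second step refers to; in the study as described, the model itself is asked to compute and interpret them as part of its reasoning.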

“Taken together, our results suggest that LLMs may take a central role in decision-making,” the researchers conclude. They note that the LLM’s advantage likely stems from its vast knowledge base and ability to recognize patterns and business concepts, allowing it to perform intuitive reasoning even with incomplete information.

University of Chicago researchers tested GPT-4’s financial analysis capabilities by providing it with anonymized, standardized financial statements and guiding its reasoning with “chain-of-thought” prompts. The model then predicted the direction, magnitude, and confidence of future earnings changes. (Source: University of Chicago)

LLMs poised to transform financial analysis despite challenges

The findings are all the more remarkable given that numerical analysis has traditionally been a challenge for language models. “One of the most challenging domains for a language model is the numerical domain, where the model needs to carry out computations, perform human-like interpretations, and make complex judgments,” said Alex Kim, one of the study’s co-authors. “While LLMs are effective at textual tasks, their understanding of numbers typically comes from the narrative context and they lack deep numerical reasoning or the flexibility of a human mind.”

Some experts caution that the “ANN” model used as a benchmark in the study may not represent the state-of-the-art in quantitative finance. “That ANN benchmark is nowhere near state of the art,” commented one practitioner on the Hacker News forum. “People didn’t stop working on this in 1989 — they realized they can make lots of money doing it and do it privately.”

Nevertheless, the ability of a general-purpose language model to match the performance of specialized ML models and exceed human experts points to the disruptive potential of LLMs in the financial domain. The authors have also created an interactive web application to showcase GPT-4’s capabilities for curious readers, though they caution that its accuracy should be independently verified.

As AI continues its rapid advance, the role of the financial analyst may be the next to be transformed. While human expertise and judgment are unlikely to be fully replaced anytime soon, powerful tools like GPT-4 could greatly augment and streamline the work of analysts, potentially reshaping the field of financial statement analysis in the years to come.
