How to Analyze Company Risk Factors from SEC Reports with AI
Using a custom NLP model and ChatGPT
--
In today’s complex business landscape, organizations face a multitude of risks that can impact their operations and bottom line. Identifying and analyzing these risks, known as risk factor analysis, is crucial for investors’ decision-making and risk management. Traditionally, this process has relied heavily on manual effort, which is time-consuming and prone to error. With the advent of AI technologies such as deep learning models, organizations can now use these tools to enhance their risk factor analysis capabilities.
The “risk factors” section of the 10-K report sheds light on critical areas that often escape the attention of many investors. While the majority of the content consists of standard risk disclosures, a careful examination of factors such as new regulations and laws, market risk, and macroeconomic conditions can reveal hidden complexities.
In this tutorial, we walk through the key steps involved in training a custom AI model that identifies risk factors in SEC 10-K reports and integrating it into a workflow that analyzes the results using ChatGPT. We also highlight the importance of human-in-the-loop review for refining the model’s predictions and ensuring the accuracy of the extracted risk factors.
Let’s get started!
Extracting Item 1A from 10-K Reports
For this tutorial, we want to extract relevant entities from the Risk Factors section (Item 1A) of the 10-K report. To do so, we use the Extractor API offered by sec-api.io, which returns the raw text of the section we are interested in. A free API key is available here: https://sec-api.io/signup/free
Make sure to install the sec-api package:
!pip install sec-api
Then run the following script to download the Item 1A section:
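from sec_api import ExtractorApi

# Replace the placeholder with your own sec-api.io API key
extractorApi = ExtractorApi("YOUR API KEY")
# URL of Tesla's 10-K filing for fiscal year 2020 on SEC EDGAR
filing_url = "https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/tsla-10k_20201231.htm"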
section_text = extractorApi.get_section(filing_url…
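Once the raw Item 1A text is available as a string such as section_text, it can help to break it into clean paragraphs before passing it to an entity extraction model. The following is only a minimal sketch, assuming the raw text separates paragraphs with newlines. The helper name split_risk_paragraphs, the min_chars threshold, and the sample text are hypothetical and not part of sec-api or the original workflow.

```python
import re
from typing import List


def split_risk_paragraphs(raw_text: str, min_chars: int = 40) -> List[str]:
    """Split raw section text into cleaned paragraphs (hypothetical helper).

    min_chars is an assumed cutoff used to drop headings and stray fragments.
    """
    paragraphs = []
    for line in raw_text.splitlines():
        # Collapse runs of whitespace (including non-breaking spaces) into one space.
        cleaned = re.sub(r"\s+", " ", line).strip()
        # Keep only lines long enough to plausibly be a full paragraph.
        if len(cleaned) >= min_chars:
            paragraphs.append(cleaned)
    return paragraphs


# Made-up sample text standing in for the real section_text.
sample_text = (
    "ITEM 1A. RISK FACTORS\n"
    "\n"
    "Our suppliers   may fail to deliver components on schedule.\n"
    "   \n"
    "Changes in regulation could increase our compliance costs significantly.\n"
)

paragraphs = split_risk_paragraphs(sample_text)
print(len(paragraphs), "paragraphs kept")
```

In practice, the cutoff would need tuning against real filings, since some legitimate risk statements are short and some headings are long.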