Bancarotta

Enhancing Bankruptcy Prediction with NLP

Basis for NLP Models

Bankruptcy Prediction in 2023

SEC EDGAR: Electronic Data Gathering, Analysis, and Retrieval system

Item 3: Legal Proceedings
Item 7: Management's Discussion and Analysis
Item 7A: Quantitative and Qualitative Disclosures About Market Risk
Form 10-K (SEC EDGAR)
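
As a rough illustration of how the three 10-K sections listed above might be pulled out of a filing's plain text, here is a minimal sketch. It assumes each section starts on its own line with a heading such as "Item 7." or "ITEM 7A:"; the pattern `ITEM_HEADING` and the helper `split_items` are hypothetical names introduced for this example, not part of EDGAR or of any library, and real filings will need more robust handling.

```python
import re

# Assumed heading shape: "Item 7." / "ITEM 7A:" at the start of a line.
ITEM_HEADING = re.compile(
    r"^\s*item\s+(\d{1,2}[ab]?)\b[.:\-\s]*",
    re.IGNORECASE | re.MULTILINE,
)


def split_items(text):
    """Map item labels ('3', '7', '7A', ...) to the text under each heading."""
    matches = list(ITEM_HEADING.finditer(text))
    sections = {}
    for i, match in enumerate(matches):
        label = match.group(1).upper()
        # A section runs until the next item heading (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        # Keep the longest candidate per label: table-of-contents entries
        # repeat the headings but carry almost no text.
        if len(body) > len(sections.get(label, "")):
            sections[label] = body
    return sections


sample = """Item 3. Legal Proceedings
None.
Item 7. Management's Discussion and Analysis
Revenue declined and liquidity tightened.
Item 7A. Quantitative and Qualitative Disclosures About Market Risk
Interest rate exposure is limited.
"""

sections = split_items(sample)
for label in ("3", "7", "7A"):
    print(label, "->", sections[label][:40])
```

Keeping the longest match per label is a simple heuristic for skipping table-of-contents entries; it is a design choice of this sketch rather than an established method.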

Dataset

EDGAR Corpus in 2023

The EDGAR Corpus dataset consists of 241,000 filings.

Item 7, total: ~1.8 billion tokens
Item 7, average length: ~9 thousand tokens
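
Totals and averages like these can be computed with a few lines of code. The sketch below assumes whitespace tokenization, which is only a stand-in: the tokenizer behind the figures above is not stated, and a subword tokenizer would give different counts. The helper `token_stats` and the two sample texts are hypothetical.

```python
def token_stats(texts):
    """Return (total tokens, average tokens per text) for a list of strings."""
    # Assumption: a token is a whitespace-separated word.
    counts = [len(text.split()) for text in texts]
    total = sum(counts)
    average = total / len(counts) if counts else 0.0
    return total, average


# Two toy Item 7 excerpts standing in for the real corpus.
item7_texts = [
    "Revenue declined and liquidity tightened.",
    "Net sales increased.",
]

total, average = token_stats(item7_texts)
print(f"total tokens: {total}, average per filing: {average:.1f}")
```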