Our Codebase
Repository link pending
- EDGAR Data Wrangling
- Download, process, and wrangle the EDGAR data from Hugging Face
- Label Creation and Matching
- Before any EDA or modelling, we need to match each 10-K from EDGAR to bankruptcy filings in the following year
- Fine-tuning
- The bulk of our predictions come from fine-tuning a BERT model that was pre-trained on financial documents (FinBERT, Yang 2020)
- Metrics and Evaluation
- We evaluated our results with several metrics, across different thresholds, ratios, and evaluation methods
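
To make the label-matching step concrete, here is a minimal sketch of how a 10-K could be labelled positive when the same company has a bankruptcy filing in the following year. It is an illustration under stated assumptions, not the project's actual code: the function name `label_filings`, the use of the CIK as the join key, the tuple layout, and the 365-day horizon are all hypothetical choices.

```python
from datetime import date, timedelta


def label_filings(filings, bankruptcies, horizon_days=365):
    """Return one 0/1 label per 10-K filing (hypothetical helper).

    filings:      list of (cik, filing_date) tuples, one per 10-K
    bankruptcies: list of (cik, bankruptcy_date) tuples
    horizon_days: assumed length of "the following year" in days
    """
    # Group bankruptcy dates by company identifier for quick lookup.
    by_company = {}
    for cik, bk_date in bankruptcies:
        by_company.setdefault(cik, []).append(bk_date)

    horizon = timedelta(days=horizon_days)
    labels = []
    for cik, filed in filings:
        # Positive if the company filed for bankruptcy strictly after
        # the 10-K and no later than `horizon_days` days afterwards.
        hit = any(
            filed < bk <= filed + horizon
            for bk in by_company.get(cik, [])
        )
        labels.append(int(hit))
    return labels


# Toy data, invented purely for illustration.
filings = [
    ("0000001", date(2018, 3, 1)),
    ("0000001", date(2019, 3, 1)),
    ("0000002", date(2019, 3, 1)),
]
bankruptcies = [("0000001", date(2019, 9, 15))]

labels = label_filings(filings, bankruptcies)
print(labels)
```

With this toy data only the second filing is labelled positive, because the bankruptcy falls more than a year after the first 10-K and the second company never files. Keeping the horizon as a parameter makes it easy to test how sensitive the labels are to the window definition.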