This project focuses on predicting the closure of small and medium-sized enterprises (SMEs) using Business Trends and Outlook Survey Data. Key aspects include:
- Data Utilization: Leveraged survey data to analyze and predict SME closures.
- Machine Learning Models: Implemented models using R, with packages such as `randomForest`, `catboost`, and `BART`.
- Performance Evaluation: Assessed models with metrics like AUROC, F1 score, and accuracy.
- Key Findings: Highlighted the importance of including non-financial data for accurate closure predictions.
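
The three evaluation metrics named above can be illustrated with a minimal base R sketch. The labels and predicted probabilities here are made-up toy values, not project data, and the 0.5 threshold and the `auroc` helper are assumptions for illustration; the project's own evaluation code may compute these differently.

```r
# Toy labels (1 = closed, 0 = open) and predicted closure probabilities.
# These values are invented for illustration only.
actual <- c(1, 0, 1, 1, 0, 0, 1, 0)
prob   <- c(0.9, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.1)

# Classify as "closed" when the probability reaches the (assumed) threshold.
threshold <- 0.5
pred <- as.integer(prob >= threshold)

# Confusion-matrix counts.
tp <- sum(pred == 1 & actual == 1)
fp <- sum(pred == 1 & actual == 0)
fn <- sum(pred == 0 & actual == 1)
tn <- sum(pred == 0 & actual == 0)

accuracy  <- (tp + tn) / length(actual)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

# AUROC via the rank-sum (Mann-Whitney) identity: the share of
# positive/negative pairs in which the positive case scores higher.
auroc <- function(actual, prob) {
  r <- rank(prob)  # average ranks, so ties count as half
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc <- auroc(actual, prob)
cat("accuracy:", accuracy, "F1:", f1, "AUROC:", auc, "\n")
```

Accuracy and F1 depend on the chosen threshold, while AUROC is computed from the ranking of the probabilities alone, which is one reason to report them together.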
- `About the project in Korean.pdf`: Comprehensive project documentation in Korean, covering the project overview, data details, ML models used, performance results, and key findings. Includes detailed preprocessing information.
- `About the project.pdf`: Summary of the project in English.
- `Summary statistics.pdf`: Contains summary statistics for the variables used in the analysis.
- `Numble reflections.pdf`: Reflections on the project, written in Korean, detailing insights and lessons learned.
- `Numble Project.Rmd`: R Markdown file with complete project code, from data preprocessing to model evaluation.
- `Numble Project.R`: R script with all code for data preprocessing, model training, and evaluation.
Due to a contract with the competition organizer, the dataset used in this project cannot be uploaded. Although the repository does not include the dataset, the provided code documents the project’s methodology and analysis from data preprocessing to model evaluation.
- Nayeon Kwon - Sourcing non-financial data, data preprocessing, support for building ML models, documentation
- Younghoon Yoo - Automated data preprocessing, building ML models, code optimization
Feel free to explore the repository and check out the PDFs for detailed project information.