Gutenberg Text Annotation Analysis: my final project for INFO 159 at UC Berkeley.
We used BERT to predict the age range of English-language Gutenberg texts as one of three classes: primary (ages 9-12), secondary (ages 13-16), or tertiary (ages 17+). Our final model achieved 60% accuracy, with a 95% confidence interval of [0.532, 0.668].
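For readers who want to sanity-check the reported interval, here is a minimal sketch of a normal-approximation (Wald) confidence interval for accuracy. The test-set size is not stated in this README, so `n=200` is an assumption chosen because it reproduces the reported bounds. The function name `accuracy_confidence_interval` is hypothetical, and the notebook may have computed the interval differently (for example, by bootstrapping).

```python
import math


def accuracy_confidence_interval(accuracy, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    accuracy: observed accuracy on the test set (0 to 1)
    n: number of test examples (assumed here, not taken from the project)
    z: critical value; 1.96 corresponds to a 95% interval
    """
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n)
    return accuracy - half_width, accuracy + half_width


# Assumption: 200 test examples. This value is illustrative only.
lower, upper = accuracy_confidence_interval(0.60, 200)
print(f"[{lower:.3f}, {upper:.3f}]")  # prints "[0.532, 0.668]"
```

With three classes, a uniform random guess would score about 33%, so the lower bound of 0.532 sits well above that level.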
See our AP4 Analysis for the full write-up. A PDF of the Jupyter Notebook code, with results, can be found here.