This project started as an assessment for the Udacity Data Analysis Professional Track.
The dataset is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent."
WeRateDogs has over 4 million followers and has received international media coverage.
Working with real-world data is not an easy process. Data needed to be gathered from different sources, both manually and programmatically. After the data was cleaned, it was stored locally in separate files so that analysis and visualization could be applied to them.
- wrangle_act.ipynb: The main process of wrangling, analyzing and visualizing the data.
- wrangle_report.html: The final summary of the data wrangling efforts, exported from wrangle_report.ipynb.
- act_report.html: The final data analysis and visualization, exported from wrangle_act.ipynb.
- twitter_archive_master.csv: Cleaned data from the Twitter archive and Twitter API.
- image_predictions_clean.csv: The cleaned data of the image predictions from the neural network.
- reportexport.sh: A bash script to run if re-exporting the HTML files is needed.
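To take a quick look at the cleaned files without opening the notebook, something like the sketch below may help. It assumes the two CSV files sit in the current directory and have a header row; the function names `load_table` and `summarize` are made up for this example and are not part of the project. It uses only the standard library, so it makes no assumption about what is in requirements.txt.

```python
import csv
from pathlib import Path

# File names taken from the list above; their location is an assumption.
CLEAN_FILES = ("twitter_archive_master.csv", "image_predictions_clean.csv")


def load_table(csv_path):
    """Return (column_names, rows) for a CSV file with a header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for a completely empty file
        return list(reader.fieldnames or []), rows


def summarize(data_dir="."):
    """Collect row counts and column names for each cleaned file that exists."""
    summary = {}
    for name in CLEAN_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            continue  # skip files that have not been generated yet
        columns, rows = load_table(path)
        summary[name] = {"rows": len(rows), "columns": columns}
    return summary


if __name__ == "__main__":
    for name, info in summarize().items():
        print(f"{name}: {info['rows']} rows, {len(info['columns'])} columns")
```

Running the script from the project folder prints one line per file found. For real analysis the notebook remains the place to look; this is only a sanity check that the exported files are readable.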
- Python (3.8 or above recommended)
If you don't have the requirements, you can install them using pip:
pip install -r requirements.txt
If you are using conda, use the command below instead:
conda env create -f environment.yaml