Welcome to the Data Wrangling Quest! In this project, our team is on a mission to clean and refine a messy dataset known as “Shark Attacks” using various data wrangling techniques. By collaborating effectively, we will prepare the dataset for analysis tailored to a use case of our choice. Throughout this journey, we’ll enhance our Python skills, progress toward becoming proficient data analysts, strengthen our teamwork, and sharpen our problem-solving abilities. Are we ready? Let’s dive in!
As a team, we will start by examining the Shark Attack dataset to understand its structure and collaboratively develop one or more hypotheses about the data. For example, we might hypothesize that shark attacks are more frequent in certain geographical areas or among individuals participating in specific activities.
To elevate our project, we will also define a Business Case. Throughout the project, we will use Python and the pandas library to apply at least five data cleaning techniques to issues such as missing values, duplicates, and formatting inconsistencies. After cleaning the dataset, we will conduct exploratory data analysis as a team to test our initial hypotheses and extract meaningful insights together.
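To make the cleaning step concrete, here is a minimal sketch of five common pandas techniques applied to a tiny stand-in table. The column names (`Country `, `Activity`, `Year`), the sample rows, and the `clean` helper are assumptions made for illustration; the real Shark Attacks file may be laid out differently and will likely need other rules.

```python
import pandas as pd

# Tiny stand-in for the real file; column names and values are assumptions.
raw = pd.DataFrame({
    "Country ": [" usa", "USA", "Australia", None, "USA"],
    "Activity": ["Surfing", "surfing ", "Diving", "Swimming", "surfing "],
    "Year": ["2015", "2016", "2016", "n/a", "2016"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper that applies five basic cleaning steps."""
    out = df.copy()

    # 1. Normalize column names (strip stray spaces, lower-case).
    out.columns = out.columns.str.strip().str.lower()

    # 2. Fix formatting inconsistencies in text columns.
    out["country"] = out["country"].str.strip().str.upper()
    out["activity"] = out["activity"].str.strip().str.title()

    # 3. Convert the year to a number; unparseable values become NaN.
    out["year"] = pd.to_numeric(out["year"], errors="coerce")

    # 4. Drop rows that are missing the fields we rely on.
    out = out.dropna(subset=["country", "year"])

    # 5. Remove exact duplicates left after normalization.
    out = out.drop_duplicates().reset_index(drop=True)
    return out


cleaned = clean(raw)
print(cleaned)
```

Normalizing text before removing duplicates matters here: `"surfing "` and `"Surfing"` only count as the same value once whitespace and casing are unified.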
How can insurance companies accurately assess the risk of shark attacks to offer appropriately priced insurance premiums for individuals and ocean activity businesses, such as surf and dive schools?
Using exploratory data analysis, we aim to classify individuals and businesses into low, medium, or high risk categories with a points system, so that insurance premiums can be priced more appropriately.
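One way such a points system might look is sketched below. The point values, the region labels, the thresholds, and the `risk_category` function are all placeholders invented for illustration; in the project they would be derived from what the exploratory analysis actually shows.

```python
# Placeholder point tables; real values would come from the cleaned data.
ACTIVITY_POINTS = {"Surfing": 3, "Diving": 2, "Swimming": 1}
REGION_POINTS = {"high_incidence": 3, "moderate_incidence": 2, "low_incidence": 1}


def risk_category(activity: str, region: str) -> str:
    """Sum the points for an activity and a region, then map to a category."""
    # Unknown activities or regions fall back to the lowest score (assumption).
    points = ACTIVITY_POINTS.get(activity, 1) + REGION_POINTS.get(region, 1)
    if points >= 5:
        return "high"
    if points >= 3:
        return "medium"
    return "low"


print(risk_category("Surfing", "high_incidence"))
print(risk_category("Diving", "low_incidence"))
print(risk_category("Swimming", "low_incidence"))
```

Keeping the point tables separate from the scoring function makes it easy to adjust the weights as our hypotheses are confirmed or rejected.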