SamTaylor92/README.md

Sam Taylor

StackOverflow Github LinkedIn Signal Email

About

Hi, I'm Sam!

I am a Senior Operations Analyst, based in Berlin, with a background in the travel industry and education. I hold a Bachelor's degree (B.A.) in Modern Languages (Spanish & English) and a Postgraduate Certificate in Education (PGCE), specialising in Secondary Education (Spanish & French). During my 8 years in the travel industry with GetYourGuide, I worked as a Customer Service Agent, a Team Lead and a Quality Assurance Manager before moving into an Operations Analyst role in November 2021.

This repository showcases my skills, shares my projects and tracks my journey in Data Analytics.

Languages:

SQL Python

Tools:

Jupyter Notebook Looker Databricks Excel Google Sheets Visual Studio Code Atom Pandas matplotlib seaborn scipy statsmodels

Table of contents

Portfolio Projects

Below are projects I've worked on.

📊 [March 2025] Project: Marketing A/B Test

Date: Q1 2025 Repository: [Link] Notebook: [Link] PDF Slides:[Link] Google Slides:[Link]

Jupyter Notebook Python statsmodels Pandas Matplotlib Seaborn

Data engineer project end-to-end process in GIF form

Description:

This project analyzes the effectiveness of a marketing campaign using A/B testing.

The goal is to determine whether ads significantly impact customer conversion rates. The analysis includes statistical methods such as attribution percentage and odds ratio calculations.
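The statistics named above can be sketched in a few lines of Python. This is a minimal illustration and not the notebook's actual code: the function name `ab_test_summary` and the counts in the usage note are hypothetical, and "attribution percentage" is assumed here to mean the attributable percentage among the exposed group. A pooled two-proportion z-test is added as one common way to check significance.

```python
import math


def ab_test_summary(conv_ad, n_ad, conv_ctrl, n_ctrl):
    """Summarise a two-group conversion experiment from raw counts."""
    rate_ad = conv_ad / n_ad
    rate_ctrl = conv_ctrl / n_ctrl

    # Odds ratio: odds of converting with ads divided by odds without ads.
    odds_ad = conv_ad / (n_ad - conv_ad)
    odds_ctrl = conv_ctrl / (n_ctrl - conv_ctrl)
    odds_ratio = odds_ad / odds_ctrl

    # Attributable percentage among the exposed group (assumed definition).
    attributable_pct = (rate_ad - rate_ctrl) / rate_ad * 100

    # Two-proportion z-test using a pooled standard error.
    pooled = (conv_ad + conv_ctrl) / (n_ad + n_ctrl)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_ad + 1 / n_ctrl))
    z = (rate_ad - rate_ctrl) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

    return {
        "odds_ratio": odds_ratio,
        "attributable_pct": attributable_pct,
        "z": z,
        "p_value": p_value,
    }
```

For example, `ab_test_summary(30, 100, 20, 100)` compares a 30% conversion rate against a 20% one; the returned dictionary holds the odds ratio, the attributable percentage, the z statistic and the two-sided p-value.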

The data is sourced from Kaggle

Results: A marketing analysis of the provided dataset summarised in PDF form

Skills: Data analysis | ETL | Data Pipeline | Data cleaning | Descriptive statistics | Statistical analysis | Data visualization | A/B Testing

⬆

⚙️ [August 2024] Data Engineer Project: Dead by Daylight ETL Pipeline

Date: Q3 2024 Repository: [Link] Notebook: [Link] PDF Slides:[Link] Project Documentation:[Link]

Jupyter Notebook Databricks Python Spark statsmodels Scikit-learn Pandas Matplotlib Seaborn

Data engineer project end-to-end process in GIF form

Description:

This project focuses on building an ETL pipeline for Dead by Daylight game data.

The pipeline processes data from various game entities such as characters, perks, maps, addons and match details, and sorts this data into the relevant tables in a database.

Then, using Python, we designed a pipeline to transform, clean, and load data into structured formats that support game balancing analysis, player performance tracking, and game element ratings.

The data is sourced from Dennis Reep's Dead by Daylight website via scraping (with permission), and integrated into a database designed to store, manage, and analyze key elements for decision-making around the game.
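The transform, clean and load steps described above might look roughly like the sketch below. It is an illustration only: the records, the `perks` table and its column names are hypothetical placeholders, and an in-memory SQLite database stands in for the project's real storage layer, which this README does not detail.

```python
import sqlite3

# Hypothetical scraped records; the real project's fields are not shown here.
raw_perks = [
    {"perk_name": " Perk A ", "character_name": "Character 1"},
    {"perk_name": "Perk A", "character_name": "Character 1"},  # duplicate once trimmed
    {"perk_name": "Perk B", "character_name": "Character 2"},
]


def transform(records):
    """Trim whitespace and drop duplicate (perk, character) pairs."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec["perk_name"].strip(), rec["character_name"].strip())
        if key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned


def load(conn, rows):
    """Create the target table if needed and insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS perks ("
        "perk_name TEXT, character_name TEXT, "
        "UNIQUE (perk_name, character_name))"
    )
    conn.executemany("INSERT OR IGNORE INTO perks VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real database
load(conn, transform(raw_perks))
print(conn.execute("SELECT COUNT(*) FROM perks").fetchone()[0])
```

Keeping `transform` free of database calls makes the cleaning rules easy to test on their own, while the `UNIQUE` constraint guards against duplicates if the load step is re-run.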

Results: A comprehensive analysis of the Dead by Daylight game summarised in PDF form

Skills: Data engineering | ETL | Data Pipeline | Data Scraping | Data cleaning | Data analysis | Descriptive statistics | Statistical analysis | Data visualization

⬆

✍️ [October 2023] 9-Part Data Analysis Tutorial for Beginner Analysts

Date: Q3 2023 Repository: [Link] Notebook: [Link] PDF Slides:[Link] Blog Posts:[Link]

Jupyter Notebook Python statsmodels Scikit-learn Pandas Matplotlib Seaborn


Description:

A comprehensive 9-part guide on analysing a dataset using Python (Pandas) and VS Code. Perfect for beginner analysts looking to enhance their data analysis portfolio. Written on Medium.com as a series of blog posts.

🔍 The series covers:

  • Defining Objectives
  • Data Acquisition
  • Data Exploration
  • Data Cleaning
  • Data Visualization
  • Feature Engineering
  • Statistical Analysis
  • Machine Learning
  • Presenting Solutions

Each step is broken down with practical examples and code snippets, making it easy for beginners to follow along and learn.

Results: A 9-part blog series to help aspiring data analysts prepare a portfolio piece.

Skills: Data cleaning | Data analysis | Descriptive statistics | Statistical analysis | Machine learning | Data visualization

⬆

💼 [May 2021] Company Sales and Operations Analysis

Date: Q2 2021 Repository: [Link] Notebook: [Link] PDF Slides:[Link]

Python Pandas Jupyter Notebook Google Sheets

presentation_gif

Description:

The dataset contains ~100k records of a company's sales, customer, operational and product data. The project involved data loading, data cleaning, preprocessing, filling missing values, exploratory data analysis, measuring statistical factors and hypothesis testing.

Results: Data-based business recommendations for the company.

Skills: Data cleaning | Data analysis | Descriptive statistics | Data visualization

⬆

Learning Projects

Below are projects worked on for online courses.

🐼 [May 2022] The Complete Pandas Bootcamp 2022: Data Science with Python

Date: Q2 2022 Duration: 35 hours Repository:[Link] Notebooks: [Data aggregation] | [Data analysis]

Python Jupyter Notebook Pandas matplotlib seaborn scipy

olympics_heatmap

Description:

A course aimed at learning to use Pandas (Python library) for data aggregation and data analysis.
There were two capstone projects, one for each core skill (data aggregation & exploratory data analysis).

Skills: Data cleaning | Data analysis | Descriptive statistics | Data visualization | Statistics | Machine learning | Time series

⬆

Side Projects

Below is a collection of non-data-analytics projects that I have been working on.

🚀 [May 2022] Google Sheets: split a column containing multiple email addresses into 1 row per email address

Date: Q2 2022 Repository: [Link]

Excel Google Sheets Google Forms

email splitter

Description:

A Google Sheets formula to speed up a recurring process. The formula lets users enter more than one email address in the receiver field of a Google Form, then splits out the data and duplicates the row in the back end (one copy per email address entered in the receiver field). This repository documents this project.
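The row-splitting logic can be illustrated in Python. This is not the Google Sheets formula from the repository, only a sketch of the same idea: the function name `split_rows_by_email`, the `receiver` field name and the assumed separators (commas, semicolons or whitespace) are all hypothetical.

```python
import re


def split_rows_by_email(rows, field="receiver"):
    """Return one copy of each row per email address found in `field`."""
    result = []
    for row in rows:
        # Assumption: addresses are separated by commas, semicolons or whitespace.
        emails = [e for e in re.split(r"[,;\s]+", row[field].strip()) if e]
        for email in emails:
            # Copy the row, replacing the multi-address field with one address.
            result.append({**row, field: email})
    return result


responses = [
    {"sender": "a", "receiver": "x@example.com, y@example.com"},
    {"sender": "b", "receiver": "z@example.com"},
]
for row in split_rows_by_email(responses):
    print(row)
```

Here the first response yields two rows and the second yields one, so three rows are printed in total.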

Skills: Spreadsheet formulas | Google suite

⬆

🚀 [May 2022] Google Sheets: conditional second dropdown menu

Date: Q2 2022 Repository: [Link]

Google Sheets Google Apps Script

conditional dropdown

Description:

A Google Apps Script to speed up a repetitive, manual process and reduce input errors. The script creates two dropdown menus in a Google Sheet, where the options in the second dropdown menu depend on the value selected in the first. This repository documents this project.

Skills: Google Apps Script | Google suite

⬆

Courses

Although not a replacement for on-the-job experience or project work, here are some of the courses I have completed over the years.

📊 [Aug 2024] Practical Database Design

Organisation: Udemy Duration: 1 month Credential: [Link] Repository: [Link]

Description:
  • Build a database design from a given set of requirements
  • Determine a set of preliminary entities and attributes to start a database design
  • Normalise a database design into 1NF taking into consideration multivalued and multipart fields
  • Establish table candidate and primary keys
  • Normalise a database design into 2NF taking into consideration partial key dependencies
  • Identify multiple types of table relationships and define relationships between tables
  • Normalise a database design into 3NF taking into consideration transitive dependencies
  • Develop database design solutions to common features of a blog application

⬆

🐍 [Mar 2023] Automate the Boring Stuff with Python Programming

Organisation: Udemy Duration: 1 month Credential: [Link] Repository: [Link]

Description:
  • Automate tasks on your computer by writing simple Python programs.
  • Write programs that can do text pattern recognition with "regular expressions".
  • Programmatically generate and update Excel spreadsheets.
  • Parse PDFs and Word documents.
  • Crawl web sites and pull information from online sources.
  • Write programs that send out email notifications.
  • Use Python's debugging tools to quickly figure out bugs in your code.
  • Programmatically control the mouse and keyboard to click and type for you.

⬆

📉 [Nov 2022] Python for Time Series Data Analysis

Organisation: Udemy Duration: 1 month Credential: [On going] Repository: [Link]

Description:
  • Pandas for Data Manipulation
  • NumPy and Python for Numerical Processing
  • Pandas for Data Visualization
  • How to Work with Time Series Data with Pandas
  • Use Statsmodels to Analyze Time Series Data
  • Evaluate a model's efficiency by comparing training and test data
  • Use Facebook's Prophet Library for forecasting
  • Understand advanced ARIMA models for Forecasting

⬆

📈 [Oct 2022] Statistics for Data Science and Business Analysis

Organisation: Udemy Duration: 3 months Credential: [Link]

Description:
  • Understand the fundamentals of statistics
  • Learn how to work with different types of data
  • How to plot different types of data
  • Calculate the measures of central tendency, asymmetry, and variability
  • Calculate correlation and covariance
  • Distinguish and work with different types of distributions
  • Estimate confidence intervals
  • Perform hypothesis testing
  • Make data driven decisions
  • Understand the mechanics of regression analysis
  • Carry out regression analysis
  • Use and understand dummy variables
  • Understand the concepts needed for data science even with Python and R

⬆

🐼 [Jun 2022] The Complete Pandas Bootcamp 2022: Data Science with Python

Organisation: Udemy Duration: 6 months Credential: [Link]

Description:
  • Bring your data handling & data analysis skills to an outstanding level.
  • Master a complete machine learning project A-Z with Pandas, Scikit-Learn, and Seaborn
  • Practice and master your Pandas skills with quizzes, 150+ exercises, and comprehensive projects
  • Learn and master the most important Pandas workflows for finance
  • Learn the basics of Pandas and Numpy coding
  • Learn and practice all relevant Pandas methods and workflows with real-world datasets
  • Import, clean, and merge messy data and prepare data for machine learning
  • Analyze, visualize, and understand your data with Pandas, Matplotlib, and Seaborn
  • Import financial/stock data from web sources and analyze them with Pandas
  • Learn and master important statistical concepts with scipy

⬆

💻 [Apr 2022] SQL Fundamentals Track

Organisation: DataCamp Duration: 21 hours Credential: [Link]

Description:
  • Introduction to SQL
  • Joining data in SQL
  • Intermediate SQL
  • PostgreSQL summary stats and window functions
  • Functions for Manipulating Data in PostgreSQL

⬆

🦸🏼‍♂️ [Jan 2022] The Complete SQL Bootcamp 2022: Go from Zero to Hero

Organisation: Udemy Duration: 9 hours Credential: [Link]

Description:
  • SQL statement fundamentals (Select, Count, Where, Order by, Limit, In, (I)like)
  • Group by statements (Group by, Having)
  • Joins (As statement, Inner joins, Full outer joins, Left outer joins, Right joins, Union)
  • Advanced SQL commands (Timestamps, extract, mathematical functions, string functions, subquery, self-join)
  • Creating databases and tables (data types, primary & foreign keys, constraints, create table, insert, update, delete, alter table, drop table, check constraint)
  • Conditional expressions and procedures (case, coalesce, cast, nullif, views, import, export)

⬆

👨🏼‍💼 [Jun 2018] Management and Leadership: Growing as a Manager

Organisation: The Open University Business School Duration: 4 weeks Credential: [Link]

Description:

The course offers participants an introduction to the foundation skills and knowledge of a middle manager and leader. The learning activities begin the process of preparing the learner for the Chartered Management Institute (CMI) qualifications in Management and Leadership at Level 5. It introduces them, as experienced practitioners, to the underpinning theory of management and leadership. The course was prepared by The Open University Business School (AMBA, EQUIS, AACSB triple-accredited).

⬆

Reference material

A list of useful reference material.

⬆



