
This repository contains Python code for text classification and analysis of e-commerce sales data. The script processes textual descriptions of products and categorizes them into predefined categories using a Naive Bayes classifier. It also includes various analysis and visualization methods to explore the dataset.

DummuMounika/Ecommerce-Sales-Categorization


Ecommerce-Sales-Categorization

Python script for text classification and analysis of e-commerce sales data.

Overview

This repository contains a Python script that processes textual descriptions of products from an e-commerce dataset and categorizes them into predefined categories using a Naive Bayes classifier. Additionally, the script provides various analysis and visualization methods to explore the dataset, including plotting category distribution, analyzing top customers, and visualizing sales by country and month.
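The aggregations behind the category distribution and sales-by-country views can be sketched with the standard library alone. This is a minimal illustration, not the script's actual code: the record fields (`Category`, `Country`, `Amount`) and the sample values are hypothetical, and plotting is left out.

```python
from collections import Counter, defaultdict

# Hypothetical sales records; field names and values are assumptions
# made for illustration, not taken from the repository's dataset.
sales = [
    {"Category": "Kitchen", "Country": "United Kingdom", "Amount": 12.5},
    {"Category": "Clothing", "Country": "France", "Amount": 30.0},
    {"Category": "Kitchen", "Country": "France", "Amount": 7.5},
]

# Category distribution: how many records fall into each category.
category_counts = Counter(record["Category"] for record in sales)

# Sales by country: sum of amounts per country.
sales_by_country = defaultdict(float)
for record in sales:
    sales_by_country[record["Country"]] += record["Amount"]

print(category_counts.most_common())
print(dict(sales_by_country))
```

Counts and totals in this shape can be passed directly to a bar chart, for example with Matplotlib's `plt.bar`.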

Features

Text Classification: Utilizes a Naive Bayes classifier to categorize product descriptions.
Natural Language Processing (NLP): Preprocesses text data with tokenization, lemmatization, and filtering of invalid words.
Analysis and Visualization: Provides insights into the dataset through various analysis and visualization methods.
Error Handling: Handles file loading errors and unexpected errors during execution.
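The text classification feature can be sketched as follows. This is a minimal example under stated assumptions, not the script's actual implementation: it assumes a bag-of-words representation (`CountVectorizer`) and the multinomial variant of Naive Bayes (`MultinomialNB`), and the sample descriptions and categories are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training rows; in the real workflow these would come from
# the 'Description' and 'Category' columns of the training CSV.
descriptions = [
    "white ceramic mug",
    "blue ceramic teapot",
    "red cotton t-shirt",
    "black cotton hoodie",
]
categories = ["Kitchen", "Kitchen", "Clothing", "Clothing"]

# Assumption: word counts fed into multinomial Naive Bayes.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(descriptions, categories)

# Classify product descriptions the model has not seen before.
predicted = model.predict(["green ceramic bowl", "grey cotton socks"])
print(list(predicted))
```

Wrapping both steps in a pipeline ensures that new descriptions are vectorized with the same vocabulary that was learned during training.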

Usage

  1. Ensure Python and the required libraries (scikit-learn, NLTK, Matplotlib) are installed.
  2. Clone this repository to your local machine.
  3. Prepare the training dataset in CSV format with 'Description' and 'Category' columns.
  4. Run the script (text_classifier.py), passing the paths to the input files as arguments.
  5. Explore the output files, such as predicted_categories.json and predicted_categories.csv.
  6. Analyze the results and visualizations generated by the script.
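Step 3 and the error handling mentioned under Features can be illustrated with a short sketch. It assumes only what the steps above state, namely a CSV with 'Description' and 'Category' columns. The helper name `load_training_rows`, the file names, and the sample rows are hypothetical and do not come from text_classifier.py.

```python
import csv
import os
import tempfile

REQUIRED_COLUMNS = {"Description", "Category"}


def load_training_rows(path):
    """Return (description, category) pairs, or an empty list if loading fails.

    Hypothetical helper for illustration; not part of text_classifier.py.
    """
    try:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
            if missing:
                raise ValueError(f"missing columns: {sorted(missing)}")
            return [(row["Description"], row["Category"]) for row in reader]
    except (OSError, ValueError) as error:
        # Covers a missing or unreadable file and a CSV without the expected header.
        print(f"Could not load {path}: {error}")
        return []


with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny training file in the expected format.
    sample_path = os.path.join(tmp, "training.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Description", "Category"])
        writer.writerow(["white ceramic mug", "Kitchen"])
        writer.writerow(["red cotton t-shirt", "Clothing"])

    rows = load_training_rows(sample_path)
    # A path that does not exist is reported and yields an empty list.
    missing_rows = load_training_rows(os.path.join(tmp, "does_not_exist.csv"))

print(rows)
```

Returning an empty list instead of raising keeps the sketch short; a real script might prefer to exit with a non-zero status when the training file cannot be read.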

Acknowledgements

  1. scikit-learn: Library for machine learning in Python.
  2. NLTK: Toolkit for natural language processing.
  3. Matplotlib: Visualization library in Python.

Authors

MOUNIKA DUMMU
MANEESH SETTIPETA
VIKRAM SAMUDRALA
