Phonetic search algorithms in Python

This repository contains a few phonetic search / indexing algorithms implemented in Python.
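To give a feel for what phonetic indexing does, here is a minimal sketch of classic American Soundex. It is only an illustration and is not taken from this repository's code; the function names `soundex` and `build_index` are hypothetical. The idea is that names which sound alike map to the same short key, so a lookup by key finds spelling variants.

```python
from collections import defaultdict

# Soundex digit groups: consonants with similar sounds share a digit.
_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Return the 4-character American Soundex key (illustrative sketch)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]                    # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")   # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = _CODES.get(ch, "")       # vowels (and Y) have no code
        if code and code != prev:
            key += code
        prev = code                     # a vowel resets prev, allowing repeats
    return (key + "000")[:4]            # pad or truncate to four characters


def build_index(names):
    """Group names by their Soundex key (hypothetical helper)."""
    index = defaultdict(list)
    for name in names:
        index[soundex(name)].append(name)
    return dict(index)


if __name__ == "__main__":
    index = build_index(["Robert", "Rupert", "Smith", "Smyth"])
    print(index)
```

A search for "Rupert" would then look up `index[soundex("Rupert")]` and also find "Robert". The algorithms in this repository apply the same key-based idea with their own rule sets.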

The repository also contains two corpus files.

names.csv is a list of first and last names collected from the 1990 US census. It contains 155,947 unique names.

badwords.csv is a collection of English swear words gathered online. The words have not been checked for offensiveness or correctness.

Sources: mostly words from noswearing.com and Google's official list of bad words.

Both corpora are considered public domain and are free to use.

