GitHub - olsgaard/phonetic_search: Phonetic search algorithms in Python

olsgaard/phonetic_search

Phonetic search algorithms in Python

This repository contains a few phonetic search / indexing algorithms implemented in Python.
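For readers new to phonetic indexing, here is a minimal sketch of one well-known algorithm of this kind, American Soundex. This README does not say which algorithms the repository implements, so this is an illustration only: the function name `soundex` and the code below are assumptions and are not taken from the repository.

```python
# Illustrative sketch of American Soundex; not code from this repository.
# Consonants are grouped into six digit classes; vowels carry no digit.
_CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return a 4-character Soundex key (hypothetical helper name)."""
    # Keep ASCII letters only, case-insensitively.
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""

    first = letters[0]
    result = [first.upper()]
    prev = _CODES.get(first, "")  # code of the previous significant letter

    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate equal codes.
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        # Vowels reset prev to "", so a repeated code after a vowel is kept.
        prev = code
        if len(result) == 4:
            break

    # Pad short keys with zeros to a fixed length of four.
    return "".join(result).ljust(4, "0")
```

Names that sound alike map to the same key, so `soundex("Robert")` and `soundex("Rupert")` return the same value. That shared key is what makes such algorithms useful for searching a list of names.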

Unless otherwise noted, all code is (C) Copyright 2015, Mads Olsgaard, released under the BSD 3-Clause license.


The repository also contains two corpus files:

  1. names.csv
  2. badwords.csv

names.csv is a list of first and last names collected from the 1990 US census. It contains 155,947 unique names.

Source: http://www.census.gov/topics/population/genealogy/data/1990_census/1990_census_namefiles.html

badwords.csv is a collection of English swearwords collected online. Words have not been checked for offensiveness or correctness.

Sources: mostly noswearing.com and Google's official list of bad words.

Both corpora are considered public domain, and free to use.
