PardhuKadali/flashtext: Extract keywords from sentences or replace keywords in sentences.

FlashText

This module can be used to replace keywords in sentences or extract keywords from sentences.

Installation

$ pip install flashtext

Usage

Extract keywords

>>> from flashtext.keyword import KeywordProcessor
>>> keyword_processor = KeywordProcessor()
>>> # Map the keyword 'Big Apple' to the clean name 'New York'.
>>> keyword_processor.add_keyword('Big Apple', 'New York')
>>> keyword_processor.add_keyword('Bay Area')
>>> keywords_found = keyword_processor.extract_keywords('I love Big Apple and Bay Area.')
>>> keywords_found
['New York', 'Bay Area']

Replace keywords

>>> keyword_processor.add_keyword('New Delhi', 'NCR region')
>>> # Matching is case-insensitive by default, so 'new delhi' is replaced too.
>>> new_sentence = keyword_processor.replace_keywords('I love Big Apple and new delhi.')
>>> new_sentence
'I love New York and NCR region.'

Case-sensitive example

>>> from flashtext.keyword import KeywordProcessor
>>> keyword_processor = KeywordProcessor(case_sensitive=True)
>>> keyword_processor.add_keyword('Big Apple', 'New York')
>>> keyword_processor.add_keyword('Bay Area')
>>> # 'big Apple' does not match 'Big Apple' when case_sensitive=True.
>>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.')
>>> keywords_found
['Bay Area']

No clean name for keywords

>>> from flashtext.keyword import KeywordProcessor
>>> keyword_processor = KeywordProcessor()
>>> # Without a clean name, the keyword itself is returned.
>>> keyword_processor.add_keyword('Big Apple')
>>> keyword_processor.add_keyword('Bay Area')
>>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.')
>>> keywords_found
['Big Apple', 'Bay Area']

API doc

Documentation can be found at FlashText Read the Docs.

Test

$ git clone https://github.com/vi3k6i5/flashtext
$ cd flashtext
$ pip install pytest
$ python setup.py test

Why not Regex?

FlashText uses a custom algorithm based on the Aho-Corasick algorithm and a trie dictionary.

Doing the same work with regex takes far longer:

Docs count     # Keywords     Regex        flashtext
1.5 million    2K             16 hours     Not measured
2.5 million    10K            15 days      15 mins
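
To give a rough feel for why a trie-based lookup scales better with the number of keywords, here is a minimal sketch of keyword extraction over a character trie. It is not FlashText's actual implementation: the names build_trie, extract_keywords, WORD_CHARS and END are hypothetical, the word-boundary rule is an assumption, and matching is always case-insensitive. The point it illustrates is that the sentence is scanned left to right against one shared trie instead of being tested against every keyword in turn.

```python
import string

# Characters treated as part of a word; anything else counts as a boundary.
# This character set is an assumption of the sketch.
WORD_CHARS = set(string.ascii_letters + string.digits + "_")

# Hypothetical marker key: stores the clean name at the end of a keyword.
# It is longer than one character, so it never collides with a trie edge.
END = "_end_"


def build_trie(keywords):
    """Build a character trie from a {keyword: clean_name} mapping."""
    trie = {}
    for keyword, clean_name in keywords.items():
        node = trie
        for ch in keyword.lower():
            node = node.setdefault(ch, {})
        node[END] = clean_name
    return trie


def extract_keywords(trie, sentence):
    """Return the clean names of keywords found, scanning left to right."""
    text = sentence.lower()
    found = []
    i, n = 0, len(text)
    while i < n:
        # Only start a match at the beginning of a word.
        if i > 0 and text[i - 1] in WORD_CHARS:
            i += 1
            continue
        node, j = trie, i
        match, match_end = None, i
        while j < n and text[j] in node:
            node = node[text[j]]
            j += 1
            # Accept a match only if it ends on a word boundary.
            if END in node and (j == n or text[j] not in WORD_CHARS):
                match, match_end = node[END], j
        if match is not None:
            found.append(match)
            i = match_end  # continue after the longest match
        else:
            i += 1
    return found


trie = build_trie({"Big Apple": "New York", "Bay Area": "Bay Area"})
print(extract_keywords(trie, "I love Big Apple and Bay Area."))
# prints ['New York', 'Bay Area']
```

In this sketch, adding more keywords grows the trie but does not add another scan of the sentence, whereas a naive regex approach typically tries each keyword pattern separately.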

The idea for this library came from the following StackOverflow question.

Contribute

License

The project is licensed under the MIT license.
