NolanT/phrase-finder

Phrase Finder

A utility that returns the 100 most common three-word sequences (phrases) in a given text.

Example Usage

one text file

python3 phrase_finder.py moby_dick.txt

multiple text files

python3 phrase_finder.py moby_dick.txt pride_and_prejudice.txt

stdin

cat pride_and_prejudice.txt | python3 phrase_finder.py 

wildcard text files

python3 phrase_finder.py *.txt
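
The four invocations above come down to two input sources: file paths given as arguments, or stdin when no paths are given (a wildcard such as *.txt is expanded by the shell into a list of paths before the script runs). The sketch below shows one way this could be handled with the standard-library `fileinput` module. The helper name `read_text` is hypothetical, and nothing here is taken from the repository's actual code.

```python
import fileinput


def read_text(paths=None):
    """Return the concatenated text of `paths`, or of stdin if none are given.

    Hypothetical helper for illustration; not the repository's own code.
    """
    # fileinput treats "-" as stdin, so an empty path list falls back to it.
    with fileinput.input(files=paths or ("-",)) as stream:
        return "".join(stream)
```

A caller would typically pass `sys.argv[1:]`, so that `python3 phrase_finder.py a.txt b.txt` reads both files in order and a piped invocation reads stdin.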

Script Requirements

  • The program accepts as arguments a list of one or more file paths (e.g. python3 phrase_finder.py file1.txt file2.txt ...).
  • The program also accepts input on stdin (e.g. cat file1.txt | python3 phrase_finder.py).
  • The program outputs a list of the 100 most common three-word sequences.
  • The program ignores punctuation and line endings and is case-insensitive (e.g. "I love\nsandwiches." should be treated the same as "(I LOVE SANDWICHES!!)"). Contractions must not be split into two words (e.g. "shouldn't" should not become "shouldn t").
  • The program should be well structured and understandable.
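
The normalization and counting requirements above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the names `WORD_RE`, `tokenize`, and `top_phrases` are hypothetical, and the pattern assumes ASCII letters and digits only.

```python
import re
from collections import Counter

# Hypothetical names for illustration; not the repository's own code.
# A word is a run of ASCII letters/digits, optionally joined by apostrophes,
# so "shouldn't" stays one token while other punctuation is dropped.
WORD_RE = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def tokenize(text):
    """Lowercase the text and return its words, keeping contractions whole."""
    # Treat the typographic apostrophe like the ASCII one.
    return WORD_RE.findall(text.lower().replace("\u2019", "'"))


def top_phrases(text, size=3, limit=100):
    """Return up to `limit` (phrase, count) pairs, most common first."""
    words = tokenize(text)
    # Zipping staggered slices yields every consecutive run of `size` words.
    grams = zip(*(words[i:] for i in range(size)))
    return Counter(" ".join(gram) for gram in grams).most_common(limit)
```

With these definitions, `tokenize("I love\nsandwiches.")` and `tokenize("(I LOVE SANDWICHES!!)")` produce the same word list. Because punctuation and line endings are discarded before counting, phrases in this sketch can span sentence and line boundaries.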
