🌽 corpy
search engine optimization
A complete search engine experience built on top of a 75 GB Wikipedia corpus, with subsecond search latency. Results contain wiki pages ordered by TF/IDF relevance base…
A basic search engine that indexes a corpus for searching and ranks the documents in the data set.
Various indexing and query-based retrieval models, plus the PageRank algorithm, in Python 3.0
Search engine built with Flask, HTML, CSS, and MongoDB, using an inverted index (TF-IDF scoring).
Built a search engine from scratch for a Wikipedia corpus of over 21 million articles (85 GB) that returns search results within 4 seconds. Parsed Wikipedia pages into tokens by applying appropriate tec…
Open source Python package to produce word sketches inspired by Sketch Engine (to make reproducible analyses)
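
Several of the projects listed above rank results by TF-IDF over an inverted index. A minimal sketch of that idea follows; it is not taken from any of these repositories. The function names `build_index` and `search`, the whitespace tokenizer, the raw-count TF, and the unsmoothed `log(N / df)` IDF are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    Assumption: documents are tokenized by lowercasing and splitting on
    whitespace; real projects would add stemming, stop words, etc.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Return doc ids ranked by summed TF-IDF over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the corpus
        # Assumption: plain log(N / df) with no smoothing.
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy corpus, for illustration only.
docs = {
    "a": "wikipedia search engine",
    "b": "search engine with inverted index",
    "c": "word sketches for corpus analysis",
}
index = build_index(docs)
ranked = search(index, len(docs), "inverted index search")
print(ranked)
```

Here document `b` matches all three query terms, `a` matches only the common term `search`, and `c` matches none, so `c` is left out of the ranking.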