Use these scripts to download the MovieLens 10M dataset and index it in Elasticsearch:

setup-env.sh

Sets up an environment that can be used to index MovieLens data. It fetches:

MovieLens 10M dataset

HetRec dataset to augment MovieLens

Elasticsearch 1.4* for indexing

Marvel (+ installs Marvel into Elasticsearch) for query exploration

Kibana for data exploration

post_movies.py

usage: post_movies.py [-h] [--lens lens] [--clear clearance] [--stop clearonly]

Parse movielens formatted information and post message therein to a running elasticsearch instance.

optional arguments:
  -h, --help         show this help message and exit
  --lens lens        Path to movielens directory in local filesystem.
  --clear clearance  Set to "true" to clear the existing index before re-indexing.
  --stop clearonly   Only clear index, do not add more documents.
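The usage text above can be reproduced with a small argparse definition. The following is a minimal sketch reconstructed from the help output only, not the script's actual source; the function name build_parser and the example path ml-10M100K are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented usage (reconstruction, not the original code)."""
    parser = argparse.ArgumentParser(
        prog="post_movies.py",
        description="Parse movielens formatted information and post message "
                    "therein to a running elasticsearch instance.",
    )
    # All three options take a value, as the metavars in the usage line suggest.
    parser.add_argument("--lens", metavar="lens",
                        help="Path to movielens directory in local filesystem.")
    parser.add_argument("--clear", metavar="clearance",
                        help='Set to "true" to clear the existing index before re-indexing.')
    parser.add_argument("--stop", metavar="clearonly",
                        help="Only clear index, do not add more documents.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
# "ml-10M100K" is only an example directory name.
args = build_parser().parse_args(["--lens", "ml-10M100K", "--clear", "true"])
print(args.lens, args.clear, args.stop)
```

Options that are not passed come back as None, so a caller can tell "not given" apart from an explicit value.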

Index names used:

movies - movie information

ratings - for each rating: the user, the rating value, and the title of the rated movie

tags - tags with timestamp and user information
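To make the three indexes more concrete, here is a hedged sketch of how lines from the MovieLens 10M files (fields separated by "::") could be turned into documents. The function names and the document field names (id, title, genres, user, rating, movie, tag, timestamp) are assumptions for illustration and may differ from what post_movies.py actually posts; the rating and tag sample lines are made up.

```python
def parse_movie(line):
    """movies.dat line: MovieID::Title::Genres (genres separated by '|')."""
    parts = line.rstrip("\n").split("::")
    # Re-join the middle parts in case a title itself contains "::".
    movie_id, title, genres = parts[0], "::".join(parts[1:-1]), parts[-1]
    return {"id": int(movie_id), "title": title, "genres": genres.split("|")}


def parse_rating(line, titles):
    """ratings.dat line: UserID::MovieID::Rating::Timestamp."""
    user_id, movie_id, rating, timestamp = line.rstrip("\n").split("::")
    return {
        "user": int(user_id),
        "rating": float(rating),
        # Look up the title so each rating document carries the rated movie's title.
        "title": titles.get(int(movie_id)),
        "timestamp": int(timestamp),
    }


def parse_tag(line):
    """tags.dat line: UserID::MovieID::Tag::Timestamp."""
    user_id, movie_id, *tag_parts, timestamp = line.rstrip("\n").split("::")
    return {
        "user": int(user_id),
        "movie": int(movie_id),
        "tag": "::".join(tag_parts),
        "timestamp": int(timestamp),
    }


movie = parse_movie("1::Toy Story (1995)::Adventure|Animation|Children|Comedy|Fantasy")
titles = {movie["id"]: movie["title"]}
rating = parse_rating("7::1::4.5::1000000000", titles)  # made-up sample line
tag = parse_tag("7::1::pixar::1000000001")              # made-up sample line
print(movie, rating, tag, sep="\n")
```

Resolving the title at parse time is one way to get the movie title into each ratings document without a join at query time.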

post_movie_details.py

usage: post_movie_details.py [-h] [--datadir datadir] [--clear clearance] [--stop clearonly]

Parse hetrec formatted information and post details therein to a running elasticsearch instance. Index used: movie_details

optional arguments:
  -h, --help         show this help message and exit
  --datadir datadir  Path to data directory in local filesystem.
  --clear clearance  Set to "true" to clear the existing index before re-indexing.
  --stop clearonly   Only clear index, do not add more documents.
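The interplay of --clear and --stop described above can be sketched as a small decision function. This is an assumption about the control flow, not the scripts' actual logic: in particular, treating any non-empty --stop value as "clear only" is a guess, and plan_actions and the action names are hypothetical.

```python
def plan_actions(clear=None, stop=None):
    """Return the steps a run would perform for the given option values.

    Assumed semantics (not confirmed by the source):
    - clear == "true": delete the existing index, then index documents.
    - stop given (any non-empty value): delete the index and do nothing else.
    """
    actions = []
    if clear == "true" or stop:
        actions.append("delete_index")
    if not stop:
        actions.append("index_documents")
    return actions


print(plan_actions())                # default run
print(plan_actions(clear="true"))    # clear, then re-index
print(plan_actions(stop="true"))     # clear only
```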

MaineC/recsys