Import the open IMDb datasets into Neo4j.
This uses the `neo4j-admin import` tool, which requires an empty database, so it works best with a clean install of Neo4j on a high-memory instance. It has been tested on an instance running Ubuntu 18.04.
I created this to be able to see which actors/actresses were in common between various shows, but the data is limited as it only has the "principals" for each episode, not the entire cast. Nevertheless, it is still possible to see some patterns, particularly when looking at the lead characters of shows/films.
To set up (each run resets the database):
./download.sh
./install.sh
To query:
cypher-shell
Node labels: Person, Title
Relationships: (:Person)-[:APPEARED_IN]->(:Title), (:Title)-[:EPISODE_OF]->(:Title)
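As a rough illustration of how these labels and relationships fit together, the sketch below looks for people who appear in episodes of two different shows. The property names `primaryTitle` and `primaryName` are assumptions borrowed from the IMDb column names and may not match what the import actually creates. The show names and the helper `run_common_cast_query` are hypothetical.

```shell
# Sketch only: primaryTitle / primaryName are assumed property names,
# and 'Show A' / 'Show B' are placeholders for real series titles.
COMMON_CAST_QUERY=$(cat <<'CYPHER'
MATCH (p:Person)-[:APPEARED_IN]->(:Title)-[:EPISODE_OF]->(a:Title {primaryTitle: 'Show A'}),
      (p)-[:APPEARED_IN]->(:Title)-[:EPISODE_OF]->(b:Title {primaryTitle: 'Show B'})
RETURN DISTINCT p.primaryName;
CYPHER
)

# Hypothetical helper: pipes the query into cypher-shell on stdin.
# Any arguments (for example credentials) are passed through unchanged.
run_common_cast_query() {
  printf '%s\n' "$COMMON_CAST_QUERY" | cypher-shell "$@"
}
```

Calling `run_common_cast_query` after the import has finished should print one row per shared person, provided the assumed property names exist in the database.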
- Pre-process TSV to only include actors and actresses
- Pre-build some queries
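The pre-processing item could be approached along the following lines. This is a minimal sketch, not part of the existing scripts: it assumes the principals file is the uncompressed `title.principals.tsv` from the IMDb datasets and that its fourth tab-separated column is `category`. The function name `filter_principals` is hypothetical.

```shell
# Sketch only: keeps the header row plus rows whose 4th column
# (assumed to be "category") is "actor" or "actress".
# Reads TSV on stdin and writes the filtered TSV to stdout.
filter_principals() {
  awk -F'\t' 'NR == 1 || $4 == "actor" || $4 == "actress"'
}
```

For example, `filter_principals < title.principals.tsv > title.principals.actors.tsv` would write a reduced file that could then be fed to the import step.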