
Mike-Q/hadoop-pagerank

 
 


Hadoop PageRank

An implementation of the PageRank algorithm that uses the Apache Hadoop framework.

Execute the program

  • Install Hadoop on your machine ([OSX], [Linux])
  • Pick a dataset from the Stanford web graphs collection
  • Copy the dataset into your Hadoop file system
  • Create the directory that will contain the output
  • Build a JAR from this source code and name it pagerank.jar
  • Launch the program with Hadoop: hadoop jar pagerank.jar --input <in> --output <out>
  • Browse the PageRank output, which can be found in the Hadoop file system
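
To make the input step more concrete, here is a minimal sketch of reading a web graph into memory. It assumes the plain-text edge-list layout in which the Stanford web graphs are commonly distributed: comment lines starting with '#', then one whitespace-separated "FromNodeId ToNodeId" pair per line. The class EdgeListParser and its parse method are hypothetical names used for illustration and are not taken from this repository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of this repository.
class EdgeListParser {

    // Builds a map from each source node to the list of nodes it links to.
    // Assumed input: '#' comment lines and "FromNodeId ToNodeId" pairs.
    static Map<Integer, List<Integer>> parse(List<String> lines) {
        Map<Integer, List<Integer>> outLinks = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and '#' comment headers.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            int from = Integer.parseInt(parts[0]);
            int to = Integer.parseInt(parts[1]);
            outLinks.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        }
        return outLinks;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "# FromNodeId\tToNodeId",
            "0\t1",
            "0\t2",
            "1\t2");
        // Prints the adjacency lists of the three sample edges.
        System.out.println(parse(sample));
    }
}
```

A real dataset would be read from the Hadoop file system rather than from an in-memory list; the sketch only shows the shape of the data the program is given.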

Usage reference

  • --help (-h): display the help text
  • --damping (-d): the damping factor [OPTIONAL] [DEFAULT = 0.85]
  • --count (-c): the number of iterations [OPTIONAL] [DEFAULT = 2]
  • --input (-i): the directory of the input graph [REQUIRED]
  • --output (-o): the directory of the output result [REQUIRED]
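
To illustrate what the damping factor and the iteration count control, here is a small in-memory sketch of the textbook PageRank iteration. It is not this repository's MapReduce code: the class PageRankSketch, the array-based graph, the (1 - d) / N normalization, and the choice to ignore nodes without outgoing links are all assumptions made for the example.

```java
import java.util.Arrays;

// Illustrative sketch only; the repository's Hadoop jobs may differ.
class PageRankSketch {

    // outLinks[i] lists the nodes that node i links to.
    // damping corresponds to --damping, count to --count.
    static double[] pageRank(int[][] outLinks, double damping, int count) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int iter = 0; iter < count; iter++) {
            double[] next = new double[n];
            // Every node receives the same base share (assumed normalization).
            Arrays.fill(next, (1.0 - damping) / n);
            for (int from = 0; from < n; from++) {
                if (outLinks[from].length == 0) {
                    continue; // nodes without out-links pass nothing on in this sketch
                }
                // A node splits its damped rank evenly among its out-links.
                double share = damping * rank[from] / outLinks[from].length;
                for (int to : outLinks[from]) {
                    next[to] += share;
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Tiny example graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
        int[][] graph = { {1, 2}, {2}, {0} };
        // Uses the documented defaults: damping 0.85, 2 iterations.
        System.out.println(Arrays.toString(pageRank(graph, 0.85, 2)));
    }
}
```

A higher iteration count brings the ranks closer to their converged values, at the cost of more passes over the graph; with the default of 2 the result is only a rough approximation.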

Languages

  • Java 100.0%