Tools to construct and process Common Crawl webgraphs
Updated May 29, 2025 - Java
A sample application that demonstrates how to build a graph processing platform to analyze sources of emotional influence on Twitter.
Projects done in the Cloud Computing course.
Search Engine projects
A distributed algorithm applied to the Bitcoin blockchain that builds a new representation of the transactions: a clustered graph that groups all addresses belonging to the same owner or organization.
Search Engine for Books (Java, Apache Lucene, crawler4j, Apache Spark)
A high-performance search engine that crawls, indexes, and ranks web content, supports Boolean queries and phrase searching, and provides an attractive web interface
Link ranking with Apache Giraph for Apache Nutch
Coursework for CS550: Massive Data Mining. Topics covered include MapReduce, Association Rules, Frequent Itemsets, Locality-Sensitive Hashing (LSH), Singular Value Decomposition (SVD), PageRank, k-means, Modularity, Spectral Clustering, Clique-based communities, and Clustering Data Streams.
Command line tool to compute PageRank scores over RDF graphs
Service that summarises entities in RDF graphs
Applies Elasticsearch and Google's PageRank algorithm to search UML models
Uses the PageRank and inverse PageRank algorithms to compute a PageRank value and an inverse PageRank value for each website, then sorts the websites from highest to lowest score. It counts the normal and spam websites among the top N websites and displays the result visually
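The top-N counting step in the entry above can be illustrated with a small sketch. It is not taken from that repository. It assumes that inverse PageRank means running PageRank on the graph with every edge reversed (a common definition, but an assumption here), and the names `TopNSpamCount`, `reverse`, and `countSpamInTopN` are hypothetical. The scores are expected to come from any PageRank implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical helper class; not part of any listed repository.
public class TopNSpamCount {

    /**
     * Reverses every edge: if u links to v in the input, v links to u in the output.
     * Running PageRank on the result is assumed to give the "inverse PageRank".
     */
    public static int[][] reverse(int[][] adjacency) {
        int n = adjacency.length;
        List<List<Integer>> incoming = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            incoming.add(new ArrayList<>());
        }
        for (int u = 0; u < n; u++) {
            for (int v : adjacency[u]) {
                incoming.get(v).add(u);
            }
        }
        int[][] reversed = new int[n][];
        for (int v = 0; v < n; v++) {
            reversed[v] = incoming.get(v).stream().mapToInt(Integer::intValue).toArray();
        }
        return reversed;
    }

    /** Sorts sites by score, highest first, and counts the spam-labelled ones in the first topN. */
    public static int countSpamInTopN(double[] scores, boolean[] isSpam, int topN) {
        return (int) IntStream.range(0, scores.length)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> scores[i]).reversed())
                .limit(topN)
                .filter(i -> isSpam[i])
                .count();
    }
}
```

The number of normal websites among the top N is then `Math.min(topN, scores.length) - countSpamInTopN(scores, isSpam, topN)`.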
Uses MapReduce to calculate PageRank for Wikipedia pages while handling dead ends and spider traps
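For readers unfamiliar with the two problems named in the entry above: a dead end is a page with no outgoing links, which leaks rank out of the system, and a spider trap is a group of pages that only link among themselves, which slowly absorbs all the rank. The sketch below shows one common remedy on a single machine: a damping factor (random teleportation) for spider traps, plus uniform redistribution of the rank held by dead ends. It does not use MapReduce and is not that repository's code. The class name `PageRankSketch` and the adjacency-array input are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical single-machine sketch; the listed project uses MapReduce instead.
public class PageRankSketch {

    /**
     * Power iteration for PageRank.
     *
     * @param adjacency  adjacency[i] lists the nodes that node i links to
     * @param damping    probability of following a link (0.85 is a common choice)
     * @param iterations number of power-iteration steps to run
     * @return rank vector that sums to 1 for a non-empty graph
     */
    public static double[] pageRank(int[][] adjacency, double damping, int iterations) {
        int n = adjacency.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double deadEndMass = 0.0;

            for (int i = 0; i < n; i++) {
                if (adjacency[i].length == 0) {
                    // Dead end: hold its rank aside so it is not lost.
                    deadEndMass += rank[i];
                } else {
                    // Split this node's rank evenly across its outgoing links.
                    double share = rank[i] / adjacency[i].length;
                    for (int j : adjacency[i]) {
                        next[j] += share;
                    }
                }
            }

            // Teleportation keeps spider traps from absorbing all the rank;
            // the dead-end mass is spread uniformly over every node.
            double base = (1.0 - damping) / n + damping * deadEndMass / n;
            for (int j = 0; j < n; j++) {
                next[j] = base + damping * next[j];
            }
            rank = next;
        }
        return rank;
    }
}
```

As a usage example, `PageRankSketch.pageRank(new int[][] { {1, 2}, {2}, {}, {3} }, 0.85, 50)` ranks a four-node graph in which node 2 is a dead end and node 3 is a one-node spider trap (it links only to itself). Because of teleportation and dead-end redistribution, every node keeps a positive rank and the ranks still sum to 1.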
PageRank on Hadoop
A PageRank implementation on top of Neo4j