Starred repositories
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Welcome to my GitHub repository. I hope you enjoy solving these puzzles as much as I have enjoyed creating them.
Example end-to-end data engineering project.
A fast-searching and space-saving browser specially designed for programmers.
A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin
A Curated List of Computational Biology Datasets Suitable for Machine Learning