- A.P. Møller - Mærsk A/S
- Copenhagen, Denmark
- https://www.linkedin.com/in/jmahenriques/
Stars
Faster way to switch between clusters and namespaces in kubectl
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Always know what to expect from your data.
A native Rust library for Delta Lake, with bindings into Python
PySpark test helper methods with beautiful error messages
Compare tables within or across databases
Poetry plugin for the asdf version manager [maintainer=@crflynn]
Official implementation of "ADBench: Anomaly Detection Benchmark", NeurIPS 2022.
The official Python client library for the Polygon REST and WebSocket API.
🦀 Small exercises to get you used to reading and writing Rust code!
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Bringing your code and work to the conversations you care about with the GitHub and Microsoft integration
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
Iconic font aggregator, collection, & patcher. 3,600+ icons, 50+ patched fonts: Hack, Source Code Pro, more. Glyph collections: Font Awesome, Material Design Icons, Octicons, & more
🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…
🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.
Feature Extraction Package for Biological Sequences
🔆 A Python implementation of a sum-product network with Gaussian process leaves (SPNGP, arXiv:1809.04400) 📃
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports comp…
A high performance implementation of HDBSCAN clustering.
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
Get protein embeddings from protein sequences
OpenChem: Deep Learning toolkit for Computational Chemistry and Drug Design Research