Stars
Collection of Taiwan Rental House Data from Public Website
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.
AWS Blog (Somatic Variant Calling)
SDK for GPU accelerated genome assembly and analysis
Memory for AI Agents; Announcing OpenMemory MCP - local and secure memory management.
A machine learning toolkit for log parsing [ICSE'19, DSN'16]
This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Open source hyperconverged infrastructure (HCI) software
Kubebuilder - SDK for building Kubernetes APIs using CRDs
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
The Web framework for perfectionists with deadlines.
Data Science Roadmap from A to Z
fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
Snakemake profile for running jobs on an LSF cluster
A parallel implementation of gzip for modern multi-processor, multi-core machines.
Genotype dimension reduction research. Code for manuscript "UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts"
Hail-based pipelines for annotating variant callsets and exporting them to Elasticsearch
Hail helper functions for the gnomAD project and Translational Genomics Group
Official code repository for GATK versions 4 and up
Cloud-native genomic dataframes and batch computing
Official git repository for Biopython (originally converted from CVS)
A structural variation pipeline for short-read sequencing
Repository for the CWL standards. Use https://cwl.discourse.group/ for support 😊
A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin