I am a bioinformatician, scalable operations architect, scientist, artist, software engineer, and systems thinker working at the intersection of biology, clinical genomics, data science, and cutting-edge computational solutions.
🔬 Scientist: Passionate about unraveling biological complexity.
💻 Open Source Advocate: Building tools to accelerate discovery and collaboration.
🌱 Innovator: Driving sustainable, impactful solutions in informatics and beyond.
🏗️ Builder: Contributor to the 🧬 Human Genome Project and several successful 🏥 clinical diagnostic startup companies.
- 🌐 Daylily Informatics Consulting: dyly.bio
- 💼 LinkedIn: john--major
- 🧪 Consulting Projects: Daylily-Informatics GitHub
- 🔧 Personal Repositories: iamh2o GitHub
🧬 Bioinformatics: Clinical WGS, RNA-seq, and variant analysis.
📊 Data Science: Python, R, machine learning.
☁️ HPC & Cloud Computing: AWS, Slurm, high-performance computing.
🌐 Open Source Development: FastAPI, Snakemake, and more.
🏥 Clinical Diagnostic Operations: designing and running scalable diagnostic workflows.
⚖️ Clinical Diagnostic Regulation & Compliance: expertise in CLIA/CAP standards and certification processes.
daylily-ephemeral-cluster: Infrastructure as code allows on-demand creation of arbitrarily large self-scaling clusters.
Features:
- Built using AWS ParallelCluster and the ParallelCluster UI.
- Scans AWS Regions and AZs to determine best spot market pricing, and creates clusters where spot pricing is most competitive.
- Highly performant globally shared filesystem via FSx Lustre mirroring reference and other data from S3.
- Reproducible and predictable runtimes and costs.
- Automatable.
- Fine-grained budget tracking of jobs and resources.
- Real-time cost reporting and decision gating capabilities.
- Runs any Slurm-based workflow manager (Snakemake, Cromwell, Nextflow, ...).
- Tight coupling of reference data allows highly performant and nimble ephemeral cluster lifecycles.
- Designed to be ephemeral: packaged tools facilitate rapid creation, monitoring, updating, archiving, and deletion of ephemeral clusters.
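As a rough illustration of the spot-price scan described in the feature list, here is a minimal Python sketch that picks the availability zone with the lowest average spot price from a set of quotes. This is not the daylily-ephemeral-cluster implementation: the function name `cheapest_zone`, the tuple layout, and the prices are all hypothetical, and a real scan would pull current prices from AWS rather than a hard-coded list.

```python
from collections import defaultdict
from statistics import mean


def cheapest_zone(quotes):
    """Return (zone, average_price) for the zone with the lowest mean spot price.

    `quotes` is an iterable of (zone, instance_type, usd_per_hour) tuples.
    The layout is an assumption made for this sketch.
    """
    by_zone = defaultdict(list)
    for zone, _instance_type, price in quotes:
        by_zone[zone].append(price)
    if not by_zone:
        raise ValueError("no spot price quotes supplied")
    # Average across instance types so one cheap outlier does not decide alone.
    averages = {zone: mean(prices) for zone, prices in by_zone.items()}
    best = min(averages, key=averages.get)
    return best, averages[best]


# Made-up prices, purely for illustration; not real market data.
example_quotes = [
    ("us-west-2a", "c6i.32xlarge", 1.90),
    ("us-west-2a", "m6i.32xlarge", 2.10),
    ("us-west-2b", "c6i.32xlarge", 1.40),
    ("us-west-2b", "m6i.32xlarge", 1.80),
    ("eu-central-1a", "c6i.32xlarge", 2.30),
]

zone, price = cheapest_zone(example_quotes)
print(f"Cheapest zone: {zone} at ~${price:.2f}/hr")
```

Averaging per zone is only one possible ranking; a median, or a weighting by the instance types a workflow actually requests, may be a better fit.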
daylily-omics-analysis: Achieving ~$2–$5 per 30x no-amp WGS from FASTQ to VCF.
Features:
- Optimized to run within a daylily-ephemeral-cluster framework.
- Industry-leading accuracy, speed, cost, auditability, scalability, QC views & observability.
- Reproducible, sustainable, growing & open-source omics analysis workflows.
- Automated infrastructure management with predictive and real-time cost visibility for storage, data transfer, and compute.
- Open source & free: Deploy daylily in ~1hr and begin returning completed WGS analysis shortly thereafter.
snakemake-executor-plugin-pcluster-slurm
A plugin designed to integrate Snakemake workflows with AWS ParallelCluster’s Slurm workload manager.
bloom:
A templated, abstract, polymorphic, and opinionated LIMS for efficient laboratory data management. Real-time COGS monitoring and operational decision gating.
zebra_day:
A library and API for network-connected Zebra printers, managing ZPL label templates and numerous printers with ease.
- xcures_day: Python wrapper for the Xcures REST API.
- fedex_tracking_day: Python module for programmatic shipment tracking.
- img_stitcher_day: Tool for merging images of tube exteriors for analysis.
- rgbw_colorspace_converter: Utility for RGB to RGBW conversion, supporting HSV, HSI, HSL, and HEX, focused on LED-based projects.
- gravity_well: Imports text, markdown, and PDF files into Obsidian with NLP-derived tags and enhanced metadata tracking.
- pyramidtriangles: Software for artistic LED installations, derived from the grgbrn baaahs2014 codebase.
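For readers curious what the RGB to RGBW conversion mentioned for rgbw_colorspace_converter involves, below is a minimal sketch of one common, naive approach: move the component shared by all three color channels onto the white LED. This is an assumption for illustration only, not the library's actual algorithm or API; the function name `rgb_to_rgbw` is hypothetical.

```python
def rgb_to_rgbw(r, g, b):
    """Convert 8-bit RGB to RGBW by moving the shared (gray) part to white.

    Naive illustrative method: it ignores the white LED's color temperature
    and any gamma correction.
    """
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in the range 0-255")
    w = min(r, g, b)  # the amount of "white" common to all three channels
    return r - w, g - w, b - w, w


print(rgb_to_rgbw(255, 200, 100))  # (155, 100, 0, 100)
print(rgb_to_rgbw(255, 255, 255))  # (0, 0, 0, 255)
```

With this approach a pure gray or white input lights only the white LED, which is usually the goal on RGBW strips.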
👀 Explore more in my repositories or get in touch!
I’m committed to contributing to the global scientific community by creating tools and sharing knowledge. Let’s collaborate to push the boundaries of what’s possible in science and technology.
👀 Interested in what I’m building? Follow, star, or get in touch.