VerisimilitudeX/DNAnalyzer: Precision genomics for everyone, everywhere. Powered by private AI.


DNAnalyzer-modified

Next-Generation On-Device DNA Insights

Private. Precise. Powered by AI.


About DNAnalyzer

DNAnalyzer is a biotechnology research and deployment company. Supported by Anthropic, our mission is to revolutionize DNA analysis by making AI-powered genomic insights accessible to all through on-device computation.

Founded by Piyush Acharya, DNAnalyzer's team includes 46 leading computational biologists and computer scientists from Microsoft Research, the University of Macedonia, and Northeastern University.

Our impact has been recognized by Y Combinator, the organizers of the AI World's Fair Expo, and the CEO of DEV.


Why DNAnalyzer Matters

| Today's Limitation | DNAnalyzer's Innovation |
| --- | --- |
| $100 average cost for DNA sequencing | Completely free |
| Up to $600 for basic health insights | Accessible to underserved communities |
| 78% of companies share genetic data with third parties | 100% private, local computation |
| Data breaches expose millions (23andMe: 6.9M users in 2023) | No central database of sensitive genetic information |

"Unlike a password, compromised genetic data is permanently exposed."


Core Capabilities

  • Codon & Protein Detection – Rapidly identifies protein-coding regions, amino acid chains, and critical genomic indicators.
  • GC-rich Region Analysis – Pinpoints genomic promoter areas with significant biological implications (45-60% GC content).
  • Neurological Genomics – Detects genetic markers associated with neurological conditions (autism, ADHD, schizophrenia).
  • Promoter Element Identification – Locates key transcription initiation sequences (BRE, TATA, INR, DPE).
  • Multi-format FASTA Integration – Supports comprehensive DNA database analysis from uploads or external sources.
  • CLI Automation – Provides a command-line interface for scripting, automation, and large-scale analysis tasks.
  • Ancestry Snapshot (Privacy-Safe) – Estimates continental origin using on-device reference panels.
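To make the first two capabilities concrete, here is a minimal Python sketch of GC-window scanning and forward-strand open reading frame (ORF) detection. It is an illustration only, not DNAnalyzer's implementation: the function names, the 20-base window, the reading of "45-60% GC content" as an inclusive window threshold, and the `min_codons` cutoff are all assumptions.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq: str, window: int = 20, step: int = 1,
                    low: float = 0.45, high: float = 0.60) -> List[Tuple[int, float]]:
    """Return (start, gc_fraction) for each window whose GC fraction is in [low, high].

    The 45-60% band is taken from the feature list; treating it as an
    inclusive per-window threshold is an assumption of this sketch.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        frac = gc_fraction(seq[start:start + window])
        if low <= frac <= high:
            hits.append((start, frac))
    return hits


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) of ATG...stop ORFs in the three forward reading frames.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    codons from ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

A real pipeline would also scan the reverse complement and translate each ORF to amino acids; both steps are omitted here to keep the sketch short.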

See the [Ancestry Snapshot guide](docs/usage/ancestry-snapshot.md) for usage instructions.

New: Interactive web dashboard for in-browser visualization is now available under web/dashboard and communicates with the local REST API at /api.

Automatic Natural Language Reports

After each CLI analysis, DNAnalyzer now requests two summaries from the OpenAI API:

  • Researcher Report – Technical explanation with detailed statistics and terminology.
  • Layperson Report – Plain-language overview highlighting key takeaways.

Both reports are printed to the console once analysis completes if an OPENAI_API_KEY is configured.



Quickstart Guide

Ready to explore your DNA? Begin precise genomic analysis in seconds:

# Clone the repository
git clone https://github.com/VerisimilitudeX/DNAnalyzer.git

# Navigate to project directory
cd DNAnalyzer

# Build the project (Gradle downloads dependencies on the first run)
./gradlew build

Refer to our comprehensive Getting Started Guide for advanced configuration.


Polygenic Health-Risk Scores

DNAnalyzer now includes a lightweight polygenic risk score (PRS) calculator and fun trait predictions, both computed on device. Provide a 23andMe raw-data text file along with a CSV of SNP weights to compute a risk score and list predicted traits:

./gradlew run --args='--23andme my_data.txt --prs assets/risk/heart_disease_prs.csv sample.fa'

Trait predictions and the risk score are printed after the standard DNA analysis. Disclaimer: Trait predictions are provided for educational purposes only and should not be used for medical or health decisions.
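For readers unfamiliar with polygenic scores, the sketch below shows the usual additive calculation: for each weighted SNP, count the copies of the effect allele in the genotype and multiply by the weight. It is a hedged illustration, not DNAnalyzer's code. The 23andMe export is read as tab-separated `rsid`, `chromosome`, `position`, `genotype` lines with `#` comments; the weights CSV layout (`rsid`, `effect_allele`, `weight` columns) is an assumption and may differ from the files under assets/risk.

```python
import csv
import io
from typing import Dict, TextIO, Tuple


def read_23andme(handle: TextIO) -> Dict[str, str]:
    """Map rsid -> genotype string (e.g. 'AG') from a 23andMe raw-data export."""
    genotypes = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        rsid, _chrom, _pos, genotype = fields[:4]
        genotypes[rsid] = genotype
    return genotypes


def read_weights(handle: TextIO) -> Dict[str, Tuple[str, float]]:
    """Read SNP weights; the column names used here are assumed, not documented."""
    weights = {}
    for row in csv.DictReader(handle):
        weights[row["rsid"]] = (row["effect_allele"].upper(), float(row["weight"]))
    return weights


def polygenic_score(genotypes: Dict[str, str],
                    weights: Dict[str, Tuple[str, float]]) -> float:
    """Sum of weight * effect-allele dosage; missing or no-call SNPs add 0."""
    score = 0.0
    for rsid, (allele, weight) in weights.items():
        genotype = genotypes.get(rsid, "")
        score += weight * genotype.upper().count(allele)
    return score


# Tiny in-memory example (made-up rsids and weights, for illustration only).
raw = io.StringIO("# comment\nrs1\t1\t100\tAG\nrs2\t2\t200\tCC\nrs3\t3\t300\t--\n")
table = io.StringIO("rsid,effect_allele,weight\nrs1,A,0.5\nrs2,C,0.25\nrs3,T,1.0\nrs4,G,2.0\n")
score = polygenic_score(read_23andme(raw), read_weights(table))
print(score)  # 1.0
```

The raw sum is only meaningful relative to a reference population, so a practical tool would typically normalize it before reporting.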

REST API

For automated workflows, DNAnalyzer exposes a minimal REST endpoint. Start the Spring Boot application and send a FASTA file to /server/analyze:

curl -F file=@sample.fa http://localhost:8080/server/analyze

The response contains the core pipeline output serialized as JSON, allowing you to script DNAnalyzer from languages like Python or R without the GUI.

Additionally, a /api/file/parse endpoint is available for simply uploading a FASTA or FASTQ file and receiving the parsed sequence.
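As a rough Python equivalent of the curl call above, the following sketch posts a FASTA file to /server/analyze using only the standard library. The helper names are hypothetical, and the response is returned as raw text because the JSON schema is not documented here; pass the result to `json.loads` once you know the fields you need.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, payload: bytes, boundary: str) -> bytes:
    """Encode one file as a multipart/form-data body (what `curl -F` sends)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail


def analyze_fasta(path: str, url: str = "http://localhost:8080/server/analyze") -> str:
    """POST the FASTA file at `path` and return the response body as text.

    Assumes the Spring Boot application is already running locally and
    expects the upload in a form field named "file", as in the curl example.
    """
    boundary = uuid.uuid4().hex
    file_path = Path(path)
    body = build_multipart("file", file_path.name, file_path.read_bytes(), boundary)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `analyze_fasta("sample.fa")` should return the same JSON text that the curl command prints.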

GPU-Accelerated Smith-Waterman

An optional module using PyOpenCL provides GPU acceleration for local sequence alignment. If no compatible GPU is found, the implementation automatically falls back to a pure Python version.

Run the module directly or via the CLI:

python -m src.python.gpu_smith_waterman SEQ1 SEQ2

From the DNAnalyzer CLI you can request a Smith-Waterman alignment by supplying --sw-align together with --align:

java -jar dnanalyzer.jar --align reference.fa --sw-align

See GPU_Smith_Waterman.md for further details.
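For orientation, here is a minimal pure-Python Smith-Waterman sketch of the kind a CPU fallback might use. It is not the code in src.python.gpu_smith_waterman: the function name, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
from typing import Tuple


def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> Tuple[int, str, str]:
    """Local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b); gaps are shown as '-'.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # scoring matrix, first row/column stay 0
    best, best_pos = 0, (0, 0)

    # Fill: each cell is the best of restart (0), diagonal, or a gap in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTACGTTT", "GGACGTGG"))  # (8, 'ACGT', 'ACGT')
```

The fill step costs O(len(a) × len(b)) time and memory. Cells on the same anti-diagonal do not depend on one another, which is why the algorithm is a common target for GPU acceleration.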



Development Roadmap

| Upcoming Development | Description |
| --- | --- |
| Optimized SQL Database | Scalable database for genomic datasets across diverse species |
| Enhanced Neural Network | Integration with 3rd-party genotype datasets (23andMe, AncestryDNA) |
| DIAMOND Implementation | Blending DIAMOND's speed with BLAST's accuracy for cutting-edge analyses |
| AI Trait Predictor Suite | Fun, shareable predictions (taste for cilantro, chronotype, ear-wax type) backed by peer-reviewed SNP studies |
| Secure Share & Compare | Offline-generated, QR-coded summaries let users share limited insights with doctors or friends; no raw genome is ever exposed |

Contribute to DNAnalyzer

We welcome contributions across experience levels.


Academic Citations

Please cite DNAnalyzer as follows:

@software{Acharya_DNAnalyzer_ML-Powered_DNA_2022,
  author = {Acharya, Piyush},
  doi = {10.5281/zenodo.14556577},
  month = oct,
  title = {{DNAnalyzer: ML-Powered DNA Analysis Platform}},
  url = {https://github.com/VerisimilitudeX/DNAnalyzer},
  version = {3.5.0-beta.0},
  year = {2022}
}

⚖ Terms of Use

DNAnalyzer is provided "as-is." Usage of the software implies acceptance of risks and liabilities. DNAnalyzer disclaims responsibility for any loss or damage arising from its use.

For assistance or inquiries, contact: help@dnanalyzer.org.

DNAnalyzer, © Piyush Acharya 2025. DNAnalyzer is a fiscally sponsored 501(c)(3) nonprofit (EIN: 81-2908499) and is licensed under the MIT License.



Impact Metrics

| Metric | Current Value |
| --- | --- |
| GitHub Stars | 147 |
| Forks | 62 |
| Contributors | 46 |
| Monthly FASTA files analyzed* | 5,000+ (self-reported) |
| Total downloads (Gradle/CLI) | 4,042 |
| Deployments via GitHub Pages | 485 |


Community Engagement

  • Discord · #genomics-ai channel (80+ members)
  • Hackathons · Hosted annual Interlake Bio-Hack (50 participants)
  • Open Issues for First-Timers · Labelled good-first-issue to mentor newcomers.
  • Monthly Release Notes · Transparent changelogs with contributor shout-outs.

*Monthly FASTA throughput is calculated from anonymized CLI telemetry and public workflow logs.

Project Growth

Star History Chart

Support DNAnalyzer

Every referral helps fund our nonprofit mission

23andMe

Get 10% off your order
DNAnalyzer earns $20 per referral

23andMe Referral

Ancestry® Membership

Get up to 24% off membership
DNAnalyzer earns $10 per referral

Ancestry Referral