8000 GitHub - victorcarre6/reaction-optimization-tool
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

victorcarre6/reaction-optimization-tool

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Reaction Optimization Tool

This project aims to develop a platform for predicting chemical reaction yields by combining molecular structure information, experimental conditions, and machine learning models. It focuses on building a data-driven approach to support chemists in optimizing reactions efficiently.


Project Objectives

  • Data Collection, Cleaning, and Feature Engineering

    Gather a reaction dataset with multiple type of reactions (e.g. coupling reactions, catalyzed transformations, classic organic reactions…) Extract relevant data such as SMILES of reactants/products/catalysts/solvents, temperature, reaction time, and yields. Use RDKit to convert SMILES into molecular descriptors (Morgan fingerprints, logP, molecular weight, etc.).
    Apply one-hot encoding for categorical variables like solvents and catalysts.

  • Modeling

    Implement regression models first, followed by Neural Network modeling. Evaluate models using RMSE, R², and MAE metrics.

  • Visualization

    Analyze feature importance, cluster reactions by similarity, and compare model performances.

  • User Interface

    Develop a Streamlit application to predict the yield of new reactions interactively.


Current Status

Done:

  • Project Structure – Created folder architecture (data, src, notebooks, docs).
  • Sample Dataset – Added example reaction data (Buchwald-Hartwig Reaction) for initial testing.
  • Dataset Convertion – Created a script to easily convert .pb databases into .csv files.
  • Feature Extraction – Developed R 547E DKit-based scripts for molecular descriptor calculation.
  • Simple Models – Construction and training of regression models (Random Forest, Linear Regression)

Ongoing:

  • Proof of Concept – Using Buchwald-Hartwig amination reaction data to validate the approach.
  • Data Collection Scripts – Building scripts to gather and clean reaction datasets in bulk.
  • Model Development – Training initial machine learning models and tuning parameters.

Planned:

  • Advanced Visualizations – Implement clustering and detailed feature analysis.
  • Expanded Dataset – Incorporate other reaction types and experimental conditions.
  • User Interface – Build a web app for yield prediction using Streamlit.

Informations about installation and use of the repo will be provided as the project progresses.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0