A visual tool for constructing LLM (Large Language Model) training components and generating PyTorch code.
- Visual Component Builder: Drag and drop LLM components to create your architecture
- PyTorch Code Generation: Generate ready-to-use PyTorch code from your visual design
- Component Library: Access embeddings, positional encodings, QKV blocks, and more
- Optimization Options: Configure training optimizations like FSDP, Flash Attention, MoE, and more
- Training Hyperparameters: Fine-tune batch size, learning rate, model dimensions, and more
- Device Detection: Automatically detect and use the best available hardware (CUDA, MPS, CPU)
- Experiment Runner: Run small-scale experiments with synthetic data to test your model
Available Components:
- Embedding Layers: Convert token IDs to embeddings
- Positional Encodings: Add position information to embeddings (Sinusoidal, Learned, Rotary, ALiBi)
- Multi-Head Attention: Self-attention mechanisms with configurable parameters
- Feed Forward Networks: Process features with non-linearity
- Output Layers: Final projection layers with various activation functions
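To illustrate how these components typically fit together, here is a minimal PyTorch sketch; the class and parameter names are illustrative, not the tool's actual generated code:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One transformer block: multi-head self-attention + feed-forward network."""
    def __init__(self, n_embd: int, n_head: int, dropout: float = 0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.ffn = nn.Sequential(  # position-wise feed-forward with non-linearity
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, need_weights=False)  # causal masking omitted for brevity
        x = x + a
        return x + self.ffn(self.ln2(x))

class TinyLM(nn.Module):
    """Embeddings -> positional encoding -> attention/FFN blocks -> output projection."""
    def __init__(self, vocab_size=1000, block_size=128, n_embd=64, n_head=4, n_layer=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)  # token IDs -> embeddings
        self.pos_emb = nn.Embedding(block_size, n_embd)  # learned positional encoding
        self.blocks = nn.Sequential(*[Block(n_embd, n_head) for _ in range(n_layer)])
        self.head = nn.Linear(n_embd, vocab_size)        # output layer: projection to logits

    def forward(self, idx):  # idx: (batch, seq) of token IDs
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        return self.head(self.blocks(x))
```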
Training Hyperparameters:
- Batch size, block size (context length), and maximum iterations
- Learning rate and evaluation intervals
- Model architecture parameters (embedding dimension, number of heads/layers)
- Dropout rate for regularization
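For illustration, these hyperparameters map naturally onto a small config object; a hypothetical sketch (field names and default values are invented, not the tool's output):

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    batch_size: int = 32         # sequences per optimizer step
    block_size: int = 256        # context length in tokens
    max_iters: int = 5000        # total training iterations
    learning_rate: float = 3e-4
    eval_interval: int = 500     # iterations between evaluations
    n_embd: int = 384            # embedding dimension
    n_head: int = 6              # attention heads
    n_layer: int = 6             # transformer layers
    dropout: float = 0.2         # dropout rate for regularization
```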
Distributed Training:
- Fully Sharded Data Parallel (FSDP) with configurable sharding strategies
- DeepSpeed ZeRO with CPU offloading options
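As a minimal sketch of the FSDP path, assuming a distributed process group launched with torchrun and reusing the hypothetical TinyLM from the component sketch above:

```python
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

dist.init_process_group("nccl")  # one process per GPU, launched e.g. with torchrun

model = TinyLM().cuda()          # hypothetical model from the component sketch
model = FSDP(
    model,
    # FULL_SHARD shards parameters, gradients, and optimizer state;
    # SHARD_GRAD_OP and NO_SHARD are other configurable strategies.
    sharding_strategy=ShardingStrategy.FULL_SHARD,
)
```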
Mixture of Experts (MoE):
- Configure number of experts and routing strategy
- Set top-k experts per token (k=1 gives Switch Transformer-style routing, k=2 is standard MoE)
- Adjust capacity factors for training and evaluation
- Enable expert parallelism for multi-GPU setups
- Control expert dropout for better generalization
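A minimal top-k routing sketch, not the tool's generated code, with capacity limits and load-balancing losses omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    # top_k=1 corresponds to Switch-style routing, top_k=2 to standard MoE.
    def __init__(self, n_embd: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(n_embd, n_experts)  # routing logits per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd))
            for _ in range(n_experts)
        )

    def forward(self, x):                           # x: (batch, seq, n_embd)
        logits = self.router(x)                     # (batch, seq, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)        # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                 # weighted sum of top-k expert outputs
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e)           # tokens routed to expert e at rank k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out
```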
Attention Optimizations:
- Flash Attention for faster, memory-efficient attention
- xFormers memory-efficient attention mechanisms
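In stock PyTorch 2.x, Flash Attention is reachable through `torch.nn.functional.scaled_dot_product_attention`, which selects the fastest eligible backend automatically; a sketch:

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq, head_dim); FP16/BF16 on CUDA is required for the Flash kernel.
q = torch.randn(4, 8, 1024, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Dispatches to Flash Attention or a memory-efficient kernel when eligible.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```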
Memory Optimizations:
- Gradient checkpointing to reduce memory usage
- Mixed precision training (FP16/BF16)
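Both techniques are available in stock PyTorch; a hedged sketch combining them, again using the hypothetical TinyLM (the loss is a placeholder):

```python
import torch
from torch.utils.checkpoint import checkpoint

model = TinyLM().cuda()               # hypothetical model from the component sketch
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
scaler = torch.cuda.amp.GradScaler()  # loss scaling, needed for FP16 (not BF16)

idx = torch.randint(0, 1000, (8, 128), device="cuda")
with torch.autocast("cuda", dtype=torch.float16):
    # Gradient checkpointing: recompute activations in backward instead of storing them.
    logits = checkpoint(model, idx, use_reentrant=False)
    loss = logits.float().mean()      # placeholder loss, just for the sketch

scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
```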
Compilation:
- PyTorch 2.0 torch.compile() with different compilation modes
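Compilation itself is a one-liner; for example:

```python
import torch

model = TinyLM()  # hypothetical model from the component sketch
compiled = torch.compile(model, mode="reduce-overhead")  # also: "default", "max-autotune"
```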
Device Detection:
- Automatic detection of CUDA GPUs
- Support for Apple Silicon GPUs via Metal Performance Shaders (MPS)
- Fallback to CPU when no GPU is available
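The detection cascade amounts to a few lines of standard PyTorch:

```python
import torch

def best_device() -> torch.device:
    if torch.cuda.is_available():          # NVIDIA (or ROCm) GPUs
        return torch.device("cuda")
    if torch.backends.mps.is_available():  # Apple Silicon via Metal Performance Shaders
        return torch.device("mps")
    return torch.device("cpu")             # fallback when no GPU is available
```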
Experiment Features:
- Run small-scale experiments with synthetic data
- Configure batch size, epochs, and sequence length
- Track and visualize training metrics (loss, timing)
- Save model checkpoints during training
Prerequisites:
- Node.js 18+ and npm
Installation:
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/llm-graph-trainer.git
  cd llm-graph-trainer
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Run the development server:

  ```bash
  npm run dev
  ```

- Open http://localhost:3000 in your browser.
Usage:
- Navigate to the Builder page
- Drag components from the left panel onto the canvas
- Connect components by dragging from one node's output handle to another node's input handle
- Configure component parameters by clicking on them
- Go to the Optimizations tab to configure training optimizations
- Configure device detection and experiment settings in the Experiment tab
- Click "Generate Code" to create PyTorch code for your model
- Copy or download the generated code for use in your PyTorch projects
The generated code includes functionality to run small-scale experiments with your model:
- Configure experiment settings in the Experiment tab:
  - Set batch size, epochs, and sequence length
  - Enable metrics tracking and checkpoint saving
  - Configure synthetic dataset size
- The generated code will include a `run_experiment()` function that:
  - Automatically detects the best available device (CUDA, MPS, CPU)
  - Generates synthetic data for training
  - Trains the model for the specified number of epochs
  - Tracks and visualizes training metrics
  - Saves model checkpoints
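As a rough, illustrative outline of what such a function does (this sketch is not the tool's actual output, and it reuses the hypothetical `best_device()` helper from the device-detection sketch above):

```python
import os
import torch
import torch.nn.functional as F

def run_experiment(model, vocab_size=1000, batch_size=8, epochs=3, seq_len=128):
    device = best_device()             # detection cascade sketched earlier
    model = model.to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    os.makedirs("experiment_results", exist_ok=True)
    losses = []
    for epoch in range(epochs):
        # Synthetic next-token data: random token IDs, targets shifted by one position.
        data = torch.randint(0, vocab_size, (batch_size, seq_len + 1), device=device)
        x, y = data[:, :-1], data[:, 1:]
        logits = model(x)              # (batch, seq, vocab_size)
        loss = F.cross_entropy(logits.reshape(-1, vocab_size), y.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()
        losses.append(loss.item())     # tracked metric
        torch.save(model.state_dict(), f"experiment_results/checkpoint_{epoch}.pt")
    return losses
```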
- Run the generated Python code:

  ```bash
  python your_model.py
  ```
- View the results in the `experiment_results` directory:
  - Training loss plots
  - Performance metrics
  - Model checkpoints
The LLM Graph Trainer includes a comprehensive test suite to ensure the application works as expected. The tests focus on verifying that:
- The synchronization between the nodes array and the selectedNode state works correctly
- Parameter updates from the properties panel are reflected in the node data
- Changes to nodes from other sources (like validation) are reflected in the properties panel
- Multiple parameter updates in sequence are handled correctly
To run the tests, first install the dependencies:
```bash
npm install
```
Then run the tests using one of the following commands:
```bash
# Run tests in watch mode
npm test

# Run tests with UI
npm run test:ui

# Run tests with coverage
npm run test:coverage
```
The tests are organized into several files:
- `FlowEditor.test.tsx`: Tests for the main FlowEditor component
- `NodeProperties.test.tsx`: Tests for the NodeProperties component
- `NodeSynchronization.test.tsx`: Tests specifically for the node synchronization mechanism
- `Integration.test.tsx`: Integration tests between FlowEditor and NodeProperties
Key Testing Areas:
- State Synchronization: Tests verify that when a node is updated through any means, both the nodes array and the selectedNode state are kept in sync.
- Parameter Updates: Tests check that parameter changes in the properties panel are correctly applied to the node data.
- Conditional Rendering: Tests ensure that conditional UI elements (like the MoE settings when useMoE is enabled) appear and disappear correctly.
- Multiple Updates: Tests confirm that multiple parameter updates in sequence are all applied correctly.
Built With:
- Next.js
- React
- TypeScript
- Tailwind CSS
- Shadcn UI
- React Flow
- Monaco Editor
This project is licensed under the MIT License - see the LICENSE file for details.
- Inspired by the need for easier LLM architecture experimentation
- Built with modern web technologies for a smooth user experience