A neural attention mechanism inspired by the biological thalamus for information filtering and task-modulated processing.
The Synthetic Thalamus is a novel neural network module designed to selectively filter and gate information flow in a computational graph, similar to how the biological thalamus acts as a relay station in the brain. This repository implements a PyTorch-based Synthetic Thalamus with the following key features:
- Salience-based token gating using multihead attention
- Task conditioning to modulate attention patterns based on the current task
- Phase-aware processing through learned rotary-like embeddings
- Modality adapters for different input types (image, text)
- Feedback integration for salience adjustments based on rewards
The latest release includes a significantly improved phase generator with:
- Enhanced MLP Architecture with configurable depth and width
- Multiple Activation Functions (GELU, Leaky ReLU, SiLU, PReLU)
- Layer Normalization for better gradient flow
- Contrastive Learning to encourage similar phases for semantically related tokens
- Phase Diversity Parameter to control variation between token phases
- Analysis Tools for visualizing and quantifying phase relationships
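As a rough illustration of those pieces, the sketch below builds a small phase-generator MLP with configurable hidden layers, a selectable activation, optional layer normalization, and a diversity scale on the output. It is a minimal sketch of the ideas above, not the actual module in core/phase_generator.py, and the argument names are assumptions.

```python
import torch
import torch.nn as nn

class PhaseGeneratorSketch(nn.Module):
    """Illustrative MLP that maps token features to bounded phase tags."""
    def __init__(self, d_model=128, phase_dim=16, hidden_dims=(128, 64),
                 activation="gelu", use_layer_norm=True, diversity=2.0):
        super().__init__()
        acts = {"gelu": nn.GELU, "leaky_relu": nn.LeakyReLU,
                "silu": nn.SiLU, "prelu": nn.PReLU}
        layers, in_dim = [], d_model
        for h in hidden_dims:
            layers.append(nn.Linear(in_dim, h))
            if use_layer_norm:
                layers.append(nn.LayerNorm(h))   # stabilizes gradients through the MLP
            layers.append(acts[activation]())
            in_dim = h
        layers.append(nn.Linear(in_dim, phase_dim))
        self.mlp = nn.Sequential(*layers)
        self.diversity = diversity               # scales how far phases spread apart

    def forward(self, x):                        # x: [batch, tokens, d_model]
        return torch.tanh(self.diversity * self.mlp(x))  # phases bounded in (-1, 1)

phases = PhaseGeneratorSketch()(torch.randn(2, 10, 128))
print(phases.shape)  # torch.Size([2, 10, 16])
```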
The Synthetic Thalamus acts as an information bottleneck and attention mechanism. It:
- Takes a sequence of token features as input
- Scores each token's salience using multi-head attention, conditioned on a task embedding
- Selects the top-K most salient tokens
- Generates phase tags for each token to encode sequential/temporal information
- Outputs the gated tokens with their phase tags for further processing
This approach is inspired by Global Workspace Theory and the way the biological thalamus selectively relays information to the cortex based on attentional priorities.
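Conceptually, the gating flow above can be sketched in a few lines of PyTorch: score tokens against a task/context vector, keep the top-K, and append phase tags. This is a simplified illustration, not the repository's implementation:

```python
import torch
import torch.nn as nn

d_model, phase_dim, k = 128, 16, 4
tokens = torch.randn(2, 10, d_model)              # [batch, seq, d_model]
task_emb = torch.randn(2, 1, d_model)             # task conditioning vector

# Score salience with task-conditioned attention (illustrative scorer)
attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
_, weights = attn(task_emb, tokens, tokens)       # weights: [batch, 1, seq]
salience = weights.squeeze(1)                     # [batch, seq]

# Keep the top-K most salient tokens
topk = salience.topk(k, dim=-1).indices           # [batch, k]
gated = torch.gather(tokens, 1, topk.unsqueeze(-1).expand(-1, -1, d_model))

# Tag each selected token with a phase vector (placeholder generator)
phase = torch.tanh(nn.Linear(d_model, phase_dim)(gated))
out = torch.cat([gated, phase], dim=-1)           # [batch, k, d_model + phase_dim]
print(out.shape)                                  # torch.Size([2, 4, 144])
```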
The project includes an enhanced workspace implementation that leverages the phase tags more effectively:
- Phase Similarity Attention Bias: The attention mechanism is biased to favor tokens with similar phase signatures (sketched below)
- Learnable Phase Scale: The influence of phase similarity on attention is controlled by a learnable parameter
- Multi-layer Processing: Multiple transformer layers with phase-aware attention for deeper processing
- Biological Inspiration: Mimics neural oscillations in the thalamus that are thought to coordinate cortical processing
This enhanced workspace better utilizes the phase information, enabling:
- Temporal Binding: Helping the model determine which tokens should be processed together
- Rhythmic Synchronization: Acting as a "clock" signal that allows the workspace to align information streams
- Feature Grouping: Guiding the workspace to emphasize tokens that are in phase with each other
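One simple way to realize the phase similarity bias, sketched under the assumption that phase tags are vectors carried alongside each token, is to add a scaled cosine-similarity term to the attention logits:

```python
import torch
import torch.nn.functional as F

def phase_biased_attention(content, phase, phase_scale):
    """Attention biased toward tokens with similar phase signatures.

    content: [batch, seq, d_model], phase: [batch, seq, phase_dim]
    phase_scale: scalar (e.g. an nn.Parameter) controlling the bias strength.
    """
    d = content.size(-1)
    logits = content @ content.transpose(-2, -1) / d ** 0.5    # standard dot-product scores
    phase_sim = F.cosine_similarity(phase.unsqueeze(2), phase.unsqueeze(1), dim=-1)
    logits = logits + phase_scale * phase_sim                  # favor in-phase tokens
    return logits.softmax(dim=-1) @ content

out = phase_biased_attention(torch.randn(2, 8, 128), torch.randn(2, 8, 16),
                             phase_scale=torch.tensor(1.0))
print(out.shape)  # torch.Size([2, 8, 128])
```

Making phase_scale learnable lets the model decide how strongly phase agreement should shape attention.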
synth_thalamus/
├── core/
│ ├── __init__.py
│ ├── thalamus.py # SyntheticThalamus module
│ ├── adapters.py # Modality-specific encoders
│ ├── feedback.py # Feedback layer for salience adjustments
│ ├── enhanced_workspace.py # Phase-aware workspace
│ ├── phase_generator.py # Enhanced semantic phase generator
│ └── visualization.py # Phase visualization and analysis tools
├── examples/
│ ├── image_clf.ipynb # Demo for image classification
│ ├── babi_qa.ipynb # Demo for question answering tasks
│ ├── phase_similarity_demo.py # Demo for phase similarity attention
│ ├── ollama_phase_demo.py # Demo with Ollama embeddings
│ └── enhanced_phase_demo.py # Demo for enhanced phase generator
├── tests/
│ ├── test_gating.py # Unit tests for top-K gating
│ ├── test_phase.py # Unit tests for phase generation
│ ├── test_phase_attention.py # Tests for phase similarity attention
│ └── test_enhanced_phase.py # Tests for enhanced phase generator
├── train.py # LightningModule-based training entrypoint
├── IMPLEMENTATION_NOTES.md # Detailed implementation notes
└── README.md # This file
# Clone the repository
git clone https://github.com/angrysky56/synth_thalamus.git
cd synth_thalamus
It's recommended to use a virtual environment to avoid conflicts with other projects:
# Install virtualenv if you don't have it already
pip install virtualenv
This project is designed to work with Python 3.11 or newer. As of April 2025, Python 3.13 is the latest stable release (3.13.3, released April 8, 2025), but Python 3.11 and 3.12 are also well supported, have excellent compatibility with machine learning libraries, and include performance optimizations to dictionary operations, memory management, and list handling.
# Create a virtual environment
python3.12 -m venv venv  # You can also use python3.11 or python3.13 if available
# Activate the virtual environment
# On Linux/Mac:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
Once your virtual environment is activated, install the dependencies:
# Install dependencies
pip install -r requirements.txt
- PyTorch (>=1.10.0)
- PyTorch Lightning (>=1.5.0)
- torchvision (for image examples)
- matplotlib (for visualization)
- numpy (>=1.20.0)
A minimal example of using the Synthetic Thalamus:
import torch
from core.thalamus import SyntheticThalamus
from core.enhanced_workspace import EnhancedWorkspace

# Initialize the thalamus with enhanced phase generator
thalamus = SyntheticThalamus(
    d_model=128,            # Token feature dimension
    n_heads=4,              # Number of attention heads
    k=16,                   # Number of tokens to gate through
    phase_dim=16,           # Dimensionality of phase tags
    task_dim=64,            # Dimensionality of task conditioning
    num_tasks=10,           # Number of distinct tasks
    phase_diversity=2.0,    # Controls variation between phases
    hidden_dims=[128, 64],  # Hidden layer dimensions for phase generator
    activation='gelu',      # Activation function type
    use_layer_norm=True     # Use layer normalization
)

# Initialize the enhanced workspace
workspace = EnhancedWorkspace(
    input_dim=128,            # Content dimension (without phase)
    hidden_dim=256,           # Hidden dimension for feed-forward layers
    output_dim=10,            # Output dimension (e.g., number of classes)
    nhead=4,                  # Number of attention heads
    phase_dim=16,             # Dimensionality of phase tags
    num_layers=2,             # Number of transformer layers
    activation='gelu',        # Activation function type
    initial_phase_scale=1.0   # Initial scaling of phase similarity
)

# Example input: batch of 4 sequences, each with 100 tokens of dimension 128
tokens = torch.randn(4, 100, 128)
task_ids = torch.tensor([0, 1, 0, 2])  # Different tasks per batch item

# Forward pass through thalamus
gated_tokens = thalamus(tokens, task_ids)
print(gated_tokens.shape)  # Output: [4, 16, 144] (batch_size, k, d_model + phase_dim)

# Forward pass through workspace
output, pooled = workspace(gated_tokens)
print(output.shape)  # Output: [4, 10] (batch_size, output_dim)
print(pooled.shape)  # Output: [4, 128] (batch_size, d_model)

# Access attention weights for visualization
attention_weights = workspace.attention_weights
You can also use the demo scripts to visualize phase-related functionality:
# Standard phase similarity demo
python examples/phase_similarity_demo.py
# Demo using Ollama embeddings (requires Ollama installed)
python examples/ollama_phase_demo.py
# Demo of the enhanced phase generator
python examples/enhanced_phase_demo.py
You can train a model with the Synthetic Thalamus using the provided PyTorch Lightning module:
# Train with the standard workspace
python train.py --max_epochs=10 --gpus=1
# Train with the enhanced workspace (phase similarity attention)
python train.py --max_epochs=10 --gpus=1 --enhanced_workspace
The enhanced workspace leverages phase similarity to bias attention, potentially leading to better performance on tasks that require temporal binding or feature grouping.
The examples/enhanced_phase_demo.py script demonstrates the capabilities of the enhanced semantic phase generator:
- Activation Function Comparison: Visualizes the impact of different activation functions on phase patterns
- Diversity Parameter Sweep: Shows how the diversity parameter affects phase variation and similarity
- Network Architecture Analysis: Compares different MLP architectures for phase generation
- Category Similarity Analysis: Quantifies how well the phases capture semantic categories
The examples/image_clf.ipynb notebook demonstrates using the Synthetic Thalamus for image classification tasks. It:
- Processes images into patch tokens
- Uses the thalamus to select the most salient patches
- Applies phase embeddings for spatial encoding
- Feeds the gated tokens to a simple classifier
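A rough sketch of the patch-token step, assuming 8x8 patches and a linear projection (the notebook's exact pipeline may differ):

```python
import torch
import torch.nn as nn

imgs = torch.randn(4, 3, 32, 32)                     # CIFAR-sized batch
patches = nn.Unfold(kernel_size=8, stride=8)(imgs)   # [4, 3*8*8, 16 patches]
patches = patches.transpose(1, 2)                    # [4, 16, 192]
tokens = nn.Linear(3 * 8 * 8, 128)(patches)          # patch tokens: [4, 16, 128]
print(tokens.shape)                                  # torch.Size([4, 16, 128])

# gated = thalamus(tokens, task_ids)  # the thalamus then keeps the most salient patches
```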
The examples/babi_qa.ipynb notebook shows the Synthetic Thalamus applied to bAbI question answering tasks. It:
- Encodes story tokens using a bidirectional LSTM
- Uses the question as context for the thalamus
- Gates the most relevant story tokens based on the question
- Processes the gated tokens with a transformer workspace
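A hedged sketch of that wiring; the encoder sizes and the way the question conditions the thalamus are assumptions, and the commented call is hypothetical:

```python
import torch
import torch.nn as nn

vocab, d_model = 100, 128
embed = nn.Embedding(vocab, d_model)
story_enc = nn.LSTM(d_model, d_model // 2, bidirectional=True, batch_first=True)

story = torch.randint(0, vocab, (4, 50))         # [batch, story_len] token ids
question = torch.randint(0, vocab, (4, 8))       # [batch, question_len]

story_tokens, _ = story_enc(embed(story))        # [4, 50, 128] contextual story tokens
question_ctx = embed(question).mean(dim=1)       # [4, 128] pooled question context

# The question context conditions salience scoring, so the thalamus gates the
# story tokens most relevant to the question before the workspace processes them.
# gated = thalamus(story_tokens, task_ids, context=question_ctx)  # hypothetical call
```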
The Synthetic Thalamus can be customized in several ways:
- Adjust the bottleneck: Change k to control how many tokens pass through
- Task conditioning: Use different task embeddings to change attention patterns
- Phase embeddings: Modify phase dimensions to encode different temporal patterns
- Scoring mechanism: Replace the default attention-based scorer with custom logic (see the sketch after this list)
- Enhanced features: Configure the phase generator with custom architectures, activation functions, and diversity parameters
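For example, a custom scorer only needs to map token features and a task embedding to one salience score per token. How it plugs into SyntheticThalamus depends on the module's internals, so treat this as a sketch:

```python
import torch
import torch.nn as nn

class MLPScorer(nn.Module):
    """Scores each token's salience from its features and a task embedding."""
    def __init__(self, d_model=128, task_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_model + task_dim, 128),
                                 nn.GELU(),
                                 nn.Linear(128, 1))

    def forward(self, tokens, task_emb):   # tokens: [B, N, d_model], task_emb: [B, task_dim]
        task = task_emb.unsqueeze(1).expand(-1, tokens.size(1), -1)
        return self.net(torch.cat([tokens, task], dim=-1)).squeeze(-1)  # [B, N]

scores = MLPScorer()(torch.randn(2, 10, 128), torch.randn(2, 64))
print(scores.shape)  # torch.Size([2, 10])
```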
The Synthetic Thalamus is particularly well-suited for:
- Multi-task learning: Sharing parameters while maintaining task-specific processing
- Efficient computation: Focusing resources on the most salient tokens
- Modular systems: Building components that communicate through a shared workspace
- Attention-driven processing: Implementing dynamic, priority-based information flow
Run the unit tests to verify functionality:
cd tests
python test_gating.py
python test_phase.py
python test_phase_attention.py
python test_enhanced_phase.py # Tests for the enhanced phase generator
The test_phase_attention.py test verifies that the phase similarity attention bias works as expected, showing how tokens with similar phase values attend more strongly to each other.
To visualize how the phase similarity attention works, run the demo script:
python examples/phase_similarity_demo.py
This will generate visualizations comparing standard and enhanced workspace attention patterns, showing how the phase tags influence attention in the enhanced workspace.
For embeddings using Ollama, first install the client:
pip3 install ollama
Then pull a model (e.g., mxbai-embed-large or phi4-mini):
ollama pull mxbai-embed-large:latest
Run the Ollama phase demo:
python examples/ollama_phase_demo.py
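A minimal sketch of turning Ollama embeddings into thalamus-ready tokens. It assumes the Python client's embeddings() call and a locally running Ollama server; the demo script itself may do this differently:

```python
import torch
import ollama  # requires a running Ollama server with the model pulled

sentences = ["the cat sat on the mat", "stocks fell sharply today"]
vecs = [ollama.embeddings(model="mxbai-embed-large", prompt=s)["embedding"]
        for s in sentences]
tokens = torch.tensor(vecs).unsqueeze(0)   # [1, num_sentences, embed_dim]
print(tokens.shape)

# A projection such as nn.Linear(tokens.size(-1), 128) would be needed to match
# the thalamus d_model before gating these tokens.
```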
The project now includes experimental support for feedback loops between the workspace and thalamus, inspired by biological thalamo-cortical circuits:
- WorkspaceToThalamusFeedback: Enables the workspace to influence thalamus processing
- RecurrentThalamusWorkspace: Creates a recurrent circuit that iteratively refines attended information
- HierarchicalThalamus: Implements stacked thalamus layers with feedback connections
- CrossModalFeedback: Enables information sharing between different modalities
To try the feedback mechanism, run:
python examples/recurrent_thalamus_demo.py
This will demonstrate how feedback from the workspace back to the thalamus can refine attention over multiple iterations, leading to more coherent information selection.
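The underlying idea can be sketched as a simple loop in which a pooled workspace summary nudges the salience scores used on the next pass. All modules below are illustrative stand-ins, not the feedback classes listed above:

```python
import torch
import torch.nn as nn

d_model, k, n_iters = 128, 4, 3
tokens = torch.randn(2, 10, d_model)
score_proj = nn.Linear(d_model, 1)          # stand-in salience scorer
feedback_proj = nn.Linear(d_model, d_model) # maps workspace summary to feedback signal

bias = torch.zeros(2, 10)
for _ in range(n_iters):
    salience = score_proj(tokens).squeeze(-1) + bias           # [2, 10]
    idx = salience.topk(k, dim=-1).indices
    gated = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, d_model))
    pooled = gated.mean(dim=1)                                 # stand-in workspace summary
    # Feedback: tokens aligned with the current summary gain salience next pass
    bias = (tokens @ feedback_proj(pooled).unsqueeze(-1)).squeeze(-1)

print(gated.shape)  # torch.Size([2, 4, 128])
```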
The enhanced phase generator now supports contrastive learning to improve semantic differentiation in phase space:
- Hard Negative Mining: Focuses contrastive learning on the most challenging examples
- Temperature Scheduling: Dynamically adjusts the contrastive loss temperature parameter
- Adaptive Loss Weighting: Balances task and contrastive objectives based on gradient statistics
- Stratified Batch Sampling: Ensures balanced category representation for effective training
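As a reference point, the core contrastive signal can be written as a supervised, category-aware InfoNCE-style loss over phase vectors; the hard negative mining, temperature scheduling, and adaptive weighting above build on a loss of roughly this shape (a simplified sketch, not the repository's exact objective):

```python
import torch
import torch.nn.functional as F

def phase_contrastive_loss(phases, categories, temperature=0.1):
    """Pull same-category phase vectors together, push different ones apart.

    phases: [num_tokens, phase_dim], categories: [num_tokens] integer labels.
    """
    z = F.normalize(phases, dim=-1)
    sim = z @ z.t() / temperature                      # [N, N] similarity logits
    same = categories.unsqueeze(0) == categories.unsqueeze(1)
    eye = torch.eye(len(z), dtype=torch.bool)
    pos = same & ~eye                                  # positives: same category, not self
    sim = sim.masked_fill(eye, float("-inf"))          # exclude self-similarity
    log_prob = sim - sim.logsumexp(dim=-1, keepdim=True)
    return -(log_prob * pos).sum() / pos.sum().clamp(min=1)

loss = phase_contrastive_loss(torch.randn(16, 16), torch.randint(0, 4, (16,)))
print(loss.item())
```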
To explore these advanced contrastive learning features, run:
python examples/advanced_contrastive_training.py --no_comparison
Example output (abridged):
Epoch 20 Training:  96%|█████████▌| 24/25 [00:01<00:00, 17.03it/s]
Phase vectors shape: torch.Size([32, 16, 16])
Semantic categories shape: torch.Size([32, 16])
Final shapes: Phase vectors torch.Size([32, 16, 16]), Categories torch.Size([32, 16])
Similarity shape: torch.Size([32, 16, 16])
Positive mask shape: torch.Size([32, 16, 16])
Epoch 20 Training: 100%|██████████| 25/25 [00:01<00:00, 17.15it/s]
Validation: 100%|██████████| 7/7 [00:00<00:00, 410.52it/s]
Train loss: 0.4631, Task loss: 0.0004, Contrastive loss: 0.4641
Contrastive weight: 0.9969, Temperature: 0.1220
Val loss: 0.0004, Val accuracy: 1.0000
Model saved to advanced_contrastive_results/model_epoch_20.pt
Gradient statistics saved to advanced_contrastive_results/grad_stats_epoch_20.png
Training completed in 0.57 minutes
Phase vectors shape: torch.Size([16, 16])
Categories shape: torch.Size([32])
Similarity matrix shape: (16, 16)
Unique categories: [0 1 2 3]
Category 0 has 8 tokens
Category 1 has 8 tokens
Category 2 has 8 tokens
Category 3 has 8 tokens
Phase visualizations (after_training) saved to advanced_contrastive_results
Category Contrast Metrics:
Before training: -0.0079
After training: -0.0010
Improvement: 0.0068
While the project now includes many advanced features, there are still opportunities for further research:
- Integration with reinforcement learning for adaptive salience adjustment
- Further development of hierarchical thalamus structures
- More sophisticated phase encoding for recurrent processing
- Extended cross-modal gating for complex multi-modal tasks
This project was inspired by recent neuroscience research on the role of the thalamus in consciousness. Specifically, the work draws on findings from an April 2025 study by Zepeng Fang and colleagues published in Science, which discovered that specific thalamic regions (especially the intralaminar nuclei) act as a "gateway" to awareness by synchronizing with the prefrontal cortex. This research challenges traditional cortex-focused views of consciousness and highlights the thalamus as a crucial component in conscious perception.
The synthetic thalamus implemented here attempts to computationally model some aspects of this biological gating mechanism, particularly the synchronization patterns (implemented as phase tags) that appear to be critical for information to reach conscious awareness.
Reference:
"Human high-order thalamic nuclei gate conscious perception through the thalamofrontal loop"
by Zepeng Fang, Yuanyuan Dang, An'an Ping, Chenyu Wang, Qianchuan Zhao, Hulin Zhao, Xiaoli Li and Mingsha Zhang, 4 April 2025, Science.
This implementation was created by:
- angrysky56 (Ty) - Project lead and 5B40 concept
- Claude (Anthropic) - Implementation and code development
- ChatGPT (OpenAI) - Additional assistance and contributions
- Gemini (Google) - Additional assistance and contributions
The project draws on theories of cognitive neuroscience, particularly the role of the thalamus in attention and consciousness, as well as Global Workspace Theory which proposes that consciousness emerges from the global broadcasting of information across specialized brain modules.
MIT