Paper Link: https://arxiv.org/pdf/2410.07507v1
Abstract: Decoding and expressing brain activity in a comprehensible form is a challenging frontier in AI. This paper presents *Thought2Text*, which uses instruction-tuned Large Language Models (LLMs) fine-tuned with EEG data to achieve this goal. The approach involves three stages: (1) training an EEG encoder for visual feature extraction, (2) fine-tuning LLMs on image and text data, enabling multimodal description generation, and (3) further fine-tuning on EEG embeddings to generate text directly from EEG during inference. Experiments on a public EEG dataset collected from six subjects with image stimuli demonstrate the efficacy of multimodal LLMs (LLaMA-v3, Mistral-v0.3, Qwen2.5), validated using traditional language generation evaluation metrics, GPT-4-based assessments, and human expert evaluations. This approach marks a significant advancement towards portable, low-cost "thoughts-to-text" technology with potential applications in both neuroscience and natural language processing (NLP).
Thought2Text implements a three-stage training approach to fine-tune LLMs and make them Visual EEG-aware. A sneak peek is given in the following diagram (for a detailed explanation, refer to Section 4 of the paper).
Download the data from here and place it inside a newly created `data` directory. Note: We do not hold copyright on the data (except for the text descriptions); the data is shared only for reproducibility and only for academic research. If you have any questions about the original data, please contact the original authors (citation below).
Once you have downloaded the data, install all dependencies with `pip install -r requirements.txt`.
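The commands below expect `data/block/eeg_55_95_std.pth`, `data/block/block_splits_by_image_all.pth`, and an image directory at `data/images/`. As an optional sanity check (not part of the repository scripts), you can verify that the `.pth` files load; the sketch below only assumes they are readable with `torch.load` and makes no assumption about their internal layout.

```python
# Optional sanity check for the downloaded data files (illustrative only).
import torch

for path in [
    "data/block/eeg_55_95_std.pth",
    "data/block/block_splits_by_image_all.pth",
]:
    obj = torch.load(path, map_location="cpu")
    # Print a rough summary; the exact structure depends on the original dataset release.
    if isinstance(obj, dict):
        print(path, "-> dict with keys:", list(obj.keys()))
    else:
        print(path, "->", type(obj).__name__)
```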
Then train the EEG encoder:

```bash
python train_eeg_classifier.py \
    --eeg_dataset data/block/eeg_55_95_std.pth \
    --splits_path data/block/block_splits_by_image_all.pth \
    --output ./eeg_encoder_55-95_40_classes \
    --image_dir data/images/
```
The checkpoints for the encoder will be stored in `./eeg_encoder_55-95_40_classes`.
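The exact checkpoint file names and formats inside this directory depend on `train_eeg_classifier.py`; a quick, purely illustrative way to list and inspect them is sketched below.

```python
# List the encoder output directory and, if a file is a state_dict, count its
# parameters (illustrative only; adjust paths/patterns to what the script saves).
import glob
import torch

ckpts = sorted(glob.glob("eeg_encoder_55-95_40_classes/*"))
print("files in the encoder output directory:", ckpts)

if ckpts:
    state = torch.load(ckpts[0], map_location="cpu")
    if isinstance(state, dict):
        n_params = sum(v.numel() for v in state.values() if hasattr(v, "numel"))
        print(f"{ckpts[0]}: {len(state)} entries, ~{n_params} parameters")
```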
We use the Mistral-v0.3 7B model as our example. The `--eeg_encoder_path` argument should point to the encoder output directory saved in stage 2 (here, `./eeg_encoder_55-95_40_classes`).
```bash
python finetune_llm.py \
    --eeg_dataset data/block/eeg_55_95_std.pth \
    --splits_path data/block/block_splits_by_image_all.pth \
    --eeg_encoder_path ./eeg_encoder_55-95_40_classes \
    --image_dir data/images/ \
    --output "mistral_eeg_model" \
    --llm_backbone_name_or_path "mistralai/Mistral-7B-Instruct-v0.3" \
    --load_in_8bit \
    --bf16
```
Upon completion, the trained model will be available under the `mistral_eeg_model` directory.
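For context, the `--load_in_8bit` and `--bf16` flags above typically correspond to loading the backbone with 8-bit quantized weights (via bitsandbytes) and bfloat16 compute in the Hugging Face `transformers` stack. The snippet below is a generic sketch of that pattern for the backbone only; it is an assumption about what the script does internally, not a substitute for `finetune_llm.py`.

```python
# Generic 8-bit + bf16 loading pattern for a Hugging Face causal LM
# (illustrative only; finetune_llm.py may configure the model differently).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit weights via bitsandbytes
    torch_dtype=torch.bfloat16,  # bf16 for the non-quantized computations
    device_map="auto",
)
```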
For model variants and for in-subject and cross-subject analyses, refer to `run.sh`, which captures all commands.
For inference, run `inference.py` while pointing to the fine-tuned model directory and the paths to `eeg_55_95_std.pth` and `block_splits_by_image_all.pth`. A sample command with Mistral-v0.3 7B, assuming the trained model is in the `mistral_eeg_model` directory, is given below:
```bash
model_path="mistral_eeg_model"

python inference.py \
    --model_path "$model_path" \
    --eeg_dataset data/block/eeg_55_95_std.pth \
    --image_dir data/images/ \
    --dest "mistral_results.csv"
```
We evaluate the model's generations with popular NLG metrics such as BLEU, METEOR, and ROUGE. We also measure fluency and adequacy with GPT-4. The IPYNB notebooks can be found inside the `eval` folder.
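As an illustration of this kind of scoring, BLEU, METEOR, and ROUGE can be computed with the Hugging Face `evaluate` package as sketched below. This is an assumption for illustration only; the notebooks in `eval` are the reference implementation and may use different metric libraries, and the prediction/reference strings here are made up.

```python
# Hypothetical example of scoring generations against reference descriptions
# with the `evaluate` package (not the repository's actual evaluation code).
import evaluate

predictions = ["a dog running on the grass"]                    # model generations
references = [["a brown dog runs across a grassy field"]]       # ground-truth descriptions

bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
meteor = evaluate.load("meteor").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)

print("BLEU:", bleu["bleu"])
print("METEOR:", meteor["meteor"])
print("ROUGE-L:", rouge["rougeL"])
```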
For this work, we utilized anonymized open-source EEG data, acknowledging the sensitivity of EEG data and the imperative of ethical compliance. All experimental data used in our research were anonymized to protect participant privacy and uphold ethical standards.

Also, it is worth noting that all experiments were carried out on an NVIDIA RTX 4060 Ti (16GB VRAM) graphics card with `torch==2.0.1+cu117`, and any change in the configuration may change the results significantly. Moreover, LLM implementations running on GPUs exhibit inherent non-determinism, leading to slight deviations in results across separate runs. We have attempted to mitigate this by setting seeds, but due to the stochastic nature of GPU-based computations (e.g., floating-point arithmetic differences, CUDA optimizations, and parallelism in cuDNN/cuBLAS), exact reproducibility remains challenging. Given resource constraints, we could not conduct an exhaustive number of experiments to quantify this variability, but we hope this work proves valuable in bridging Generative AI and EEG technologies.
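For reference, the kind of seed setting referred to above looks like the following generic PyTorch snippet; even with these settings, GPU kernels can still introduce small run-to-run differences.

```python
# Generic seed setting for PyTorch experiments; cuDNN/cuBLAS kernels may still
# produce small numerical differences across runs.
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```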
The scripts `*chance.*` and `*chance2.*` represent the ONLY_OBJ and ONLY_EEG settings in the paper.
The EEG encoder portion of our approach is based on the following paper, and the ChannelNet encoder code and EEG data are based on this repository. We sincerely thank the authors for their novel contribution, which has made this work possible.
- S. Palazzo, C. Spampinato, I. Kavasidis, D. Giordano, J. Schmidt, and M. Shah, "Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, doi: 10.1109/TPAMI.2020.2995909.
For any questions or concerns, contact Abhijit or Shreya. Pull requests and GitHub issues may not be addressed promptly. If you use our work, please cite it.