LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-based Emotion Recognition

Paper link: LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-based Emotion Recognition, arXiv:2410.09767 (https://arxiv.org/abs/2410.09767)

Project Description

LibEER establishes a unified evaluation framework with standardized experimental settings, enabling unbiased evaluation of over ten representative deep learning-based EER models across the four most commonly used datasets.

  • Standardized Benchmark: LibEER provides a unified benchmark for fair comparisons in EER research, addressing inconsistencies in datasets, settings, and metrics, making it easier to evaluate various models.
  • Comprehensive Algorithm Library: The framework includes implementations of over ten deep learning models, covering a wide range of architectures (CNN, RNN, GNN, and Transformers), making it highly versatile for EEG analysis.
  • Efficient Preprocessing and Training: LibEER offers various preprocessing techniques and customizable settings, enabling efficient model fine-tuning, lowering the entry barrier for researchers, and boosting research efficiency.
  • Extensive Dataset Support: LibEER gives standardized access to major datasets like SEED, SEED-IV, DEAP, and MAHNOB-HCI, supporting both subject-dependent and cross-subject evaluations, with plans to add more datasets in the future.

Installation

To run this project, you'll need the following dependencies:

  1. Python 3.x (recommended)
  2. Dependencies: You can install the required Python packages by running:
pip install -r requirements.txt

pip

To install LibEER via pip, use the following command. All reproduced models are already integrated and ready for direct use; please refer to the chapter Use the Model via pip for more information.

pip install LibEER
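
After installation, you can verify that the package is importable, for example:

# quick check that the installed package can be imported
import LibEER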

Experimental settings and comparison of replication results for reproduced models in LibEER.

| Method | Dataset | Preprocessing | Task | Splitting (Train : Test) | Evaluation | Reported (%) | Ours (%) | Gap (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DGCNN | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | 90.40±8.49 | 89.48±8.49 | 0.92↓ |
| RGNN | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | 94.24±5.95 | 84.66±10.74 | 9.58↓ |
| EEGNet | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | —— | 68.15±12.32 | —— |
| DBN | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | 86.91±7.62 | 81.18±8.13 | 5.73↓ |
| BiDANN | SEED | B, R, DE, 9s | dependent | 3 : 2 | ACC | 92.38±7.04 | 89.06±9.42 | 3.32↓ |
| R2G-STNN | SEED | B, R, DE, 9s | dependent | 3 : 2 | ACC | 93.38±5.96 | 84.11±8.47 | 9.27↓ |
| MS-MDA | SEED | B, DE, 1s | cross | 14 : 1 | ACC | 89.63 | 93.97 | 4.34↑ |
| GCBNet | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | 92.30±7.40 | 89.04±8.03 | 3.26↓ |
| GCBNet_BLS | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | 94.24±6.70 | 88.80±9.54 | 5.44↓ |
| CDCN | SEED | B, R, DE, 1s | dependent | 3 : 2 | ACC | 90.63 | 85.10±8.80 | 5.53↓ |
| CDCN | DEAP-V | B, R, 1s | dependent | 9 : 1 | ACC | 92.24 | 92.30±11.33 | 0.06↑ |
| CDCN | DEAP-A | B, R, 1s | dependent | 9 : 1 | ACC | 92.92 | 91.99±12.20 | 0.93↓ |
| ACRNN | DEAP-V | B, R, DE, 3s | dependent | 9 : 1 | ACC | 93.72 | 86.03±9.20 | 7.69↓ |
| ACRNN | DEAP-A | B, R, DE, 3s | dependent | 9 : 1 | ACC | 93.38 | 88.31±7.77 | 5.07↓ |
| HSLT | DEAP-V | B, R, 6s | cross | 14 : 1 | ACC, F1 | 66.51 | 69.18 | 2.67↑ |
| HSLT | DEAP-A | B, R, 6s | cross | 14 : 1 | ACC, F1 | 65.75 | 68.81 | 3.06↑ |
| HSLT | DEAP-VA | B, R, 6s | cross | 14 : 1 | ACC, F1 | 56.93 | 49.57 | 7.36↓ |

LibEER Usage

LibEER implements three main modules: data loading, data splitting, and model training and evaluation. It also incorporates many representative algorithms in the field of EEG-based emotion recognition. The specific usage is detailed below. Additionally, to make things easier for users, we have implemented several one-step methods for common data processing and data splitting tasks. All reproduced models have corresponding main files named $MODEL_NAME$_train.py for reference. For more details, please refer to the Quick Start section of this chapter.

Dataset Preparation

LibEER supports four EEG emotion recognition datasets: SEED, SEED-IV, DEAP, and MAHNOB-HCI. If you wish to conduct experiments on these datasets, please visit their respective official websites to apply for and download them.
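
Once a dataset has been downloaded, point LibEER at its local path through the Setting class. A minimal sketch for DEAP (the path below is illustrative and should be replaced with the directory you extracted; dataset and dataset_path are the same parameters used in the Quick Start example):

from config.setting import Setting

# dataset selects which supported dataset to load; dataset_path is the local
# directory of the downloaded files (illustrative path shown here)
setting = Setting(dataset='deap', dataset_path='DEAP/data_preprocessed_python')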

Quick Start

To make LibEER easy to use, we implemented the Setting class, which allows one-stop data configuration through parameters. We have also preconfigured many common experimental settings to help users get started quickly. Data is obtained through the Setting class (see the guide about setting parameters):

from models.Models import Model
from config.setting import Setting, preset_setting
from data_utils.load_data import get_data
from data_utils.split import merge_to_part, index_to_data, get_split_index
from utils.args import get_args_parser
from utils.store import make_output_dir
from utils.utils import result_log, setup_seed
from Trainer.training import train
from models.DGCNN import NewSparseL2Regularization
import torch
import torch.optim as optim
import torch.nn as nn


def main(args):
    setting = Setting(dataset='deap',  # Select the dataset
                      dataset_path='DEAP/data_preprocessed_python',  # Specify the path to the corresponding dataset.
                      pass_band=[0.3, 50],  # use a band-pass filter with a range of 0.3 to 50 Hz,
                      extract_bands=[[0.5, 4], [4, 8], [8, 14], [14, 30], [30, 50]],
                      # Set the frequency bands for extracting frequency features.
                      time_window=1,  # Set the time window for feature extraction to 1 second.
                      overlap=0,  # The overlap length of the time window for feature extraction.
                      sample_length=1,
                      # Use a sliding window to extract the features with a window size of sample_length set to 1 and a step size of 1.
                      stride=1,
                      seed=2024,  # set up the random seed
                      feature_type='de_lds',  # set the feature type to extract
                      label_used=['valence'], # specify the label used
                      bounds=[5,5], # The bounds parameter is used to define the thresholds for high and low. Values below bounds[0] are considered negative samples, while values above bounds[1] are considered positive samples.
                      experiment_mode="subject-dependent",
                      split_type='train-val-test',
                      test_size=0.2,
                      val_size=0.2)
    setup_seed(2024) # Set the random seed for the experiment to ensure reproducibility.
    data, label, channels, feature_dim, num_classes = get_data(setting) # Get the corresponding data and information based on the setting class.
    # The organization of data and label is [session(1), subject(32), trial(40), sample(XXX)].
    data, label = merge_to_part(data, label, setting) # Merge the data based on the experiment task specified in the setting class.
    # After the merge_to_part() function with the subject-dependent task specified here, the organization of data and label will be [[subject(32), trial(40), sample(xxx)]].
    device = torch.device(args.device) # Set the device based on the args command-line parameters.
    best_metrics = [] # Prepare to record the experimental results.
    for rridx, (data_i, label_i) in enumerate(zip(data, label), 1): # Under the subject-dependent task, this loop executes 32 times, once per subject.
        tts = get_split_index(data_i, label_i, setting) # Get the split indexes for the experiment based on the setting class; here the train-val-test split specified in the setting is used.
        # Here, in tts:
        # train indexes:[2, 15, 4, 17, 5, 22, 39, 20, 23, 7, 18, 14, 35, 28, 12, 3, 33, 31, 36, 11, 32, 13, 9, 24], val indexes:[1, 19, 25, 16, 27, 29, 8, 6], test indexes:[0, 21, 26, 30, 10, 38, 37, 34]
        # train indexes:[0, 19, 1, 23, 8, 13, 10, 17, 18, 3, 11, 2, 24, 22, 29, 38, 26, 33, 28, 37, 34, 36, 5, 20], val indexes:[35, 39, 14, 15, 6, 21, 32, 4], test indexes:[25, 7, 16, 12, 27, 9, 31, 30]
        # ...
        for ridx, (train_indexes, test_indexes, val_indexes) in enumerate(zip(tts['train'], tts['test'], tts['val']), 1):
            setup_seed(args.seed) # Set the random seed again to ensure reproducibility.
            if val_indexes[0] == -1:
                print(f"train indexes:{train_indexes}, test indexes:{test_indexes}")
            else:
                print(f"train indexes:{train_indexes}, val indexes:{val_indexes}, test indexes:{test_indexes}")

            # Retrieve the corresponding data based on the indexes. train_data contains data from 24 trials, val_data contains data from 8 trials, and test_data contains data from the other 8 trials.
            train_data, train_label, val_data, val_label, test_data, test_label = \
                index_to_data(data_i, label_i, train_indexes, test_indexes, val_indexes)
            # model to train
            if len(val_data) == 0:
                val_data = test_data
                val_label = test_label
            # Choose a model. Alternatively, you can use the method below to import the DGCNN model:
            # model = DGCNN(channels, feature_dim, num_classes)
            # You can configure the model parameters in model_param/DGCNN.yaml
            model = Model['DGCNN'](channels, feature_dim, num_classes)
            # Prepare the corresponding dataloader.
            dataset_train = torch.utils.data.TensorDataset(torch.Tensor(train_data), torch.Tensor(train_label))
            dataset_val = torch.utils.data.TensorDataset(torch.Tensor(val_data), torch.Tensor(val_label))
            dataset_test = torch.utils.data.TensorDataset(torch.Tensor(test_data), torch.Tensor(test_label))
            # Select an appropriate optimizer.
            optimizer = optim.AdamW(model.parameters(), lr=args.lr, weight_decay=1e-4, eps=1e-4)
            # Select appropriate loss functions. The first is a classification loss function, and the second is the L2 regularization loss in DGCNN.
            criterion = nn.CrossEntropyLoss()
            loss_func = NewSparseL2Regularization(0.01).to(device)
            # Specify the output_dir, mainly for saving intermediate results during model training. It is set based on args but may currently raise errors.
            output_dir = make_output_dir(args, "DGCNN")
            # Call the training function to train. Batch size, epochs, etc., can be set via command-line parameters, or manually if desired.
            round_metric = train(model=model, dataset_train=dataset_train, dataset_val=dataset_val, dataset_test=dataset_test, device=device,
                                 output_dir=output_dir, metrics=args.metrics, metric_choose=args.metric_choose, optimizer=optimizer,
                                 batch_size=args.batch_size, epochs=args.epochs, criterion=criterion, loss_func=loss_func, loss_param=model)
            best_metrics.append(round_metric)
    # best_metrics: a list of metric dicts, one per round
    result_log(args, best_metrics)

if __name__ == '__main__':
    args = get_args_parser()
    args = args.parse_args()
    main(args)

Data can also be obtained via a preset setting:

from config.setting import preset_setting

def main(args):
    setting = preset_setting["deap_sub_dependent_train_val_test_setting"](args)
    # ...
if __name__ == '__main__':
    args = get_args_parser()
    args = args.parse_args()
    main(args)

Currently supported preset settings can be found in Preset Setting in LibEER.
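
Since preset_setting is used above as a mapping from preset names to setting constructors, the available presets can presumably be listed by inspecting its keys; a small sketch, assuming it behaves like a dict:

from config.setting import preset_setting

# preset_setting maps names such as "deap_sub_dependent_train_val_test_setting"
# to setting constructors; listing its keys should show the supported presets
print(list(preset_setting.keys()))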

Detailed usage

To give users more precise control and access to intermediate results, this section presents the detailed usage of the three main modules. If the Setting class does not meet the requirements of your experiment, you can refer to the usage methods below.

Data loader

In the data loader, LibEER supports four EEG emotion recognition datasets: SEED, SEED-IV, DEAP, and MAHNOB-HCI. It also provides various data preprocessing methods and a range of feature extraction techniques. The following example demonstrates how to load a dataset and preprocess the data with LibEER: it extracts 1-second DE (differential entropy) features from the DEAP dataset across five frequency bands, after baseline removal and 0.3-50 Hz band-pass filtering.

# get data, baseline, label, sample rate of data,  channels of data using get_uniform_data() function  
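# (the helper functions used below - get_uniform_data, baseline_removal, bandpass_filter,
# feature_extraction, segment_data - come from LibEER's data_utils package; import them
# from there before running this snippet)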
unified_data, baseline, label, sample_rate, channels = get_uniform_data(dataset="deap", dataset_path="DEAP/data_preprocessed_python")
# remove baseline  
data = baseline_removal(unified_data, baseline)  
# using a 0.3-50 Hz bandpass filter to process the data  
data = bandpass_filter(data, sample_rate,  pass_band=[0.3, 50])  
# a 1-second non-overlapping preprocess window to extract de_lds features on specified extract bands  
data = feature_extraction(data, sample_rate, extract_bands=[[0.5, 4], [4, 8], [8, 14], [14, 30], [30, 50]], time_window=1, overlap=0, feature_type="de_lds")
# sliding window with a size of 1 and  a step size of 1 to segment the samples.  
data, feature_dim = segment_data(data, sample_length=1, stride=1)
# data format: (session, subject, trial, sample)

Data Split

In LibEER, the Data Split module is mainly responsible for data partitioning under different experimental tasks and split settings. It supports three mainstream experimental tasks: subject-dependent, cross-subject, and cross-session, and offers various data splitting methods. The following example demonstrates how to split the dataset into training, validation, and testing sets in a subject-dependent task, with a ratio of 0.6, 0.2, and 0.2, respectively.

from data_utils.split import merge_to_part, get_split_index, index_to_data
data, label = merge_to_part(data, label, experiment_mode="subject_dependent") 
# further split each subject's subtask  
for idx, (data_i, label_i) in enumerate(zip(data,label)):  
    # according to the data format and label,  the test size is 0.2 and the validation size is 0.2   
    spi = get_split_index(data_i, label_i,  split_type="train-val-test", test_size=0.2, val_size=0.2)  
    for jdx, (train_indexes, test_indexes, val_indexes) in enumerate(zip(spi['train'],spi['test'], spi['val'])):  
        # organize the data according to the resulting index  
        (train_data, train_label, val_data, val_label,  test_data, test_label) = index_to_data(data_i, label_i,  train_indexes, test_indexes, val_indexes)

Model training and evaluation

LibEER supports various mainstream emotion recognition methods. For details, please refer to the Supported Methods section. Here we select DGCNN for training and testing.

from models.Models import Model
from Trainer.training import train
model = Model['DGCNN'](num_electrodes=channels, feature_dim=5, num_classes=3, k=2, layers=[64], dropout_rate=0.5)
# train and evaluate model, then output the metric
round_metric = train(model, train_data, train_label, val_data, val_label, test_data, test_label)

Use the Model via pip

If you are only interested in the reproduced models, install LibEER via pip and follow the instructions below.

from LibEER.models.MsMda import MsMda
# use the training method provided by LibEER or your own
from LibEER.Trainer.msmdaTrain import train
model = MsMda(channels, feature_dim, num_classes, number_of_source=samples_source)
# round_metric is a dict of results for this round
round_metric = train(model=model, datasets_train=datasets_train, dataset_val=dataset_val, dataset_test=dataset_test,
                     output_dir=output_dir, samples_source=samples_source, device=device,
                     metrics=args.metrics, metric_choose=args.metric_choose, optimizer=optimizer,
                     batch_size=args.batch_size, epochs=args.epochs, criterion=criterion)
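
The returned round_metric can then be collected and logged in the same way as in the Quick Start example. A short sketch, assuming the utils module keeps the same layout under the installed package (the LibEER.utils.utils path mirrors the non-pip utils.utils module and is an assumption):

from LibEER.utils.utils import result_log

# collect the per-round metric dicts and log them, as in the Quick Start
best_metrics = [round_metric]
result_log(args, best_metrics)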

Results using our benchmark

The mean accuracies and F1 scores (with standard deviations) using the proposed benchmark for the subject-dependent EER experiment. The top two methods in each scenario are highlighted using bold and underlined formatting.

Model families: SVM; DNN: DBN; Transformer: HSLT; CNN: EEGNet, CDCN, TSception; RNN: ACRNN; GNN: DGCNN, RGNN, GCBNet, GCBNet_BLS.

| Dataset | Metric | SVM | DBN | HSLT | EEGNet | CDCN | TSception | ACRNN | DGCNN | RGNN | GCBNet | GCBNet_BLS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SEED | ACC | 75.08 (19.73) | 71.88 (19.02) | 64.83 (20.47) | 58.81 (16.22) | 68.23 (20.35) | 64.01 (16.44) | 49.71 (13.15) | 82.55 (15.61) | 76.55 (16.92) | 80.56 (16.98) | 76.64 (17.44) |
| SEED | F1 | 70.82 (23.51) | 67.39 (22.81) | 58.82 (23.36) | 54.41 (17.59) | 63.76 (24.49) | 60.53 (18.51) | 45.78 (14.18) | 79.89 (18.93) | 72.52 (20.08) | 77.29 (20.92) | 72.52 (21.20) |
| SEED-IV | ACC | 47.80 (23.03) | 45.56 (21.19) | 40.28 (23.80) | 29.89 (13.53) | 52.26 (21.97) | 36.06 (15.12) | 29.01 (7.10) | 52.39 (24.32) | 45.40 (22.90) | 53.28 (21.05) | 53.51 (22.45) |
| SEED-IV | F1 | 40.17 (21.68) | 37.61 (20.68) | 30.92 (24.47) | 26.59 (13.58) | 45.26 (23.00) | 32.77 (15.08) | 19.80 (5.42) | 45.94 (24.17) | 38.24 (23.09) | 46.26 (22.27) | 46.91 (22.46) |
| HCI-V | ACC | 64.83 (25.95) | 62.03 (24.90) | 64.00 (11.40) | 61.15 (16.76) | 60.48 (21.90) | 61.12 (15.52) | 60.51 (16.89) | 67.83 (22.40) | 64.86 (17.36) | 66.84 (21.42) | 69.60 (22.09) |
| HCI-V | F1 | 55.99 (27.80) | 52.84 (27.04) | 55.77 (13.12) | 50.35 (17.28) | 51.73 (23.11) | 50.51 (16.69) | 49.39 (15.70) | 54.78 (26.69) | 50.41 (20.34) | 54.61 (25.08) | 57.78 (27.18) |
| HCI-A | ACC | 63.61 (23.44) | 68.51 (21.72) | 67.74 (17.22) | 67.42 (21.71) | 71.82 (21.72) | 68.26 (23.10) | 66.26 (22.69) | 67.29 (27.73) | 70.96 (19.79) | 64.89 (27.12) | 69.60 (22.09) |
| HCI-A | F1 | 50.99 (24.89) | 57.18 (26.63) | 58.42 (19.50) | 54.50 (20.05) | 62.89 (24.99) | 56.29 (23.79) | 55.17 (23.20) | 58.04 (31.41) | 57.66 (25.39) | 57.43 (29.50) | 57.78 (27.18) |
| HCI-VA | ACC | 46.29 (28.42) | 44.38 (26.78) | 46.99 (20.76) | 38.32 (19.51) | 52.00 (26.05) | 40.00 (20.60) | 41.00 (21.58) | 53.06 (24.44) | 49.46 (23.20) | 49.15 (26.16) | 49.24 (27.97) |
| HCI-VA | F1 | 33.91 (27.65) | 31.96 (27.87) | 34.76 (19.69) | 24.56 (13.71) | 38.04 (26.88) | 27.19 (13.65) | 27.10 (15.13) | 39.75 (26.01) | 35.97 (23.84) | 36.49 (27.58) | 36.68 (27.25) |
| DEAP-V | ACC | 54.17 (18.67) | 56.08 (17.38) | 56.20 (18.14) | 51.50 (11.57) | 57.71 (14.72) | 51.52 (9.54) | 53.52 (9.29) | 56.07 (17.15) | 55.90 (16.24) | 56.49 (18.17) | 57.02 (15.07) |
| DEAP-V | F1 | 49.73 (18.92) | 48.61 (19.33) | 48.21 (18.73) | 47.85 (11.70) | 53.41 (15.50) | 47.33 (9.58) | 48.31 (7.77) | 49.08 (17.50) | 47.25 (17.55) | 50.36 (19.57) | 51.40 (17.25) |
| DEAP-A | ACC | 63.49 (16.72) | 64.60 (19.42) | 59.74 (18.82) | 61.30 (15.88) | 63.37 (14.18) | 57.49 (11.86) | 61.83 (14.32) | 62.68 (19.66) | 66.09 (13.91) | 65.95 (17.61) | 61.07 (16.56) |
| DEAP-A | F1 | 53.31 (14.39) | 52.61 (19.85) | 50.10 (18.08) | 53.26 (13.05) | 53.94 (13.76) | 50.75 (11.30) | 49.68 (9.20) | 53.94 (20.10) | 49.27 (12.86) | 55.34 (17.86) | 50.43 (16.36) |
| DEAP-VA | ACC | 37.32 (17.24) | 39.50 (13.99) | 43.34 (14.49) | 39.41 (11.53) | 38.08 (15.52) | 35.68 (12.08) | 38.20 (11.34) | 41.86 (11.57) | 44.53 (14.35) | 38.80 (16.23) | 37.51 (15.29) |
| DEAP-VA | F1 | 25.55 (14.23) | 24.88 (10.79) | 23.47 (14.21) | 29.19 (10.21) | 28.90 (13.15) | 26.75 (8.30) | 21.05 (6.13) | 29.12 (10.92) | 25.88 (12.08) | 27.89 (15.94) | 27.00 (13.52) |

The mean accuracies and F1 scores using the proposed benchmark for the cross-subject EER experiment. The top two methods in each scenario are highlighted using bold and underlined formatting.

Model families: SVM; DNN: DBN, MS-MDA; Transformer: HSLT; CNN: EEGNet, CDCN, TSception; RNN: ACRNN; GNN: DGCNN, RGNN, GCBNet, GCBNet_BLS.

| Dataset | Metric | SVM | DBN | MS-MDA | HSLT | EEGNet | CDCN | TSception | ACRNN | DGCNN | RGNN | GCBNet | GCBNet_BLS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SEED | ACC | 37.07 | 36.16 | 64.00 | 56.00 | 38.19 | 57.72 | 45.60 | 45.39 | 60.87 | 57.20 | 56.32 | 56.32 |
| SEED | F1 | 33.45 | 22.67 | 57.35 | 55.75 | 31.83 | 58.66 | 43.54 | 42.37 | 57.22 | 51.37 | 55.12 | 51.43 |
| SEED-IV | ACC | 28.98 | 36.82 | 56.07 | 30.33 | 28.19 | 31.03 | 34.19 | 31.97 | 42.54 | 44.13 | 32.27 | 40.54 |
| SEED-IV | F1 | 24.56 | 32.60 | 48.68 | 11.64 | 28.35 | 27.01 | 26.83 | 18.82 | 43.10 | 43.30 | 32.89 | 42.73 |
| HCI-V | ACC | 70.33 | 69.27 | 67.64 | 66.94 | 57.06 | 67.69 | 57.36 | 54.53 | 63.19 | 65.89 | 65.16 | 71.06 |
| HCI-V | F1 | 66.36 | 65.51 | 53.70 | 62.22 | 53.83 | 62.67 | 54.76 | 52.58 | 58.75 | 44.33 | 63.32 | 61.83 |
| HCI-A | ACC | 54.41 | 57.50 | 57.23 | 49.48 | 54.70 | 55.93 | 52.30 | 51.23 | 59.42 | 57.26 | 52.21 | 60.85 |
| HCI-A | F1 | 52.91 | 56.24 | 55.68 | 47.43 | 54.02 | 55.15 | 50.25 | 49.45 | 57.02 | 56.89 | 51.84 | 56.26 |
| HCI-VA | ACC | 31.56 | 28.30 | 44.92 | 35.07 | 34.84 | 30.09 | 26.99 | 27.21 | 41.54 | 37.43 | 43.02 | 36.92 |
| HCI-VA | F1 | 26.29 | 27.58 | 21.65 | 27.19 | 27.96 | 24.71 | 21.95 | 20.92 | 38.08 | 19.60 | 36.07 | 32.99 |
| DEAP-V | ACC | 49.58 | 55.02 | 54.82 | 56.56 | 52.36 | 57.78 | 54.44 | 51.94 | 49.91 | 52.15 | 53.58 | 52.34 |
| DEAP-V | F1 | 48.49 | 53.14 | 50.60 | 56.56 | 49.74 | 57.72 | 48.94 | 47.37 | 47.09 | 44.88 | 50.68 | 49.03 |
| DEAP-A | ACC | 51.48 | 50.99 | 26.48 | 41.08 | 48.94 | 49.73 | 45.90 | 44.09 | 49.91 | 43.86 | 50.05 | 50.59 |
| DEAP-A | F1 | 50.95 | 50.09 | 25.39 | 41.05 | 48.94 | 49.17 | 45.56 | 41.51 | 47.09 | 40.46 | 47.79 | 46.38 |
| DEAP-VA | ACC | 27.84 | 24.58 | 16.06 | 17.31 | 25.41 | 23.90 | 24.64 | 20.89 | 25.66 | 19.09 | 30.92 | 20.49 |
| DEAP-VA | F1 | 24.70 | 24.36 | 15.19 | 16.87 | 24.44 | 22.58 | 23.24 | 15.16 | 24.95 | 13.02 | 30.98 | 18.41 |

Supported Datasets

  • SEED
  • SEED-IV
  • DEAP
  • MAHNOB-HCI

Supported Methods

DNN methods

  • DBN
  • MS-MDA

CNN methods

  • EEGNet
  • CDCN
  • TSception

GNN methods

  • DGCNN
  • RGNN
  • GCBNet
  • GCBNet_BLS

RNN methods

  • ACRNN
  • BiDANN
  • R2G-STNN

Transformer methods

  • HSLT

Citations

@misc{liu2024libeercomprehensivebenchmarkalgorithm,
      title={LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-based Emotion Recognition},
      author={Huan Liu and Shusen Yang and Yuzhe Zhang and Mengze Wang and Fanyu Gong and Chengxi Xie and Guanjian Liu and Zejun Liu and Yong-Jin Liu and Bao-Liang Lu and Dalin Zhang},
      year={2024},
      eprint={2410.09767},
      archivePrefix={arXiv},
      primaryClass={cs.HC},
      url={https://arxiv.org/abs/2410.09767},
}

@article{Liu2024EEGBasedME,
      title={EEG-Based Multimodal Emotion Recognition: A Machine Learning Perspective},
      author={Huan Liu and Tianyu Lou and Yuzhe Zhang and Yixiao Wu and Yang Xiao and Christian S. Jensen and Dalin Zhang},
      journal={IEEE Transactions on Instrumentation and Measurement},
      year={2024},
      volume={73},
      pages={1-29},
      url={https://api.semanticscholar.org/CorpusID:267978819}
}
