
BERTE: High-precision hierarchical classification of transposable elements by a transfer learning method with BERT pre-trained model and convolutional neural network


yiqichen-2000/BERTE


BERTE

This repository contains the implementation of BERTE, as described in:

BERTE: High-precision hierarchical classification of transposable elements based on transfer learning method with BERT and CNN

Introduction

Transposable elements (TEs) are abundant repeat sequences found in living organisms. They play a pivotal role in biological evolution and gene regulation and are closely linked to human diseases. Existing TE classification tools can classify classes, orders, and superfamilies concurrently, but they often struggle to extract sequence features effectively. This limitation frequently leads to poor classification results, especially in hierarchical classification. To tackle this problem, we introduced BERTE, a tool for hierarchical TE classification. BERTE encoded TE sequences into distinctive features that combined attentional information with cumulative k-mer frequency information. Using the multi-head self-attention mechanism of the pre-trained BERT model, BERTE transformed sequences into attentional features. In addition, we calculated multiple k-mer frequency vectors and concatenated them to form cumulative features. After feature extraction, a parallel Convolutional Neural Network (CNN) model was used as an efficient sequence classifier, taking advantage of its capacity for high-dimensional feature transformation.
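The cumulative k-mer feature described above can be illustrated with a minimal sketch. It is not taken from the BERTE code base: the function names `kmer_frequencies` and `cumulative_kmer_features`, the choice of k = 1, 2, 3, the A/C/G/T alphabet, and the per-k normalization are all assumptions made for illustration, and the values of k and the normalization actually used by BERTE may differ.

```python
from itertools import product


def kmer_frequencies(seq, k):
    """Return normalized k-mer frequencies over A/C/G/T, in lexicographic k-mer order.

    Hypothetical helper for illustration; not part of the BERTE repository.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    # An all-zero vector is returned when no valid k-mer was found.
    return [counts[m] / total if total else 0.0 for m in kmers]


def cumulative_kmer_features(seq, ks=(1, 2, 3)):
    """Concatenate the frequency vectors for each k in `ks` (assumed values)."""
    features = []
    for k in ks:
        features.extend(kmer_frequencies(seq, k))
    return features


features = cumulative_kmer_features("ACGTACGTNACGT")
print(len(features))  # 4 + 16 + 64 = 84
```

With this layout the feature length depends only on the chosen k values (the sum of 4^k over all k), not on the sequence length, which is what lets sequences of different lengths share one fixed-size classifier input.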

We evaluated BERTE’s performance on filtered datasets collected from 12 eukaryotic databases. Experimental results showed that BERTE improved the F1-score at different classification levels by up to 21% compared with current state-of-the-art methods. Overall, BERTE classifies TE sequences with greater precision.

bioRxiv 2024

Paper

Method overview

[Figure: Method overview]

TE hierarchical classification structure

[Figure: TE hierarchical classification structure]
