artificial-twitter-bot Public
Twitter bot that posts sentences generated with an LSTM
Python Updated Dec 8, 2022 -
-
biomedical Public
Forked from bigscience-workshop/biomedical
Tools for curating biomedical training data for large-scale language modeling
Python Updated Apr 22, 2022 -
-
pubmed-parser Public
Library to download PubMed abstracts with metadata. Originally created to obtain the DrugProt (BioCreative VII) background set
-
drugprot-evaluation-library Public
DrugProt (BioCreative VII) evaluation library
-
document_selection_tfidf Public
Select documents from a target corpus based on their shared vocabulary with a source annotated corpus
-
mapping-prep-workflow Public
Script to prepare files for normalization annotators
-
NLP-progress Public
Forked from sebastianruder/NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Python MIT License Updated Jun 9, 2021 -
-
utils_BSC Public
Assorted useful snippets I create while working at BSC
-
code-lookup Public
Add codes to annotations based on Levenshtein distance
Python Updated Jul 10, 2020 -
genetic-snake Public
Genetic algorithm to train a simple NN to play the classic snake game
-
Mesh2ICD10 Public
Filter PubMed results XML, extract the MeSH terms, and map them to ICD-10 categories
-
lda-intro Public
Text preparation, LDA (topic modelling) model creation and distance calculation based on topics
-
xgboost-classifier Public
Tune the hyperparameters of an XGBoost model on a real dataset
-
tpot-intro Public
Quick introduction to TPOT using the MNIST database
Jupyter Notebook Updated May 5, 2019 -
feature-selection Public
Using scikit-learn and mlxtend, we test different ML algorithms and see how they behave with several feature selection techniques.
-
pca-pyspark Public
Optimizing the number of Principal Components for dimensionality reduction in PySpark
Jupyter Notebook Updated Apr 4, 2019