Stars
Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
Revisiting Pretraining Objectives for Tabular Deep Learning
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and mo…
🎓 A path to a self-taught education in Computer Science!
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Example repo to kickstart integration with mlflow pipelines.
JupyterLite demo deployed to GitHub Pages 🚀
The website for the CMU Language Technologies Institute low resource NLP bootcamp 2020
Dict2vec is a framework to learn word embeddings using lexical dictionaries.
My PhD thesis with all its source files, including all .tex files and images created, as well as the slides of my defense.
📖 A curated list of resources dedicated to Natural Language Processing (NLP)
Benchmarks of approximate nearest neighbor libraries in Python
Curated repository of notes from papers I'm reading, mostly NLP related. Updated regularly.
Compute Sentence Embeddings Fast!
Sentence embedding by the Smooth Inverse Frequency weighting scheme
A collection of modern/faster/saner alternatives to common unix commands.
A pure Python implementation of the Word Mover's Embedding algorithm
WordMoversEmbeddings (WME) is simple code for generating vector representations of sentences/documents for text classification and clustering.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Best Practices on Recommendation Systems
Roadmap to becoming a data engineer in 2021
Self-Supervised Euphemism Detection and Identification for Content Moderation, IEEE S&P (Oakland) 2021
PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
The Universal Dependencies (UD) Portuguese treebank.
SemClinBr - a multi-institutional and multi-specialty semantically annotated corpus for Portuguese clinical NLP tasks