SNSF
- Switzerland
- okasag.github.io
Stars
Model implementation for the contextual embeddings project
Replication code and results for: A general framework to quantify the event importance in multi-event contests.
Efficient few-shot learning with Sentence Transformers
Code for measuring novelty in science using publication text
Code to replicate the simulation study in the paper "Causal Machine Learning for Moderation Analysis".
A reading list for papers on causality for natural language processing (NLP)
LlamaIndex is the leading framework for building LLM-powered agents over your data.
Code to replicate the simulation study and empirical application in the paper "Improving the Finite Sample Performance of Double/Debiased Machine Learning with Propensity Score Calibration"
Code for "Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification", arXiv 2024
BERT classification model for processing texts longer than 512 tokens. The text is first divided into smaller chunks; each chunk is fed to BERT, and the intermediate results are then pooled. The implementation …
Quarto document on using tidymodels and Databricks for predicting lending rates.
TensorFlow 2 implementation of Causal-BERT
This repository contains demos I made with the Transformers library by HuggingFace.
Fuzzy string matching, grouping, and evaluation.
A blazing fast inference solution for text embedding models
🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
The multilingual language model for Switzerland
A curated list of pretrained sentence and word embedding models
Code to replicate the simulation study in the paper "Calibrating doubly-robust estimators with unbalanced treatment assignment"
Model Confidence Set (MCS) implementation in Python
Uncertainty Toolbox: a Python toolbox for predictive uncertainty quantification, calibration, metrics, and visualization
General purpose unsupervised sentence representations
Python implementation of TextRank algorithm for automatic keyword extraction and summarization using Levenshtein distance as relation between text units. This project is based on the paper "TextRan…
A Python tool for evaluating the quality of sentence embeddings.