- UCSB
- Waterloo, Canada
- https://wangxinyilinda.github.io/
- @XinyiWang98
Stars
Function Vectors in Large Language Models (ICLR 2024)
Stanford NLP Python library for understanding and improving PyTorch models via interventions
Scalable RL solution for advanced reasoning of language models
A reading list for papers on causality for natural language processing (NLP)
Interview questions for Computer Science faculty jobs
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
A one-stop library to standardize the inference and evaluation of all the conditional image generation models. (ICLR 2024)
Code for ACL2023 paper: Pre-Training to Learn in Context
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".
A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner.
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Retrieval and Retrieval-augmented LLMs
Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models