Stars
A decoder-only LLM-based generative recommendation framework that integrates endogenous and exogenous behavioral and semantic information in a non-intrusive manner
This is the repository for submission No. 500 to SIGIR 2025
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
KDD2025 | Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction
LongCTR: A Long Sequence Modeling Benchmark for CTR Prediction
Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.
A lightweight implementation of Beam Search for sequence models in PyTorch.
A PyTorch native platform for training generative AI models
A FuxiCTR Baseline for Multimodal CTR Prediction Challenge at WWW 2025
(SIGIR'24) Polynomial GF-based recommendation method
SVD-AE: Simple Autoencoders for Collaborative Filtering
[ NeurIPS '22 ] ∞-AE model's implementation in JAX. Kernel-only method outperforms complicated SoTA models with a closed-form solution and a single hyper-parameter.
[ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Learning from Negative samples for Biomedical Generative Entity Linking
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
[ICCV2023] ETran: Energy-based Transferability Estimation
Code release for "LogME: Practical Assessment of Pre-trained Models for Transfer Learning" (ICML 2021) and "Ranking and Tuning Pre-trained Models: A New Paradigm for Exploiting Model Hubs" (JMLR 2022)
Source code for EMNLP 2023 paper "Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions".
Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
A framework for building (and incrementally growing) graph-based data structures used in hierarchical or DAG-structured clustering and nearest neighbor search
Extremely simple and fast extreme multi-class and multi-label classifiers.
Learnable Item Tokenization for Generative Recommendation (CIKM'24)
Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).