knockoffspy

An interface to Knockoffs.jl from the Python programming language. knockoffspy provides unique high performance methods for sampling various model-X knockoffs and ships with built-in routines for variable selection. Much of the functionality are unique and allow for orders of magnitude speedup over conventional methods. 'knockoffspy' attaches a Python interface onto the package, allowing seamless use of this tooling by Python users.

Installation

To install knockoffspy, use pip

pip install knockoffspy

Then in the python interpreter,

>>> import knockoffspy
>>> knockoffspy.install()

This will install the Knockoffs.jl package and all the Julia dependencies that it needs.

Usage

Import the package as

from knockoffspy import ko

The general flow for using the package is to follow exactly as would be done in Julia, except add ko. in front of function calls. Most of the commands will work without any modification. Thus the Knockoffs.jl documentation is the main in-depth documentation for this package. Below we will show how to translate these docs to Python code.

Documentation

Most of the commands of Knockoffs.jl will work in python without any modification, just add ko. in front of function calls. Thus the Knockoffs.jl documentation is the main in-depth documentation for this package. Below we will show how to translate these docs to Python code.

Example: Exact model-X group knockoffs

Lets simulate X ~ N(0, Sigma) where Sigma is a symmetric Toesplitz matrix. Here we assume every 5 variables form a group

from knockoffspy import ko
from scipy import linalg
import numpy as np

# generate data
n = 1000          # number of samples
p = 1000          # number of covariates
m = 1             # number of knockoffs to generate per feature
groups = np.repeat(np.arange(0,200,1), 5)
Sigma = linalg.toeplitz([0.7**i for i in range(1, p+1)])
mu = np.zeros(p)
X = np.random.multivariate_normal(mean=mu, cov=Sigma, size=(n,))

We generate model-X group knockoffs as follows

solver = "maxent" # Maximum entropy solver, other choices include "mvr", "sdp", "equi"
result = ko.modelX_gaussian_group_knockoffs(X, solver, groups, mu, Sigma, verbose=True)

Maxent initial obj = -2087.6929364666807
Iter 1 (PCA): obj = -1607.4371641119483, δ = 0.4806840533479427, t1 = 0.28, t2 = 0.46
Iter 2 (CCD): obj = -1589.951838589172, δ = 0.046537146748581976, t1 = 0.42, t2 = 1.28, t3 = 0.0
Iter 3 (PCA): obj = -1570.44910152802, δ = 0.32338244109034703, t1 = 0.67, t2 = 1.74
Iter 4 (CCD): obj = -1562.5471454507458, δ = 0.028462155141072386, t1 = 0.81, t2 = 2.56, t3 = 0.0
Iter 5 (PCA): obj = -1557.1393033537286, δ = 0.124560844581473, t1 = 1.04, t2 = 2.99
Iter 6 (CCD): obj = -1552.4489159484508, δ = 0.020754156607442897, t1 = 1.18, t2 = 3.81, t3 = 0.01
Iter 7 (PCA): obj = -1549.810615656943, δ = 0.07012194156368799, t1 = 1.43, t2 = 4.27
Iter 8 (CCD): obj = -1547.020766696055, δ = 0.015614065719368509, t1 = 1.56, t2 = 5.09, t3 = 0.01
Iter 9 (PCA): obj = -1545.531088575216, δ = 0.0511701859313534, t1 = 1.82, t2 = 5.58
Iter 10 (CCD): obj = -1543.8817717502163, δ = 0.013019011861975537, t1 = 1.95, t2 = 6.4, t3 = 0.01
Iter 11 (PCA): obj = -1542.9960966431295, δ = 0.04029486159842148, t1 = 2.23, t2 = 6.87
Iter 12 (CCD): obj = -1542.0192637205867, δ = 0.011386766045749418, t1 = 2.36, t2 = 7.69, t3 = 0.01
Iter 13 (PCA): obj = -1541.4898664708514, δ = 0.03310222247410438, t1 = 2.61, t2 = 8.17
Iter 14 (CCD): obj = -1540.9005704168408, δ = 0.010234115029284592, t1 = 2.74, t2 = 8.99, t3 = 0.01
Iter 15 (PCA): obj = -1540.5789961008802, δ = 0.027573434751233035, t1 = 3.5, t2 = 9.5
Iter 16 (CCD): obj = -1540.221696231743, δ = 0.009352305423352147, t1 = 3.62, t2 = 10.33, t3 = 0.01
Iter 17 (PCA): obj = -1540.0257796341718, δ = 0.02345771973192861, t1 = 4.15, t2 = 10.83
Iter 18 (CCD): obj = -1539.806801955503, δ = 0.008629474061316174, t1 = 4.28, t2 = 11.68, t3 = 0.02
Iter 19 (PCA): obj = -1539.689234490581, δ = 0.020162878516809316, t1 = 4.81, t2 = 12.21
Iter 20 (CCD): obj = -1539.5543495056934, δ = 0.007951906306609491, t1 = 4.93, t2 = 13.03, t3 = 0.02

To extract the knockoffs, S matrix, and the final objective as

Xko = result.Xko
S = result.S
obj = result.obj

Citation and reproducibility

If you use this software in a research paper, please cite our paper. Scripts to reproduce the results featured in our paper can be found here.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
knockoffspy		knockoffspy
.gitignore		.gitignore
LICENSE.md		LICENSE.md
MANIFEST.in		MANIFEST.in
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

knockoffspy

Installation

Usage

Documentation

Example: Exact model-X group knockoffs

Citation and reproducibility

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

biona001/knockoffspy

Folders and files

Latest commit

History

Repository files navigation

knockoffspy

Installation

Usage

Documentation

Example: Exact model-X group knockoffs

Citation and reproducibility

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages