Density EstimatioN using Masked AutoRegressive Flow
This package provides a scikit-learn-like interface to perform density estimation using masked autoregressive flow. The current torch-based implementation uses pytorch-flows as the backbone. A more performant re-implementation in jax is in progress.
- scipy-stack (numpy, scipy, matplotlib, pandas)
- pytorch
- CUDA (for GPU capability)
```
pip install denmarf
```
The interface is very similar to the `KernelDensity` module in scikit-learn. To perform a density estimation, one first initializes a `DensityEstimate` object from `denmarf.density`. Then one fits the data `X`, a numpy ndarray of shape (n_samples, n_features), with the method `DensityEstimate.fit(X)`. Once a model is trained, it can be used to generate new samples using `DensityEstimate.sample()`, or to evaluate the density at arbitrary points with `DensityEstimate.score_samples()`.
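Since the interface mirrors scikit-learn's `KernelDensity`, the same calling pattern can be illustrated with that class; a minimal sketch using `KernelDensity` as a stand-in (not denmarf itself, whose `fit`/`sample`/`score_samples` follow the same convention):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Toy data: 1000 samples with 2 features, matching the
# (n_samples, n_features) convention described above.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2))

kde = KernelDensity(bandwidth=0.5).fit(X)

new_samples = kde.sample(10)        # draw new samples from the fitted density
log_density = kde.score_samples(X)  # log-density at each input point
print(new_samples.shape, log_density.shape)  # (10, 2) (1000,)
```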
To initialize a `DensityEstimate` model, one can simply use

```python
from denmarf import DensityEstimate

de = DensityEstimate()
```
Note that by default the model will try to use a GPU whenever CUDA is available, and fall back to the CPU otherwise. To bypass this behavior and use the CPU even when a GPU is available, use
```python
from denmarf import DensityEstimate

de = DensityEstimate(device="cpu", use_cuda=False)
```
If multiple GPUs are available, one can specify which device to use by
```python
from denmarf import DensityEstimate

de = DensityEstimate(device="cuda:2")
```
To facilitate the fitting of bounded distributions, logit transformations can be used to convert them into unbounded ones. `denmarf` will automatically perform the linear shifting and rescaling, as well as the actual logit transformation, if the argument `bounded` is set when initializing the model and the lower and upper bounds are given when calling `.fit()`. When computing the probability density, the appropriate Jacobian is also computed.
For example,
```python
from denmarf import DensityEstimate

# X is some numpy ndarray
de = DensityEstimate().fit(X, bounded=True, lower_bounds=..., upper_bounds=...)
```
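The idea behind the transformation is simple: each bounded feature is linearly rescaled to (0, 1), passed through the logit, and the log-Jacobian of this map is added to the log-density. A minimal numpy sketch of the math (an illustration, not denmarf's internal code):

```python
import numpy as np

def to_unbounded(x, lower, upper):
    """Map x in (lower, upper) to the real line via rescale + logit."""
    u = (x - lower) / (upper - lower)   # linear shift and rescale to (0, 1)
    return np.log(u / (1.0 - u))        # logit

def log_abs_det_jacobian(x, lower, upper):
    """Sum over features of log |d logit(u) / dx|, the density correction."""
    u = (x - lower) / (upper - lower)
    # d/dx logit(u) = 1 / (u * (1 - u) * (upper - lower))
    return np.sum(-np.log(u) - np.log1p(-u) - np.log(upper - lower), axis=-1)

x = np.array([[0.25, 3.0]])
lower = np.array([0.0, 2.0])
upper = np.array([1.0, 4.0])
y = to_unbounded(x, lower, upper)
print(y, log_abs_det_jacobian(x, lower, upper))
```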
After training a model, it can be saved (pickled) to disk for later use. This can be done by using

```python
de.save("filename_for_the_model.pkl")
```
`denmarf` has built-in support for loading a trained model saved to disk and reconstructing it on either the CPU or a GPU (which does not have to be the same architecture on which the training was performed!). For example, suppose we have a model trained on a GPU and we want to evaluate it on a CPU instead. This can be done by using
```python
from denmarf import DensityEstimate

de = DensityEstimate.from_file(filename="filename_for_the_model.pkl")
```
The default behavior is always loading the model to CPU.
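Mechanically, cross-device loading in PyTorch is usually handled by the `map_location` argument of `torch.load`, which remaps every saved tensor onto a target device. A generic sketch of that pattern (not denmarf's actual loading code; the tiny `Linear` model is just a stand-in for a trained flow):

```python
import io
import torch

# Save a tiny model's weights; in general they may have lived on a GPU.
model = torch.nn.Linear(4, 1)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# Load them back, forcing all tensors onto the CPU regardless of
# which device they were on when the model was saved.
buffer.seek(0)
state = torch.load(buffer, map_location="cpu")
model.load_state_dict(state)
print(next(model.parameters()).device)  # cpu
```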
Contributions are always welcome!
You can use the issue tracker on GitHub to submit a bug report, to request a new feature, or simply to ask a question about the code!
If you would like to make changes to the code, just submit a pull request on GitHub.