
Vae-Set: A collection of experimental VAEs built on a robust foundational model


(Figure: a latent structure of about 60 variables discovered within human handwriting of the digit 7. The feature shown is normally distributed. See Vae4 below.)



Vae1:  

Vae1 is a foundational model. It uses a multivariate Gaussian as the latent distribution and either a continuous-Bernoulli or Gaussian reconstruction density, depending on the data type.
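As a rough illustration of that choice, here is a minimal sketch of a reconstruction log-likelihood that switches between the two densities. The `decoder_out` keys and the `data_type` flag are placeholders, not Vae1's actual interface; both densities come from `torch.distributions`.

```python
import torch
from torch.distributions import ContinuousBernoulli, Normal

def reconstruction_log_likelihood(x, decoder_out, data_type="binary"):
    # Choose the reconstruction density by data type.
    if data_type == "binary":
        # Continuous-Bernoulli suits data in [0, 1], e.g. pixel intensities.
        dist = ContinuousBernoulli(probs=decoder_out["probs"])
    else:
        # A Gaussian suits unbounded real-valued data.
        dist = Normal(loc=decoder_out["mean"], scale=decoder_out["scale"])
    # Sum the log-density over data dimensions, keeping the batch dimension.
    return dist.log_prob(x).sum(dim=-1)
```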

It is designed to be a robust and explicit VAE. This explicit coding style gives direct access to the inner workings of the model, so Vae1 can be quickly extended into novel experimental architectures. Below are some of the design choices that make this possible.


Design Choices:

Dictionary Passing:

Every component of Vae1 (encoder, sampling layer, etc.) passes dictionary objects to the next. Extensions that need to pass new objects can therefore simply append keys to these dictionaries.
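A minimal sketch of the idea (the module and key names here are illustrative, not Vae1's actual ones):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, latent_dim)
        self.log_var = nn.Linear(in_dim, latent_dim)

    def forward(self, batch):
        # Append results to the dict instead of returning bare tensors,
        # so extensions can thread new objects through unchanged signatures.
        batch["mu"] = self.mu(batch["x"])
        batch["log_var"] = self.log_var(batch["x"])
        return batch

class Sampler(nn.Module):
    def forward(self, batch):
        std = torch.exp(0.5 * batch["log_var"])
        batch["z"] = batch["mu"] + std * torch.randn_like(std)  # reparameterization
        return batch

batch = {"x": torch.randn(8, 784)}
batch = Sampler()(Encoder(784, 60)(batch))  # batch now also holds mu, log_var, z
```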

Explicit Probability Densities:

The latent probability layers/likelihoods are written explicitly. This reveals the simple laws a variational autoencoder must abide by. At the heart of most VAE models is the family of multivariate Gaussian densities, whose parameters (means and log-variances) can be added and scaled while still indexing a valid multivariate Gaussian. That is to say, the space of latent-density parameterizations is closed under addition and scalar multiplication: parameterized multivariate Gaussians describe a mathematical vector space.
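A sketch of this closure property, with the log-density written out explicitly (variable names are illustrative):

```python
import math
import torch

def gaussian_log_density(z, mu, log_var):
    # Explicit diagonal multivariate-Gaussian log-density, summed over dims.
    return (-0.5 * (log_var + math.log(2 * math.pi)
                    + (z - mu) ** 2 / log_var.exp())).sum(dim=-1)

# Two members of the family, as (mu, log_var) parameter vectors:
p1 = (torch.randn(60), torch.randn(60))
p2 = (torch.randn(60), torch.randn(60))

# Any linear combination of parameter vectors is again a valid
# (mu, log_var) pair, i.e. it still indexes a Gaussian density:
a, b = 0.3, 1.7
p3 = (a * p1[0] + b * p2[0], a * p1[1] + b * p2[1])
print(gaussian_log_density(torch.randn(60), *p3))
```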

Disentanglement and independent component analysis, in this setting, are consequences of an orthogonal basis for this vector space. Naturally, this opens the door to many new theoretical modifications of the VAE. Another generalized vector space that has driven revolutionary developments in NLP is Fourier space: attention-based/transformer models use a frequency basis (sinusoidal positional encodings) to encode latent signals of sequential data. Transformers and VAEs thus share a core mathematical structure, and direct hybrid models (as opposed to rather indirect ones like DALL-E) seem possible.

NOTE: Although it is interesting to write out the exponential form of the probability densities explicitly in the cost function, for an applied project one should always use the log-density! To see why, please see my R repository comparing log-density and exponential-density cost functions at this repo.
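A quick illustration of the numerical reason (not the R repo's code): in even modestly high dimension, the density itself underflows to zero while its logarithm stays perfectly representable.

```python
import torch
from torch.distributions import Normal

# Density of a 784-dimensional standard Gaussian at a typical point:
z = torch.randn(784)
dist = Normal(torch.zeros(784), torch.ones(784))

log_p = dist.log_prob(z).sum()  # about -1100: easily representable
p = log_p.exp()                 # exp(-1100) underflows to exactly 0.0
print(log_p, p)
```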



Analysis-Specific Models:

After training a Vae1, the model state is saved. Analysis-specific variants of Vae1 can then load this saved state. These variants are well equipped to traverse the latent density using many parameterized and manual algorithms. This encapsulation lets analysis-specific extensions be written apart from the training of the model.
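A minimal sketch of this save/load encapsulation (the class names and the traversal method are hypothetical, not the repo's actual API):

```python
import torch
import torch.nn as nn

class Vae1(nn.Module):  # stand-in for the trained foundational model
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(784, 60)

class Vae1Analysis(Vae1):  # hypothetical analysis-specific variant
    @torch.no_grad()
    def traverse(self, z, dim, values):
        # Sweep one latent dimension while holding the others fixed.
        zs = z.repeat(len(values), 1)
        zs[:, dim] = values
        return zs

trained = Vae1()
torch.save(trained.state_dict(), "vae1_state.pt")   # after training

analysis = Vae1Analysis()                           # same parameter layout
analysis.load_state_dict(torch.load("vae1_state.pt"))
grid = analysis.traverse(torch.zeros(1, 60), dim=39,
                         values=torch.linspace(-3, 3, 7))
```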



Vae4: A Gaussian-Categorical Joint-Density Model

Data is often recognized as living on some high-dimensional manifold. This manifold represents the behavior of the underlying generator function, and so ML architectures such as invertible flow networks can learn these associated high-dimensional smooth structures. In real life, however, the generator function of the observed data can often occupy many different "states". In the study of dynamical systems, these states are usually intuited as different parameterizations of the underlying generator function (often due to an unseen bifurcation event). But that's neither here nor there; the point is that these states exist in data.

To a geometer, these would be described as "discrete structures" in the manifold; to a statistician, the data would be described as multi-modal, or "clustered". At any rate, Vae4 uses a joint categorical–multivariate-Gaussian density as the latent probability distribution to learn this manifold without the use of any labels.
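For concreteness, here is one common differentiable construction of such a joint latent, using a Gumbel-softmax relaxation for the categorical part. This is a sketch of the general technique, not necessarily Vae4's exact formulation.

```python
import torch
import torch.nn.functional as F

K, D = 10, 60  # number of clusters, Gaussian latent dims (illustrative)

def sample_joint_latent(logits, mu, log_var, tau=0.5):
    # logits: (batch, K); mu, log_var: (batch, K, D) per-cluster parameters.
    c = F.gumbel_softmax(logits, tau=tau, hard=True)   # relaxed one-hot cluster
    mu_c = (c.unsqueeze(-1) * mu).sum(dim=1)           # chosen cluster's mean
    std_c = ((c.unsqueeze(-1) * log_var).sum(dim=1) * 0.5).exp()
    z = mu_c + std_c * torch.randn_like(std_c)         # reparameterized Gaussian
    return c, z

c, z = sample_joint_latent(torch.randn(8, K),
                           torch.randn(8, K, D), torch.randn(8, K, D))
```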

Example Application

Although applications of this model are innumerable, here is an example concerning anomaly detection. Suppose we obtain an anomalous observation from nature. (In practice, this can be either a set of data or a single observation.)


We want to know how this value should be categorized. As can be seen, there is very little overlap between the anomaly's discrete signal and the discrete signal of the 2-cluster. Naturally, Vae4 does not categorize the anomalous observation as a 2.
Vae4 instead categorizes the value as an element of the 5-cluster. To answer why the anomalous data is categorized as a 5, we may investigate how it compares to a known 5 that is already an element of the 5-cluster.



On dimensions 49 and 39, we see that the anomalous 5 and the known 5 are very similar. Vae4 is saying that the anomalous 5, with respect to its "overall width" as well as its "average line width", could very probably have been generated from the standard-5 parameterization distribution (in the sense of a hypothesis test, naturally).



Here, however, we see what the particularities of the anomalous 5 actually are. Vae4 is saying that the "top horizontal line length" of the anomalous 5 has an abnormally large value (dimension 34). Vae4 is also saying that the anomalous 5 has a "lower-loop to upper-loop size ratio" abnormally in favor of the upper loop (dimension 44).
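This kind of dimension-wise reading can be sketched as standardized distances of the anomalous encoding from the cluster's Gaussian parameters (the helper below is hypothetical, not the repo's API):

```python
import torch

def dimension_zscores(z_anom, cluster_mu, cluster_log_var):
    # Per-dimension standardized distance from the cluster's Gaussian.
    return (z_anom - cluster_mu) / (0.5 * cluster_log_var).exp()

scores = dimension_zscores(torch.randn(60), torch.zeros(60), torch.zeros(60))
# Small |z-score| dims are typical of the cluster (like 49 and 39 above);
# large |z-score| dims flag what is actually anomalous (like 34 and 44).
flagged = torch.topk(scores.abs(), k=5).indices
```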

Note: Vae2, Vae3, and Vae4 are not in this repo at this time. Some of these experimental models may be publishable.
