The VGT methodology is designed to alleviate the high computational cost of constructing Voronoi diagrams and Delaunay tessellations in high dimensions, enabling new applications of geometric methods that rely on these structures to more complex data.
The project builds on the idea of a graph traversal, or marching, over a Voronoi graph (i.e. the 1-skeleton of a Voronoi diagram) without its explicit computation, while extracting the required information about the duals of the visited vertices.
We demonstrate the first applications of this methodology in our KDD'20 paper
*Voronoi Graph Traversal in High Dimensions with Applications to Topological Data Analysis and Piecewise Linear Interpolation*.
Currently planned features are also included below, in their preliminary order of appearance:
- General interface for interactions with the Voronoi graph (`VoronoiGraph.{cpp|h}`)
- Delaunay k-skeleton extraction (`approx_delaunay.cpp`)
- Piecewise-linear interpolation (`interpolate.cpp`)
- Geometric density estimation
- Power diagrams (i.e. weighted Voronoi diagrams)
- Non-Euclidean/Riemannian metric spaces (e.g. hyperspherical)
Also, `vgt/extra` contains a few helper scripts for some of the experiments from the paper above.
- VoroCrust (https://vorocrust.sandia.gov/)

  A way of obtaining a Delaunay graph approximation via a non-decreasing subgraph sequence in arbitrarily high dimensions, without a full explicit construction, was first introduced by Scott A. Mitchell, Mohamed S. Ebeida et al. in *Spoke-Darts for High-Dimensional Blue-Noise Sampling*. In the referenced work, the edges of the graph approximations are sampled according to their importance, i.e. their angular sizes, while VGT performs a close-to-uniform sampling of simplices, independent of the Voronoi diagram geometry.
- Qhull (http://www.qhull.org/)

  Qhull has long been the standard tool for computing Voronoi and Delaunay tessellations quickly and explicitly in arbitrary dimensions. The exact computations of Qhull will likely remain more advantageous than VGT when the dimensionality of the data is small (e.g. up to 6 dimensions). The main benefit of the VGT methodology comes with medium-dimensional data, where full reconstructions are no longer feasible or require an excessive amount of time.
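For a low-dimensional sanity check against the exact construction, one can compute the Delaunay tessellation with Qhull through SciPy's wrapper. A minimal sketch, assuming SciPy is installed (the point set is illustrative):

```python
import numpy as np
from scipy.spatial import Delaunay  # SciPy wraps Qhull for exact computation

# A triangle with one strictly interior point: its Delaunay
# triangulation consists of exactly 3 triangles
pts = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [1.0, 0.5]])
tri = Delaunay(pts)
print(tri.simplices.shape)  # (3, 3): 3 simplices, 3 vertex indices each
```

In this regime the exact answer is cheap; VGT targets the dimensionalities where such explicit constructions stop being feasible.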
- C++17 (g++ compiler), cmake, make
- zlib (required by cnpy, tested on version 1.2.11)
- Eigen3 (linear algebra, tested on version 3.3.7)
- (Recommended) OpenMP (enable parallel computations)
As an example, the following line should install all dependencies on a clean Ubuntu 20.04 system:

```
sudo apt install g++ make cmake zlib1g-dev libeigen3-dev libomp-dev
```
The following are already included and do not require installation:
- cnpy to read numpy files in C++
- argparse for command line argument parsing
- tqdm for a progress bar during the computations
A "standard cmake routine" can be applied to compile the code from the sources:

```
mkdir build && cd build
cmake ..
make
```
You can pass additional preprocessor parameters to cmake to optimize the code via a string `-DCMAKE_CXX_FLAGS="-D<key>=<value>"`. The following keys are available:
`DIMENSIONALITY`

If your data has a fixed dimensionality, then, following the suggestions from Eigen about fixed-size matrices, you can compile the sources for that specific dimensionality to potentially speed up the computations. The default dimensionality is `Eigen::Dynamic`.

`FTYPE`

By default, all computations are performed in `long double`. If you want to use another floating-point type, you can pass the corresponding flag to cmake, for example `-DFTYPE=double`. Be aware that with `double` or `float`, numeric instability may increase drastically in high-dimensional spaces. You can monitor it by checking the number of failed validations after running an algorithm.

You may also opt to use multiple-precision arithmetic libraries, such as MPIR or GMP. These libraries are not included in this package, so you would need to add them manually and update the line `using ftype = long double;` in `vgt/utils.h` with your preferred floating-point type. Such libraries allow one to achieve the desired numeric stability at the cost of a slower running time.
All executables have a `-h` command line argument that shows up-to-date information about the available command line arguments.
```
vgt/approx_delaunay [options] data.npy
```
| Argument | Description |
|---|---|
| `data.npy` | npy NxD data matrix of 32-bit floats |
| `--dim <int:2>` | maximum dimensionality of extracted Delaunay simplices |
| `--steps <int:1000>` | number of steps to perform in a random walk from each starting vertex; a non-positive value corresponds to a full walk / complete graph search |
| `--noclosure` | flag to output only k-dimensional Delaunay simplices without their closure |
| `--out <str:"output.txt">` | output filename for a txt file containing Delaunay simplices; each line describes a simplex in the form `d v_0 v_1 ... v_d [f]`, where `v_i` are indices of data points that form the simplex and `[f]` is the filtration value of the simplex, if it was computed |
| `--vout <str:"">` | optional output filename for a txt file containing visited Voronoi vertices; each line describes a vertex in the form of its dual `d v_0 v_1 ... v_d` |
| `--seed <int:239>` | random seed |
| `--njobs <int:1>` | number of parallel threads (requires OpenMP) |
| `--strategy <str:"brute_force">` | ray sampling strategy; available values: `{"brute_force", "bin_search"}` |
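As a sketch of the input and output conventions above, the following Python snippet creates a toy `data.npy` in the expected format and parses one line of the simplex output (file names and shapes are illustrative):

```python
import numpy as np

# Toy input: an NxD matrix of 32-bit floats, saved in npy format
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 5)).astype(np.float32)
np.save("data.npy", data)

# Each output line has the form "d v_0 v_1 ... v_d [f]",
# where the trailing filtration value [f] is optional
def parse_simplex(line):
    parts = line.split()
    d = int(parts[0])
    vertices = [int(v) for v in parts[1 : d + 2]]
    filtration = float(parts[d + 2]) if len(parts) > d + 2 else None
    return d, vertices, filtration

print(parse_simplex("2 0 1 2 0.57"))  # (2, [0, 1, 2], 0.57)
```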
```
vgt/interpolate [options] data.npy query.npy
```
| Argument | Description |
|---|---|
| `data.npy` | npy NxD data matrix of 32-bit floats |
| `query.npy` | npy MxD data matrix of 32-bit floats |
| `--values <str:"">` | npy Nx1 matrix of 32-bit floats |
| `--out <str:"interpolated.npz">` | output filename for an npz data file; format described below |
| `--seed <int:239>` | random seed |
| `--njobs <int:1>` | number of parallel threads (requires OpenMP) |
| `--strategy <str:"brute_force">` | ray sampling strategy; available values: `{"brute_force", "bin_search"}` |
The output npz file contains the following matrices:
- `indices`, shape=(L,): indices of the query points for which the interpolation was successful (such points must lie inside the convex hull of the data)
- `simplices`, shape=(L, D+1): each row contains the (D+1) indices of the data points that are the vertices of the Delaunay simplex containing the corresponding query point
- `coefficients`, shape=(L, D+1): barycentric coordinates of the query points with respect to the data points from `simplices`
- `estimates`, shape=(L,): (if values are provided) interpolated values at the query points
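To sketch how these matrices fit together, the snippet below builds a small synthetic npz in the same format and reconstructs the estimates from `simplices` and `coefficients`. The numbers are illustrative, not actual solver output:

```python
import numpy as np

# Synthetic stand-in for the solver output (file name matches the default
# --out; shapes follow the description above with L=2 and D=2)
np.savez("interpolated.npz",
         indices=np.array([0, 1]),
         simplices=np.array([[0, 1, 2], [1, 2, 3]]),
         coefficients=np.array([[0.2, 0.3, 0.5], [0.5, 0.25, 0.25]]))

out = np.load("interpolated.npz")
# Barycentric coordinates of every query point sum to 1
assert np.allclose(out["coefficients"].sum(axis=1), 1.0)

# Given per-point values, each interpolated estimate is the barycentric
# combination of the values at the vertices of the containing simplex
values = np.array([1.0, 2.0, 3.0, 4.0])
estimates = (values[out["simplices"]] * out["coefficients"]).sum(axis=1)
print(estimates)  # estimates for the two query points: 2.3 and 2.75
```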
```
@inproceedings{polianskii2020voronoi,
  title={Voronoi Graph Traversal in High Dimensions with Applications to Topological Data Analysis and Piecewise Linear Interpolation},
  author={Polianskii, Vladislav and Pokorny, Florian T},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={2154--2164},
  year={2020}
}
```
The project is available under the MIT license.