Voronoi Graph Traversal

Project goal

The VGT methodology is designed to alleviate the high computational cost of constructing Voronoi diagrams and Delaunay tessellations in high dimensions. This allows geometric methods that rely on these structures to be applied to more complex data.

The project builds on the idea of a graph traversal, or marching, over the Voronoi graph (i.e. the 1-skeleton of the Voronoi diagram) without computing it explicitly, while extracting the required information about the duals of the visited vertices.

We demonstrate the first applications of this methodology in our KDD'20 paper
Voronoi Graph Traversal in High Dimensions with Applications to Topological Data Analysis and Piecewise Linear Interpolation.

Features

Currently planned features are also included, in their preliminary order of appearance.

  • General interface for interactions with the Voronoi graph (VoronoiGraph.{cpp|h})
  • Delaunay k-skeleton extraction (approx_delaunay.cpp)
  • Piecewise-linear interpolation (interpolate.cpp)
  • Geometric density estimation
  • Power diagrams (i.e. weighted Voronoi diagrams)
  • Non-Euclidean/Riemannian metric spaces (e.g. hyperspherical)

Also, vgt/extra contains a few helper scripts for some experiments from the paper above.

Related projects

  • VoroCrust (https://vorocrust.sandia.gov/)
    A way of obtaining a Delaunay graph approximation via a non-decreasing subgraph sequence in arbitrarily high dimensions, without a full explicit construction, was first introduced by Scott A. Mitchell, Mohamed S. Ebeida et al. in Spoke-Darts for High-Dimensional Blue-Noise Sampling. In the referenced work, the edges of the graph approximations are sampled according to their importance, i.e. their angular sizes, whereas the VGT performs a close-to-uniform sampling of simplices, independent of the Voronoi diagram geometry.

  • Qhull (http://www.qhull.org/)
    Qhull has long been the standard tool for computing Voronoi and Delaunay tessellations explicitly and efficiently in arbitrary dimensions. Qhull's exact computations will likely be more advantageous than the VGT when the dimensionality of the data is small (e.g. up to 6 dimensions). The main benefit of the VGT methodology comes with medium-dimensional data, when full reconstructions are no longer feasible or require an excessive amount of time.

Technical details

Dependencies

  • C++17 (g++ compiler), cmake, make
  • zlib (required by cnpy, tested on version 1.2.11)
  • Eigen3 (linear algebra, tested on version 3.3.7)
  • (Recommended) OpenMP (enables parallel computations)

As an example, the following line should install all dependencies on a clean Ubuntu 20.04 system:

sudo apt install g++ make cmake zlib1g-dev libeigen3-dev libomp-dev

The following libraries are already included and do not require installation:

  • cnpy to read numpy files in C++
  • argparse for command line argument parsing
  • tqdm for a progress bar during the computations

Compiling

A "standard cmake routine" can be applied to compile the code from sources:

mkdir build && cd build
cmake ..
make

You can pass additional preprocessor parameters to cmake to optimize the code, using a string of the form
-DCMAKE_CXX_FLAGS="-D<key>=<value>". The following keys are available (see the example invocation after the list):

  • DIMENSIONALITY If your data has a fixed dimensionality, then, following Eigen's suggestions regarding fixed-size matrices, you can compile the sources for that specific dimensionality to potentially speed up the computations. The default dimensionality is Eigen::Dynamic.
  • FTYPE By default, all computations are performed in long double. However, if you want to use another floating-point type, you can pass the corresponding flag to cmake, for example -DFTYPE=double. Be aware that with double or float, numeric instability may increase drastically in high-dimensional spaces. You can monitor it by checking the number of failed validations after running an algorithm.

    You may also opt to use multiple-precision arithmetic libraries, such as MPIR or GMP. Such libraries are not included in this package, so you would need to add them manually and update the line using ftype = long double; in vgt/utils.h with your preferred floating-point type. Using such libraries allows one to achieve the desired numeric stability at the cost of slower running times.
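
For example, following the pattern above, a hypothetical build that fixes the dimensionality to 3 and switches to double precision could look as follows (adjust the values to your data and needs):

cmake .. -DCMAKE_CXX_FLAGS="-DDIMENSIONALITY=3 -DFTYPE=double"
make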

Executables

All executables have a -h command line argument that shows up-to-date information about the available command line arguments.

Delaunay skeleton approximation

vgt/approx_delaunay [options] data.npy
Argument                          Description
data.npy                          npy NxD data matrix of 32-bit floats
--dim <int:2>                     maximum dimensionality of extracted Delaunay simplices
--steps <int:1000>                number of steps to perform in a random walk from each
                                  starting vertex; a non-positive value would instead
                                  correspond to a full walk / complete graph search
--noclosure                       flag to output only k-dimensional Delaunay simplices
                                  without their closure
--out <str:"output.txt">          output filename for a txt file containing Delaunay
                                  simplices; each line describes a simplex in the form
                                  d v_0 v_1 ... v_d [f], where v_i are indices of the
                                  data points that form the simplex and [f] is the
                                  filtration value of the simplex, if it was computed
--vout <str:"">                   optional output filename for a txt file containing the
                                  visited Voronoi vertices; each line describes a vertex
                                  in the form of its dual: d v_0 v_1 ... v_d
--seed <int:239>                  random seed
--njobs <int:1>                   number of parallel threads (requires OpenMP)
--strategy <str:"brute_force">    ray sampling strategy, available values:
                                  {"brute_force", "bin_search"}
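
As a quick illustration (not part of the package; file names and sizes below are arbitrary), the input matrix can be prepared with numpy, and the resulting output.txt can be parsed according to the d v_0 v_1 ... v_d [f] line format described above, for example in Python:

import numpy as np

# Prepare an NxD data matrix of 32-bit floats (sizes here are arbitrary).
data = np.random.rand(1000, 8).astype(np.float32)
np.save("data.npy", data)

# After running, for example, vgt/approx_delaunay --dim 2 data.npy,
# parse output.txt: each line is "d v_0 v_1 ... v_d [f]".
simplices, filtrations = [], []
with open("output.txt") as f:
    for line in f:
        parts = line.split()
        if not parts:
            continue
        d = int(parts[0])
        vertices = [int(v) for v in parts[1:d + 2]]  # the (d + 1) vertex indices
        simplices.append(vertices)
        if len(parts) > d + 2:                       # optional filtration value
            filtrations.append(float(parts[d + 2]))

print("read", len(simplices), "simplices")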

Piecewise-linear interpolation

vgt/interpolate [options] data.npy query.npy
Argument                          Description
data.npy                          numpy NxD data matrix of 32-bit floats
query.npy                         numpy MxD data matrix of 32-bit floats
--values <str:"">                 npy Nx1 matrix of 32-bit floats
--out <str:"interpolated.npz">    output filename for an npz data file, format described below
--seed <int:239>                  random seed
--njobs <int:1>                   number of parallel threads (requires OpenMP)
--strategy <str:"brute_force">    ray sampling strategy, available values:
                                  {"brute_force", "bin_search"}
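
As a hypothetical usage sketch with placeholder file names and sizes, the three input files could be prepared with numpy as follows and then passed to the tool, e.g. vgt/interpolate --values values.npy data.npy query.npy:

import numpy as np

rng = np.random.default_rng(0)

# NxD data matrix and MxD query matrix of 32-bit floats (sizes are arbitrary).
data = rng.random((1000, 4), dtype=np.float32)
query = rng.random((200, 4), dtype=np.float32)

# Optional Nx1 matrix of values to be interpolated at the query points;
# here, an arbitrary function of the data coordinates.
values = data.sum(axis=1, keepdims=True).astype(np.float32)

np.save("data.npy", data)
np.save("query.npy", query)
np.save("values.npy", values)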

The output npz file contains the following matrices:

  • indices, shape=(L,): indices of query points for which the interpolation was successful (points must lie inside the convex hull of the data)
  • simplices, shape=(L, D+1): each line contains the (D+1) indices of data points that are vertices of a Delaunay simplex containing the corresponding query point
  • coefficients, shape=(L, D+1): barycentric coordinates of query points, based on the data point indices from simplices
  • estimates, shape=(L,): (if values are provided) interpolation values of query points
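
For instance, assuming the input files from the sketch above and the default output name, the result could be inspected in Python roughly as follows; the check simply re-applies the barycentric coordinates to the provided values:

import numpy as np

out = np.load("interpolated.npz")
values = np.load("values.npy").ravel()

indices = out["indices"]            # (L,)     successful query point indices
simplices = out["simplices"]        # (L, D+1) data point indices of the containing simplices
coefficients = out["coefficients"]  # (L, D+1) barycentric coordinates
estimates = out["estimates"]        # (L,)     interpolation values (present if values were given)

# Sanity check: the estimates should be the barycentric combinations
# of the provided values at the simplex vertices.
reconstructed = (coefficients * values[simplices]).sum(axis=1)
print(np.allclose(reconstructed, estimates, atol=1e-5))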

BibTeX

@inproceedings{polianskii2020voronoi,
  title={Voronoi Graph Traversal in High Dimensions with Applications to Topological Data Analysis and Piecewise Linear Interpolation},
  author={Polianskii, Vladislav and Pokorny, Florian T},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={2154--2164},
  year={2020}
}

License

The project is available under the MIT license.
