8000 GitHub - hashlag/neighbours: Weighted kNN classifier and Nadaraya-Watson kernel regressor implementation based on random projection forest
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Weighted kNN classifier and Nadaraya-Watson kernel regressor implementation based on random projection forest

License

Notifications You must be signed in to change notification settings

hashlag/neighbours

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Neighbours

Weighted kNN classifier and Nadaraya-Watson kernel regressor implementation based on random projection forest.

Classifier description

This project implements weighted kNN classifier with kernel smoothing.

Built-in smoothing kernels:

  • rectangular
  • gaussian
  • epanechnikov

Built-in distance metrics:

  • euclidean
  • manhattan

Custom distance metrics and smoothing kernels are supported.

Demo: demo/classifier_demo.py

Regressor description

Nadaraya-Watson kernel regressor is implemented.

Demo: demo/regressor_demo.py

regressor demo screenshot

Neighbors search algorithm

Random projection forest is used to search for neighbours.

RP trees are built by recursive binary splits of space by selected hyperplanes.

On each step algorithm chooses two random objects from the train set and calculates a hyperplane symmetrically separating these two objects in feature space. Such hyperplanes become splitting nodes of an RP tree.

Each resulting subset is split again if it contains more than m (hyperparameter) objects.

Hyperplane in each splitting node is represented with an equation of the form w · x + b = 0, where w is a vector normal to the hyperplane and b is an offset.

For simplicity, we assume what points may be located either "to the right" or "to the left" of the hyperplane.

We also define points belonging to a hyperplane as located "to the left."

So for any point x we can clearly determine its location:

w · x + b > 0 point x is located "to the right"
w · x + b ≤ 0 point x is located "to the left"

Since nodes are organized into a tree, we can perform search by evaluating the expression and proceeding to the corresponding child node.

Installation

pip install git+https://github.com/hashlag/neighbours.git

Usage

Import the package, for example, as

import neighbours as ns

Now you can use the classifier

import numpy as np

# KNNClassifier(features, classes_count, trees_count, maximum number of samples in one leaf of an RP tree)
classifier = ns.KNNClassifier(2, 3, 10, 7)

train = np.array([[2, 1], [10, 15], [1, 3] ...])
class_labels = np.array([0, 1, 0 ...])

# load samples into classifier and build an RP forest
classifier.load(train, class_labels)

# target object representation
sample = np.array([1, 1])

# specify distance metric, smoothing kernel, window width and obtain a prediction
prediction = classifier.predict(sample, ns.distance.euclidean, ns.kernel.gaussian, 1)

print(prediction)

Dependencies

The only third-party dependency is numpy.

License

This project is licensed under the MIT License.

About

Weighted kNN classifier and Nadaraya-Watson kernel regressor implementation based on random projection forest

Topics

Resources

License

Stars

Watchers

Forks

Languages

0