This package is a massive simplification of the original GradientBoost.jl package. I decided to strip it down to the bare bones by removing all the goodies. Why? I just needed bare boosting for experiments where I want to boost some algorithm (like neural networks). I also updated the algorithm to be compatible with MLUtils.jl and LossFunctions.jl, to use Optim.jl, and to use Zygote as a fallback for custom loss functions (ForwardDiff might be a better fit here).
The package is designed to sprinkle boosting on top of your ML algorithm. As such, it does not implement any algorithm for learning classifiers inside boosting. A simple example with decision stumps can be found in test_ml.jl. A more sophisticated example is in example/mutagenesis.jl, where we show how to boost a classifier for structured data. The rest of this readme walks through the decision stump example.
Let's start by importing libraries and defining some training data.
```julia
using Test
using GradientBoost
using LossFunctions
using Statistics   # provides `mean`, used by the stump learner below

x = Float64[
    1  0 -1  0;
    0  1  0 -1]
y = [-1, 1, -1, 1]
```
The classifier we want to boost is a decision tree of depth 1, called a decision stump. The decision stump is simple, implementing a variant of the rule `xᵢ ≥ τ ? +1 : -1`, where `xᵢ` is the value of the `i`-th feature. We define the decision stump as a simple callable struct.
```julia
struct Stump{T}
    dim::Int   # index of the feature to split on
    τ::T       # threshold
    s::Int     # sign (±1) predicted when x[dim] ≥ τ
end

# prediction for a single sample (one column of the data matrix)
function (s::Stump)(x::AbstractVector)
    x[s.dim] ≥ s.τ ? s.s : -s.s
end

# predictions for a whole dataset, samples stored as columns
function (s::Stump)(x::AbstractMatrix)
    vec(mapslices(s, x, dims = 1))
end
```
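As a quick illustration (not part of the package), a stump splitting on the first feature at threshold zero behaves as follows on the toy data defined above.

```julia
s = Stump(1, 0.0, 1)   # split on feature 1 at τ = 0, predict +1 on the "≥" side
s(x[:, 1])             # returns 1, because x[1, 1] = 1.0 ≥ 0.0
s(x)                   # returns a 4-element vector of ±1, one prediction per column
```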
To use `Stump` as a learner inside the boosting algorithm, we need to overload the `learner_fit` and `learner_predict` functions. Thanks to multiple dispatch, we can specialize fitting for different loss functions and different learners. For the purpose of dispatch, we define `StumpLearner` to signal that we want to learn a `Stump`, and overload `learner_fit` as
```julia
struct StumpLearner end

function GradientBoost.learner_fit(lf, learner::StumpLearner, x::AbstractMatrix, wy::Vector{<:Real})
    # The boosting loop passes a weighted target `wy`: its sign is the label
    # to fit and its magnitude the sample weight.
    w = abs.(wy)
    y = sign.(wy)
    best_stump = Stump(1, mean(x[1, :]), 1)
    best_err = mean(w .* (y .!= best_stump(x)))
    # Exhaustive search over features, candidate thresholds (midpoints of
    # consecutive sample values), and prediction signs.
    for dim in axes(x, 1)
        τs = 0.5(x[dim, 2:end] + x[dim, 1:end-1])
        for τ in τs
            for s in [-1, +1]
                e = mean(w .* (y .!= Stump(dim, τ, s)(x)))
                if e < best_err
                    best_stump = Stump(dim, τ, s)
                    best_err = e
                end
            end
        end
    end
    best_stump
end
```
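To illustrate the contract (normally the boosting loop makes this call, not the user): `sign.(wy)` is the target, `abs.(wy)` is the sample weight, and the loss argument is ignored by this particular learner.

```julia
wy = Float64[-1, 1, -1, 1]   # unit weights, targets equal to y
stump = GradientBoost.learner_fit(ExpLoss(), StumpLearner(), x, wy)
stump(x)   # the best single stump still misclassifies one of the four samples,
           # which is exactly why we boost several stumps below
```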
Next, we define the function providing the prediction as
```julia
function GradientBoost.learner_predict(::Loss, ::StumpLearner, s::Stump, x)
    s(x)
end
```
Finally, the boosting is called as
```julia
gbl = GBBL(StumpLearner(); loss_function = ExpLoss, num_iterations = 4, learning_rate = 1, sampling_rate = 1)
model = fit(gbl, x, y)
predictions = GradientBoost.predict(model, x)
@test 2(predictions .> 0) .- 1 == y
```
- I got rid of the ML API, as it did not serve my purpose.
- The loss function has the signature `loss(prediction, true_labels)` (see the sketch after this list).
- I would like to thank the author of the original GradientBoost.jl library. I just needed something super simple.
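A minimal sketch of such a custom loss, assuming a plain Julia function with the documented signature can be passed as `loss_function` (the name `myloss` is made up here; gradients would come from the Zygote fallback mentioned above):

```julia
# Hypothetical custom loss with the documented signature loss(prediction, true_labels).
myloss(prediction, true_labels) = sum(abs2, prediction .- true_labels) / length(true_labels)
```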
References:
- Friedman, Jerome H. "Greedy function approximation: a gradient boosting machine." Annals of Statistics (2001): 1189-1232.
- Friedman, Jerome H. "Stochastic gradient boosting." Computational Statistics & Data Analysis 38.4 (2002): 367-378.
- Hastie, Trevor, et al. The elements of statistical learning. Vol. 2. No. 1. New York: Springer, 2009.
- Ridgeway, Greg. "Generalized Boosted Models: A guide to the gbm package." Update 1.1 (2007).
- Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." The Journal of Machine Learning Research 12 (2011): 2825-2830.
- Natekin, Alexey, and Alois Knoll. "Gradient boosting machines, a tutorial." Frontiers in neurorobotics 7 (2013).