This package is a massive simplification of the original GradientBoost.jl package. I decided to strip it down to the bare bones by removing all the goodies. Why? I just needed bare boosting for experiments where I want to boost some algorithm (like neural networks). I also updated the algorithm to be compatible with MLUtils.jl and LossFunctions.jl, to use Optim.jl, and to use Zygote as a fallback for custom loss functions (ForwardDiff might be a better fit here).
The package is designed to sprinkle boosting on top of your ML algorithm. As such, it does not implement any algorithm for learning classifiers inside boosting. A simple example with decision stumps can be found in test_ml.jl. A more sophisticated example is in example/mutagenesis.jl, where we show how to boost a classifier for structured data. The rest of this readme walks through the decision stump example.
Let's start by importing libraries and defining some training data.
```julia
using Test
using GradientBoost
using LossFunctions
using Statistics   # provides `mean`, used by the stump learner below

x = Float64[
    1  0 -1  0;
    0  1  0 -1]
y = [-1, 1, -1, 1]
```
The classifier we want to boost is a decision tree of depth 1, called a decision stump. The decision stump is simple, implementing a variant of the rule `xᵢ ≥ τ ? +1 : -1`, where `xᵢ` is the value of the `i`-th feature. We define the decision stump as a simple callable struct.
```julia
struct Stump{T}
    dim::Int   # index of the feature to split on
    τ::T       # threshold
    s::Int     # sign (±1) predicted when x[dim] ≥ τ
end

# prediction for a single sample (one column of the data matrix)
function (s::Stump)(x::AbstractVector)
    x[s.dim] ≥ s.τ ? s.s : -s.s
end

# predictions for a whole dataset, samples stored as columns
function (s::Stump)(x::AbstractMatrix)
    vec(mapslices(s, x, dims = 1))
end
```
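As a quick illustration (not part of the package), a stump splitting on the first feature at threshold zero behaves as follows on the toy data defined above.

```julia
s = Stump(1, 0.0, 1)   # split on feature 1 at τ = 0, predict +1 on the "≥" side
s(x[:, 1])             # returns 1, because x[1, 1] = 1.0 ≥ 0.0
s(x)                   # returns a 4-element vector of ±1, one prediction per column
```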
To use `Stump` as a learner inside the boosting algorithm, we need to overload the `learner_fit` and `learner_predict` functions. Thanks to multiple dispatch, we can specialize fitting for different loss functions and different learners. For the purpose of dispatch, we define `StumpLearner` to signal that we want to learn a `Stump`, and overload `learner_fit` as
```julia
struct StumpLearner end

function GradientBoost.learner_fit(lf, learner::StumpLearner, x::AbstractMatrix, wy::Vector{<:Real})
    # The boosting loop passes a weighted target `wy`: its sign is the label
    # to fit and its magnitude the sample weight.
    w = abs.(wy)
    y = sign.(wy)
    best_stump = Stump(1, mean(x[1, :]), 1)
    best_err = mean(w .* (y .!= best_stump(x)))
    # Exhaustive search over features, candidate thresholds (midpoints of
    # consecutive sample values), and prediction signs.
    for dim in axes(x, 1)
        τs = 0.5(x[dim, 2:end] + x[dim, 1:end-1])
        for τ in τs
            for s in [-1, +1]
                e = mean(w .* (y .!= Stump(dim, τ, s)(x)))
                if e < best_err
                    best_stump = Stump(dim, τ, s)
                    best_err = e
                end
            end
        end
    end
    best_stump
end
```
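To illustrate the contract (normally the boosting loop makes this call, not the user): `sign.(wy)` is the target, `abs.(wy)` is the sample weight, and the loss argument is ignored by this particular learner.

```julia
wy = Float64[-1, 1, -1, 1]   # unit weights, targets equal to y
stump = GradientBoost.learner_fit(ExpLoss(), StumpLearner(), x, wy)
stump(x)   # the best single stump still misclassifies one of the four samples,
           # which is exactly why we boost several stumps below
```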
Next, we define the function providing the prediction as
```julia
function GradientBoost.learner_predict(::Loss, ::StumpLearner, s::Stump, x)
    s(x)
end
```
Finally, the boosting is called as
```julia
gbl = GBBL(StumpLearner(); loss_function = ExpLoss, num_iterations = 4, learning_rate = 1, sampling_rate = 1)
model = fit(gbl, x, y)
predictions = GradientBoost.predict(model, x)
@test 2(predictions .> 0) .- 1 == y
```
- I got rid of the ML API, as it did not serve my purpose.
- The loss function has the signature `loss(prediction, true_labels)` (see the sketch after this list).
- I would like to thank the author of the original GradientBoost.jl library. I just needed something super simple.
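A minimal sketch of such a custom loss, assuming a plain Julia function with the documented signature can be passed as `loss_function` (the name `myloss` is made up here; gradients would come from the Zygote fallback mentioned above):

```julia
# Hypothetical custom loss with the documented signature loss(prediction, true_labels).
myloss(prediction, true_labels) = sum(abs2, prediction .- true_labels) / length(true_labels)
```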
References:
- Friedman, Jerome H. "Greedy function approximation: a gradient boosting machine." Annals of Statistics (2001): 1189-1232.
- Friedman, Jerome H. "Stochastic gradient boosting." Computational Statistics & Data Analysis 38.4 (2002): 367-378.
- Hastie, Trevor, et al. The elements of statistical learning. Vol. 2. No. 1. New York: Springer, 2009.
- Ridgeway, Greg. "Generalized Boosted Models: A guide to the gbm package." Update 1.1 (2007).
- Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." The Journal of Machine Learning Research 12 (2011): 2825-2830.
- Natekin, Alexey, and Alois Knoll. "Gradient boosting machines, a tutorial." Frontiers in neurorobotics 7 (2013).