
XGBoost.jl

eXtreme Gradient Boosting Package in Julia

Abstract

This package is a Julia interface to XGBoost, which is short for eXtreme Gradient Boosting. It is an efficient and scalable implementation of the gradient boosting framework. The package includes an efficient linear model solver and tree learning algorithms. The library is parallelized using OpenMP, and it can be more than 10 times faster than some existing gradient boosting packages. It supports various objective functions, including regression, classification, and ranking. The package is also designed to be extensible, so that users can easily define their own objectives.

Features

  • Sparse feature format: allows easy handling of missing values and improves computation efficiency (see the sparse-input sketch after this list).
  • Advanced features, such as customized loss functions and cross validation; see the demo folder for walkthrough examples (a custom-objective sketch appears after the cross-validation example below).
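
A minimal sketch of feeding a sparse matrix directly to xgboost. The toy data here is my own illustration, and it assumes a Julia version where sparse matrices live in the SparseArrays standard library:

using SparseArrays, XGBoost

# Toy sparse design matrix: stored zeros are absent entries, which the
# sparse-aware tree learner can treat as missing.
X = sparse([1.0 0.0 2.0;
            0.0 3.0 0.0;
            4.0 0.0 0.0])
y = [1.0, 0.0, 1.0]

bst = xgboost(X, 2, label=y, eta=1, max_depth=2)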

Installation

Pkg.add("XGBoost")

or

Pkg.clone("https://github.com/antinucleon/XGBoost.jl.git")
Pkg.build("XGBoost")

The XGBoost package also depends on the BinDeps package.
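
On Julia 0.7 and later, the package-manager functions live in the Pkg standard library, so the commands become:

using Pkg
Pkg.add("XGBoost")
Pkg.build("XGBoost")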

Minimal examples

To show how XGBoost works, here is an example using the Mushroom dataset.

  • Prepare Data

XGBoost supports Julia Array, SparseMatrixCSC, libSVM format text, and XGBoost binary files as input. Here is an example of Mushroom classification. This example uses the function readlibsvm from basic_walkthrough.jl, which loads libSVM format text into a dense Julia matrix (a sketch of such a loader follows the snippet below).

using XGBoost

train_X, train_Y = readlibsvm("data/agaricus.txt.train", (6513, 126))
test_X, test_Y = readlibsvm("data/agaricus.txt.test", (1611, 126))
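
readlibsvm is a demo helper rather than an exported function. Here is a minimal sketch of what such a loader does, assuming 0-based libSVM feature indices as in the agaricus files:

# Sketch of a libSVM-format loader in the spirit of the demo helper.
function readlibsvm(fname::AbstractString, shape::Tuple{Int,Int})
    X = zeros(Float32, shape)   # dense feature matrix
    y = Float32[]               # labels
    for (i, line) in enumerate(eachline(fname))
        fields = split(line)
        push!(y, parse(Float32, fields[1]))     # first field is the label
        for itm in fields[2:end]                # remaining fields are index:value
            idx, val = split(itm, ":")
            X[i, parse(Int, idx) + 1] = parse(Float32, val)  # shift the 0-based index
        end
    end
    return X, y
end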
  • Fit Model
num_round = 2
bst = xgboost(train_X, num_round, label=train_Y, eta=1, max_depth=2)
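
Equivalently, the training data can be wrapped in a DMatrix first. The constructor and keyword shown here follow demo/basic_walkthrough.jl and should be read as a sketch of that interface:

dtrain = DMatrix(train_X, label=train_Y)           # wrap features and labels
bst = xgboost(dtrain, num_round, eta=1, max_depth=2)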

Predict

pred = predict(bst, test_X)
print("test-error=", sum((pred .> 0.5) .!= test_Y) / float(size(pred)[1]), "\n")

Cross-Validation

nfold = 5
param = Dict("max_depth" => 2, "eta" => 1, "objective" => "binary:logistic")
metrics = ["auc"]
nfold_cv(train_X, num_round, nfold, label=train_Y, param=param, metrics=metrics)
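
The features list above mentions customized loss functions; here is a minimal sketch of one, modeled on demo/custom_objective.jl. The obj keyword and the get_info accessor are assumptions drawn from that demo, not guaranteed API:

# Log-loss objective returning first- and second-order gradients.
function logregobj(preds::Vector{Float32}, dtrain::DMatrix)
    labels = get_info(dtrain, "label")
    p = 1.0f0 ./ (1.0f0 .+ exp.(-preds))   # sigmoid of the raw margin
    grad = p .- labels                      # gradient of log loss
    hess = p .* (1.0f0 .- p)                # hessian of log loss
    return grad, hess
end

dtrain = DMatrix(train_X, label=train_Y)
bst = xgboost(dtrain, num_round, obj=logregobj)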

Feature Walkthrough

Check the demo folder.

Model Parameter Setting

Check the XGBoost Wiki.
