Low GPU utilization during Random Forest predict · Issue #843 · h2oai/h2o4gpu · GitHub
Low GPU utilization during Random Forest predict #843
Open
@StargazerAlex

Environment (for bugs)

  • OS platform, distribution and version (e.g. Linux Ubuntu 16.04): Ubuntu 16.04.6 LTS
  • Installed from (source or binary): pip
  • Version: 0.4.0
  • Python version (optional): 3.6
  • CUDA/cuDNN version: 10.0
  • GPU model (optional): Nvidia T4
  • CPU model: Intel Xeon, 32 cores
  • RAM available: 200 GB

Description

I want to use the Random Forest Classifier for predictions on a large amount of data, but the prediction phase takes unusually long and shows very low GPU utilization. Here are the parameters I used for training:

import h2o4gpu

model = h2o4gpu.RandomForestClassifier(
    n_estimators = 100, criterion = "gini",
    max_depth = 8, min_samples_split = 2, min_samples_leaf = 1,
    min_weight_fraction_leaf = 0, max_features = "auto",
    max_leaf_nodes = None, min_impurity_decrease = 0,
    min_impurity_split = None, bootstrap = True, oob_score = False,
    n_jobs = -1, random_state = None, verbose = 0, warm_start = False,
    class_weight = None, subsample = 1, colsample_bytree = 1,
    num_parallel_tree = 1, tree_method = "gpu_hist", n_gpus = -1,
    predictor = "gpu_predictor", backend = "h2o4gpu")

model.fit(x_train, y_train)

Training works well. It is comparatively fast and consistently uses around 80% of the GPU (measured with nvidia-smi).

y_pred = model.predict(x_test)

Prediction, however, reaches only about 4% GPU utilization, and only for a fraction of the time needed for one iteration (across 10 samples). Most of the time it appears to run on the CPU, with one core constantly at 100%. With 2 classes it takes around 0.4 seconds; with 10 classes, 3.4 seconds. Running the same prediction with the purely CPU-based scikit-learn is faster, at only 0.1 seconds.
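
For anyone who wants to reproduce timings like these, here is a minimal sketch of a timing helper. It is not the exact script used for the numbers above: the name `time_predict` and the `repeats` parameter are made up for illustration, and it uses only the Python standard library.

```python
import statistics
import time


def time_predict(predict_fn, data, repeats=5):
    """Return the median wall-clock seconds of predict_fn(data) over `repeats` runs.

    `predict_fn` is any callable taking the data, e.g. a fitted model's
    `predict` method (hypothetical helper, not part of h2o4gpu or scikit-learn).
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()   # high-resolution monotonic clock
        predict_fn(data)              # result is discarded; only the time matters
        durations.append(time.perf_counter() - start)
    # The median is less sensitive to a slow first call than the mean.
    return statistics.median(durations)
```

It could be called as `time_predict(model.predict, x_test)` for the h2o4gpu model, and in the same way for a scikit-learn `RandomForestClassifier` fitted on the same data, so both backends are measured identically.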

Is this a general problem of tree-based predictions or am I doing something wrong?

Thanks a lot in advance!
