What is the problem?
import ray
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.datasets import make_classification
import numpy as np
X, y = make_classification(
    n_samples=11000,
    n_features=1000,
    n_informative=50,
    n_redundant=0,
    n_classes=10,
    class_sep=2.5)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=1000)
# Example parameters to tune from SGDClassifier
parameter_grid = {"alpha": [1e-4, 1e-1, 1], "epsilon": [0.01, 0.1]}
from sklearn.model_selection import GridSearchCV
# n_jobs=-1 would use all cores like Tune does; 4 workers are used here
sklearn_search = GridSearchCV(SGDClassifier(), parameter_grid, n_jobs=4, cv=4)
sklearn_search.fit(x_train, y_train)
This fails with:
--------------------------------------------------------------------------------
LokyProcess-10 failed with traceback:
--------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rliaw/miniconda3/envs/test/lib/python3.7/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 197, in <module>
    prep_data = pickle.load(from_parent)
ValueError: unsupported pickle protocol: 5
--------------------------------------------------------------------------------
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
The exit codes of the workers are {EXIT(1)}
This happens because GridSearchCV uses Joblib, a library for parallel computation. Joblib relies on multiprocessing, so it has an IPC layer that uses cloudpickle on the client side to transfer data to child processes. I think these clients accidentally pick up some attributes from Ray's cloudpickle, which breaks deserialization in the workers.
Can we keep Ray's cloudpickle port from affecting third-party libraries?
cc @suquark
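For reference, the `ValueError` in the traceback means the worker was handed a pickle whose header declares protocol 5, which that worker's `pickle` module cannot read. Below is a minimal sketch, using only the standard library, for checking which protocol a payload declares. `pickle_protocol` and `can_load` are hypothetical helper names for illustration; they are not part of joblib, loky, or Ray.

```python
import pickle


def pickle_protocol(data: bytes) -> int:
    """Return the protocol number declared in a pickle's header.

    Pickles written with protocol 2 or later start with the PROTO opcode
    (0x80) followed by one byte holding the protocol number. Protocols 0
    and 1 have no such header, so 0 is returned as a stand-in for both.
    """
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return 0


def can_load(data: bytes) -> bool:
    """True if this interpreter's pickle module supports the declared protocol."""
    return pickle_protocol(data) <= pickle.HIGHEST_PROTOCOL


# Protocol 2 is readable by every Python 3 interpreter.
payload = pickle.dumps({"alpha": 1e-4}, protocol=2)
print(pickle_protocol(payload), can_load(payload))
```

On Python 3.7 `pickle.HIGHEST_PROTOCOL` is 4, so `can_load` would return `False` for a payload that declares protocol 5, which matches the failure above.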
if sys.platform != "win32":
    from ._posix_reduction import _mk_inheritable  # noqa: F401
else:
    from . import _win_reduction  # noqa: F401

# global variable to change the pickler behavior
try:
    from joblib.externals import cloudpickle  # noqa: F401
    DEFAULT_ENV = "cloudpickle"
except ImportError:
    # If cloudpickle is not present, fallback to pickle
    DEFAULT_ENV = "pickle"

ENV_LOKY_PICKLER = os.environ.get("LOKY_PICKLER", DEFAULT_ENV)
_LokyPickler = None
_loky_pickler_name = None
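The selection logic in the snippet above can be restated as a small standalone sketch. `resolve_loky_pickler` is a hypothetical function written only to illustrate the precedence; it is not loky's API, and it takes the environment and the cloudpickle availability as plain arguments so it runs without joblib installed.

```python
import os


def resolve_loky_pickler(environ, cloudpickle_available):
    """Mirror the precedence shown in the snippet (illustrative only).

    An explicit LOKY_PICKLER environment variable wins; otherwise the
    default is "cloudpickle" when it can be imported, else "pickle".
    """
    default_env = "cloudpickle" if cloudpickle_available else "pickle"
    return environ.get("LOKY_PICKLER", default_env)


# With no override, the vendored cloudpickle is selected when present.
print(resolve_loky_pickler({}, cloudpickle_available=True))
# An explicit override takes precedence over the default.
print(resolve_loky_pickler({"LOKY_PICKLER": "pickle"}, cloudpickle_available=True))
# The same lookup against the real process environment.
print(resolve_loky_pickler(os.environ, cloudpickle_available=True))
```

Judging from the snippet alone, exporting `LOKY_PICKLER=pickle` before joblib is imported might be one way to sidestep the cloudpickle path, but I have not verified that it avoids the error reported here.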