Save module dependencies not in conda/pip #237

nishadsingh1 · 2017-06-29T02:08:51Z

Resolves #207

Seems to work with prediction functions of both of the following forms (where test_util.py is a module not available through conda or pip that defines a COEFFICIENT=<int> variable):

import test_util
func = lambda x : len(x) * test_util.COEFFICIENT

import test_util as tu
func = lambda x : len(x) * tu.COEFFICIENT

nishadsingh1 · 2017-06-29T02:09:41Z

containers/python/Dockerfile

@@ -3,7 +3,7 @@ FROM clipper/py-rpc:latest
 MAINTAINER Dan Crankshaw <dscrankshaw@gmail.com>


Building this no longer works, but I'm also not sure why this file is around in the first place.

This won't work because we can't reference files outside of the Dockerfile's build context. The context directory (the one containing the Dockerfile) must contain clipper_admin as a subdirectory.

Right. This file should be deleted.

AmplabJenkins · 2017-06-29T02:36:52Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/499/
Test FAILed.

AmplabJenkins · 2017-06-29T18:48:51Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/501/
Test FAILed.

AmplabJenkins · 2017-06-29T19:34:03Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/503/
Test FAILed.

AmplabJenkins · 2017-06-29T21:07:45Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/504/
Test FAILed.

AmplabJenkins · 2017-06-29T21:35:01Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/505/
Test FAILed.

nishadsingh1 · 2017-06-29T23:08:02Z

Looks like something's still going wrong with this PR; not ready for review yet.

AmplabJenkins · 2017-06-29T23:41:16Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/506/
Test FAILed.

AmplabJenkins · 2017-07-06T22:46:17Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/520/
Test FAILed.

nishadsingh1 · 2017-07-06T22:53:25Z

jenkins test this please

AmplabJenkins · 2017-07-06T23:01:30Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/521/
Test FAILed.

AmplabJenkins · 2017-07-06T23:07:21Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/522/
Test FAILed.

AmplabJenkins · 2017-07-07T06:23:00Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/525/
Test FAILed.

AmplabJenkins · 2017-07-07T07:16:53Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/526/
Test PASSed.

AmplabJenkins · 2017-07-11T09:30:30Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/554/
Test PASSed.

AmplabJenkins · 2017-07-11T09:53:13Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/555/
Test PASSed.

AmplabJenkins · 2017-07-11T17:16:46Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/556/
Test PASSed.

AmplabJenkins · 2017-07-11T17:46:03Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/557/
Test PASSed.

nishadsingh1 · 2017-07-11T18:06:14Z

Importing modules from packages (import util.mock_module) now uses a different mechanism than importing modules directly by having them in your path. That's why the integration test suites import the two modules in different ways.

In clipper_manager_tests.py, there's only one test that queries the python-container, test_deployed_and_linked_predict_function_queried_successfully. This test now makes use of both local module importing mechanisms. There isn't a separate test for querying a python-container that uses these because the bulk of the interesting work being done by the deploy_predict_func feature is package management.

By contrast, the old pyspark-container tests remain intact and a new one has been added to test the local module importing. The deploy_pyspark_model feature needs to coordinate loading a pyspark model into the container, and testing that behavior independent of package management might clarify where bugs exist if/when they pop up.

AmplabJenkins · 2017-07-11T18:06:35Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/558/
Test FAILed.

AmplabJenkins · 2017-07-11T18:06:40Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/559/
Test FAILed.

nishadsingh1 · 2017-07-11T18:25:28Z

clipper_admin/clipper_manager.py

+                mda.ignore(module_name)
+        module_paths = mda.get_and_clear_paths()
+
+        def path_already_captured(path):


Let's say your predict function makes use of a module imported through a package: import my_package.my_module. Right now, we add the path to .../my_package/ to paths_to_copy and plan on recursively copying the whole folder to the container later. The location of my_package on the container would be /model/modules/my_package. It will be importable because the container code adds /model/modules to the container's sys.path.

We also add that module name, my_package.my_module, to our Module Dependency Analyzer to find all the files that the module depends on; it's possible that some don't exist in the folder .../my_package/ and we still want to copy them to the container. For each file with path .../<file_name>, its copy location on the container will be /model/modules/<file_name>, and these will be directly importable because the container code adds /model/modules to the container's sys.path.

To get the paths to all these files, we run module_paths = mda.get_and_clear_paths(). module_paths will certainly include files contained in my_package/.... To avoid copying multiple versions of these files, we check here whether each is already captured (using path_already_captured), and don't add it to the set of paths we'll copy if it is.
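To make the de-duplication concrete, here is a minimal sketch of the idea, not the code in this PR. The two-argument signature of path_already_captured, the helper merge_paths, and the example paths are all assumptions made for illustration; the real function is a closure inside clipper_manager.py and may differ.

```python
import os


def path_already_captured(path, captured_dirs):
    """Return True if `path` is, or lies inside, any directory in `captured_dirs`.

    Hypothetical signature: the function in the PR takes only `path`.
    """
    path = os.path.abspath(path)
    for directory in captured_dirs:
        directory = os.path.abspath(directory)
        if path == directory or path.startswith(directory + os.sep):
            return True
    return False


def merge_paths(paths_to_copy, module_paths):
    """Add each module path to the copy set unless a captured folder covers it."""
    captured = set(paths_to_copy)
    result = set(paths_to_copy)
    for module_path in module_paths:
        if not path_already_captured(module_path, captured):
            result.add(module_path)
    return result


# Illustrative paths only; nothing is read from disk.
paths_to_copy = {"/home/user/my_package"}
module_paths = [
    "/home/user/my_package/my_module.py",  # covered by the package folder
    "/home/user/other/helper.py",          # outside it, so copied on its own
]
result = merge_paths(paths_to_copy, module_paths)
```

Comparing against `directory + os.sep` rather than the bare prefix keeps a sibling folder such as /home/user/my_package_v2 from being treated as already captured.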

This seems good in most cases because it prevents us from doing a bunch of unnecessary copies. However, there's a case (seems kind of unlikely) where it will cause an ImportError. Let's say that your prediction function makes use of two modules: one imported via package (import my_package.my_module), and one imported directly by adding its enclosing folder to your path and running import my_module_2. Let's say that my_module_2 resides under my_package/.

Now, we'll run into an ImportError in the container because it won't be able to import the module with name my_module_2; it expects the module file to be in its path (probably at /model/modules/my_module_2.py), but we prevented it from getting copied there because we were already going to copy it to /model/modules/my_package/my_module_2.py.
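The failure can be reproduced outside the container with a rough sketch like the one below. It assumes a temporary directory standing in for /model/modules and writes placeholder module files; build_container_layout and try_import are hypothetical helpers, not part of Clipper.

```python
import importlib
import os
import sys
import tempfile


def build_container_layout(root):
    """Mimic the container after de-duplication: both files live only under my_package/."""
    package_dir = os.path.join(root, "my_package")
    os.makedirs(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(package_dir, "my_module.py"), "w") as f:
        f.write("VALUE = 1\n")
    with open(os.path.join(package_dir, "my_module_2.py"), "w") as f:
        f.write("VALUE = 2\n")


def try_import(name):
    """Return True if `name` imports cleanly, False if it raises ImportError."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Stand-in for /model/modules, which the container code adds to sys.path.
modules_dir = tempfile.mkdtemp()
build_container_layout(modules_dir)
sys.path.insert(0, modules_dir)
importlib.invalidate_caches()

package_import_ok = try_import("my_package.my_module")  # found via the package
direct_import_ok = try_import("my_module_2")            # no top-level copy exists
```

Here the package-based import succeeds while the direct import fails, because my_module_2.py is only reachable as my_package.my_module_2.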

I'm not sure how much we care about this case. If we don't think it's costly to have a bunch of unnecessary file copies, then I think we should remove the code responsible for avoiding them (lines 736-744 here). Do you have any thoughts?

We discussed this in person, but to re-iterate, let's keep the de-duplication logic (that seems useful).

Let's support the case where users import packages, but not try to support the case where users add the enclosing folder to sys.path. That's sort of an anti-pattern in Python (despite the fact we make occasional use of it for good reasons within this repo) and it's okay to not support it.

AmplabJenkins · 2017-07-11T18:29:57Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/560/
Test PASSed.

nishadsingh1 · 2017-07-12T08:36:19Z

Forgot to ping you earlier @dcrankshaw -- this is now ready for review again.

dcrankshaw

LGTM

AmplabJenkins · 2017-07-13T04:58:07Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/566/
Test FAILed.

dcrankshaw · 2017-07-13T04:59:52Z

jenkins test this please

AmplabJenkins · 2017-07-13T05:35:54Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/569/
Test PASSed.

nishadsingh1 added component: clipper_admin status: in progress type: enhancement labels Jun 29, 2017

nishadsingh1 self-assigned this Jun 29, 2017

nishadsingh1 commented Jun 29, 2017

nishadsingh1 requested a review from dcrankshaw June 29, 2017 02:10

nishadsingh1 force-pushed the save_mod_deps branch from 3d234fd to 0b4320a Compare June 29, 2017 18:56

dcrankshaw removed component: clipper_admin labels Jun 29, 2017

nishadsingh1 force-pushed the save_mod_deps branch from 0b4320a to d047257 Compare June 29, 2017 20:40

nishadsingh1 force-pushed the save_mod_deps branch from d047257 to 859d479 Compare June 29, 2017 21:11

nishadsingh1 force-pushed the save_mod_deps branch from 859d479 to 46f8f63 Compare June 29, 2017 23:11

nishadsingh1 force-pushed the save_mod_deps branch from 46f8f63 to f6249ff Compare July 6, 2017 22:36

nishadsingh1 force-pushed the save_mod_deps branch from f6249ff to 3fb0625 Compare July 6, 2017 22:58

nishadsingh1 force-pushed the save_mod_deps branch from 1eb3fc3 to 1adfa6d Compare July 7, 2017 08:11

nishadsingh1 added the status: needs review label Jul 7, 2017

nishadsingh1 added status: needs review and removed status: needs revision labels Jul 11, 2017

nishadsingh1 force-pushed the save_mod_deps branch from 2c26f76 to cd5e7ee Compare July 11, 2017 16:51

nishadsingh1 force-pushed the save_mod_deps branch 2 times, most recently from 2ac6167 to 1357e17 Compare July 11, 2017 17:50

Nishad Singh added 6 commits July 11, 2017 10:52

Python function deployment supplies local modules

4875733

Cleanup

64d0a5f

Address comments + can now supply local package-based modules correctly

1cb985b

Removed unnecessary import and updated test module documentation

f4c5cd1

Cleanup

a76af12

Pyspark container test for local module imports added

836fafe

nishadsingh1 force-pushed the save_mod_deps branch from 1357e17 to 836fafe Compare July 11, 2017 17:57

nishadsingh1 commented Jul 11, 2017

dcrankshaw approved these changes Jul 13, 2017

Merge branch 'develop' into save_mod_deps

2933c19

dcrankshaw added status: accepted and removed status: needs review labels Jul 13, 2017

dcrankshaw merged commit 5115239 into ucbrise:develop Jul 13, 2017
