Tensorflow testing script cannot import numpy whereas it is installed · Issue #3569 · tensorflow/tensorflow · GitHub

Tensorflow testing script cannot import numpy whereas it is installed #3569


Closed
jplu opened this issue Jul 29, 2016 · 22 comments
Labels
stat:awaiting response Status - Awaiting response from author

Comments

@jplu commented Jul 29, 2016

Hello,

I get the following error when I try to build the pip package from source and run the tests.

Environment info

Operating System: Ubuntu 14.04

Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):

ls -l /usr/local/cuda/lib64/libcud*
-rw-r--r-- 1 root root   322936 Aug 15  2015 /usr/local/cuda/lib64/libcudadevrt.a
lrwxrwxrwx 1 root root       16 Aug 15  2015 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.7.5
lrwxrwxrwx 1 root root       19 Aug 15  2015 /usr/local/cuda/lib64/libcudart.so.7.5 -> libcudart.so.7.5.18
-rwxr-xr-x 1 root root   383336 Aug 15  2015 /usr/local/cuda/lib64/libcudart.so.7.5.18
-rw-r--r-- 1 root root   720192 Aug 15  2015 /usr/local/cuda/lib64/libcudart_static.a
lrwxrwxrwx 1 root root       13 Feb  9 18:48 /usr/local/cuda/lib64/libcudnn.so -> libcudnn.so.4
lrwxrwxrwx 1 root root       17 Feb  9 18:48 /usr/local/cuda/lib64/libcudnn.so.4 -> libcudnn.so.4.0.7
-rwxrwxr-x 1 root root 61453024 Feb  8 23:12 /usr/local/cuda/lib64/libcudnn.so.4.0.7
-rw-rw-r-- 1 root root 62025862 Feb  8 23:12 /usr/local/cuda/lib64/libcudnn_static.a

If installed from sources, provide the commit hash: 5c44302

Steps to reproduce

  1. git clone https://github.com/tensorflow/tensorflow
  2. cd tensorflow
  3. ./configure
  4. bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
  5. bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
  6. bazel test -c opt --config=cuda //tensorflow/python:graph_util_test

Logs or other output that would be helpful

bazel test -c opt --config=cuda //tensorflow/python:graph_util_test
WARNING: Output base '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7' is on NFS. This may lead to surprising failures and undetermined behavior.
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/protobuf/WORKSPACE:1: Workspace name in /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/protobuf/WORKSPACE (@__main__) does not match the name given in the repository's definition (@protobuf); this will cause a build error in future versions.
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/re2/WORKSPACE:1: Workspace name in /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/re2/WORKSPACE (@__main__) does not match the name given in the repository's definition (@re2); this will cause a build error in future versions.
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/highwayhash/WORKSPACE:1: Workspace name in /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/highwayhash/WORKSPACE (@__main__) does not match the name given in the repository's definition (@highwayhash); this will cause a build error in future versions.
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/WORKSPACE:1: Workspace name in /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/WORKSPACE (@__main__) does not match the name given in the repository's definition (@gemmlowp); this will cause a build error in future versions.
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/bit_depth.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/gemmlowp.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/map.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/output_stages.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/profiling/instrumentation.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/profiling/profiler.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/plu/git/tensorflow/util/python/BUILD:11:16: in includes attribute of cc_library rule //util/python:python_headers: 'python_include' resolves to 'util/python/python_include' not in 'third_party'. This will be an error in the future.
INFO: Found 1 test target...
Slow read: a 160721045-byte read from /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/_pywrap_tensorflow.so took 13761ms.
FAIL: //tensorflow/python:graph_util_test (see /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/testlogs/tensorflow/python/graph_util_test/test.log).
Target //tensorflow/python:graph_util_test up-to-date:
  bazel-bin/tensorflow/python/graph_util_test
INFO: Elapsed time: 57.357s, Critical Path: 44.71s
//tensorflow/python:graph_util_test                                      FAILED in 0.5s
  /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/testlogs/tensorflow/python/graph_util_test/test.log
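
As a side note, bazel can also print a failing test's output directly on the console instead of only writing it to test.log; a minimal sketch, assuming the same target as above:

# Print the output of failing tests directly in the console
bazel test -c opt --config=cuda --test_output=errors //tensorflow/python:graph_util_test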

And here is the content of the test.log file:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
-----------------------------------------------------------------------------
Traceback (most recent call last):
  File "/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/python/framework/graph_util_test.py", line 21, in <module>
    import tensorflow as tf
  File "/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/python/__init__.py", line 45, in <module>
    import numpy as np
ImportError: No module named numpy

But I do have numpy installed:

python 
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy;numpy.version.version
'1.11.1'
>>>

I do not understand why I get this Python error. Any idea what is going wrong?

Thanks in advance for any help!

@michaelisard

@martinwicke would you take a look?

@martinwicke (Member)

In the configure step, did you specify the same python you're usually using? My guess would be that you're getting a different installation of python, without numpy installed.
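
A minimal sketch of that check, assuming /usr/bin/python is the interpreter given to ./configure (adjust the path to whatever you actually entered):

# Which python is first on $PATH, i.e. the default offered by ./configure
which python
# Can the interpreter passed to ./configure actually import numpy?
/usr/bin/python -c "import numpy; print(numpy.version.version)"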

martinwicke added the stat:awaiting response label on Aug 8, 2016
@jplu (Author) commented Aug 8, 2016

Oh, that is a good guess! Unfortunately the one specified in configure is the right one :-(

My own guess is that it is somehow related to issue #2703: I think my $PYTHONPATH variable is not loaded in the test environment, so it cannot find all my libs. I will do some tests to check that and keep you posted.

@jplu (Author) commented Aug 8, 2016

My guess was right: my Python environment is not loaded properly. I modified the file

local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/python/__init__.py

To add these two lines at the beginning:

import sys
print(sys.path)

And this is what I have in the log file:

['/home/plu/git/tensorflow/tensorflow/python/framework', '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles', '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/protobuf/python', '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow', '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/six_archive', '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/protobuf', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']

These three paths are missing:

'/home/plu/git/deepdetect/clients/python', '/usr/local/python/lib/python2.7/site-packages', '/home/plu/git/caffe/python'

They correspond to my $PYTHONPATH environment variable.
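
For anyone making the same comparison, a small shell sketch (the file names are hypothetical, and syspath_entries.txt is the sys.path dump above pasted one entry per line):

# PYTHONPATH entries, one per line
echo "$PYTHONPATH" | tr ':' '\n' > pythonpath_entries.txt
# Show the PYTHONPATH entries that never appear in the test's sys.path
grep -F -x -v -f syspath_entries.txt pythonpath_entries.txt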

@martinwicke (Member)

@damienmg we have seen this before, right? I couldn't find an open bug for bazel, and I forget whether this is intended behavior (it doesn't look like intended behavior to me).

girving added the stat:awaiting tensorflower label and removed the stat:awaiting response label on Aug 8, 2016
@damienmg (Contributor)

@martinwicke this is an issue with the environment being stripped; @aehlig is working on a principled solution to it.

@martinwicke (Member)

@aehlig let me know when a fix is available in bazel and when that would hit a release so we can recommend a new minimum version.

@aehlig commented Sep 14, 2016

@aehlig let me know when a fix is available in bazel

@martinwicke: The "principled solution" @dmarting was talking about is the environment variable design: https://www.bazel.io/designs/2016/06/21/environment.html

The first working implementation of the --action_env flag just hit the source tree: bazelbuild/bazel@6f33a1c

and when that would hit a release so we can recommend a new minimum version.

It will be included in the 0.4 release, expected to be released in October.

Regards,
Klaus
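
To make this concrete, once a bazel with that change is available, forwarding the variable on the command line would presumably look something like this (a sketch only, not verified against this build):

# Forward the client's PYTHONPATH into bazel's action environment
bazel test -c opt --config=cuda --action_env=PYTHONPATH //tensorflow/python:graph_util_test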


@martinwicke (Member)

OK. We'll wait for 0.4, make that the minimum version, and add --action_env to the bazelrc.
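
For reference, the bazelrc entries could look roughly like this; whether they belong in tools/bazel.rc or a user .bazelrc is an assumption here, and --test_env is included only as a guess for the test case:

# Hypothetical bazelrc entries forwarding the user's Python environment
build --action_env=PYTHONPATH
test --test_env=PYTHONPATH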

@damienmg (Contributor) commented Oct 4, 2016

FYI, this flag is in the 0.3.2 release candidate, so it should be out within the next week.

martinwicke added the stat:awaiting response label and removed the stat:awaiting tensorflower label on Oct 26, 2016
@martinwicke (Member)

jplu@, can you try this again? We are now building with bazel 0.3.2, so this may work now, or at least we have the opportunity to make it work if it doesn't.

@jplu (Author) commented Oct 27, 2016

Thanks!! I will try today with bazel 0.3.2 and let you know if everything is now ok.

@jplu (Author) commented Oct 28, 2016

Hello,

I now have an issue compiling TensorFlow with 0.3.2:

./configure
/home/plu/git/tensorflow /home/plu/git/tensorflow
Please specify the location of python. [Default is /usr/bin/python]: 
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
No Hadoop File System support will be enabled for TensorFlow
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /home/plu/git/caffe/python
  /usr/lib/python2.7/dist-packages
  /usr/local/python/lib/python2.7/site-packages
  /home/plu/git/deepdetect/clients/python
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]

/usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 
Please specify the location where CUDA  toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 
Please specify the location where cuDNN  library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
libcudnn.so resolves to libcudnn.4
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 
WARNING: Output base '/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7' is on NFS. This may lead to surprising failures and undetermined behavior.
Extracting Bazel installation...
..............
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
ERROR: /homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/.nfs000000000000e93b00000021 (Device or resource busy).

I have this error every time I try to run configure even after doing rm -rf /homes/plu/.cache/bazel/_bazel_plu/*.

Is the problem that my desktop machine is entirely on a remote file system?
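
If NFS turns out to be the culprit, one workaround sketch (the /tmp path below is only an example) is to move bazel's output tree to a local disk with a startup option:

# Keep bazel's output base on local disk instead of the NFS-mounted home directory
bazel --output_user_root=/tmp/bazel_plu clean
bazel --output_user_root=/tmp/bazel_plu test -c opt --config=cuda //tensorflow/python:graph_util_test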

@martinwicke (Member)

I believe that may be the problem (see the warning in the output). You should check on the bazel issue tracker.


@jplu (Author) commented Oct 31, 2016

Apparently it is indeed a known issue: bazelbuild/bazel#1970. I am going to try the fix and let you know.

@jplu (Author) commented Oct 31, 2016

The fix in the linked issue solved the compilation error, but I still have the same issue as at the beginning :(

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
-----------------------------------------------------------------------------
Traceback (most recent call last):
  File "/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/execroot/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/python/framework/graph_util_test.py", line 21, in <module>
    import tensorflow as tf
  File "/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/execroot/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/execroot/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/python/graph_util_test.runfiles/org_tensorflow/tensorflow/python/__init__.py", line 44, in <module>
    import numpy as np
ImportError: No module named numpy
/homes/plu/.cache/bazel/_bazel_plu/5fecb6612d2e95475ff53a54e377c3f7/execroot/tensorflow/bazel-out/local_linux-opt/testlogs/tensorflow/python/graph_util_test/test.log (END)

I think it is due to the configure step, as I have selected only one Python library path. How is it possible to select multiple paths?

aselle removed the stat:awaiting response label on Nov 2, 2016
@martinwicke (Member)

This is odd -- we are using several paths from PYTHONPATH. Is PYTHONPATH different between running configure and at runtime?
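
One way to check, sketched under the assumption that the bazel version in use already supports forwarding variables to tests:

# Value visible in the shell where ./configure was run
echo "$PYTHONPATH"
# Forward that same value into the test's environment and re-run the failing test
bazel test -c opt --config=cuda --test_env=PYTHONPATH //tensorflow/python:graph_util_test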

@aehlig commented Nov 29, 2016 via email

@jplu (Author) commented Dec 2, 2016

@martinwicke What I can see is that my PYTHONPATH is missing when I follow the same process as in my previous comment: #3569 (comment)

@jplu (Author) commented Jan 5, 2017

Hello, any update on this issue?

@martinwicke (Member)

So the bazel fix you tried did not work?

drpngx added the stat:awaiting response label on Jan 23, 2017
@drpngx (Contributor) commented Jan 23, 2017

@jplu feel free to open a new issue if the problem persists.

drpngx closed this as completed on Jan 23, 2017