-
Notifications
You must be signed in to change notification settings - Fork 69
Fix LISI error for small values #188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. I like that you're only changing the input to the cpp part, and not the stored value. Is this tested?
@lazappi could you test this solves your problem? |
Trying to test it but running into another error which is probably unrelated: /Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/site-packages/anndata/_core/anndata.py:21: FutureWarning: pandas.core.index is deprecated and will be removed in a future version. The public classes are available in the top-level namespace.
from pandas.core.index import RangeIndex
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Traceback (most recent call last):
File "scripts/metrics.py", line 248, in <module>
trajectory_=trajectory_
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1888, in metrics
multiprocessing = True, verbose=verbose)
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1504, in lisi_graph
multiprocessing = multiprocessing, nodes = nodes, verbose=verbose)
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1390, in lisi_graph_py
n_cpu = multiprocessing.cpu_count()
AttributeError: 'bool' object has no attribute 'cpu_count' No error if I do it manually: >>> import multiprocessing
>>> multiprocessing.cpu_count()
12 I'll see if I can sort it out unless there are any suggestions. Side note. The number of cores isn't passed from Snakemake so the metrics uses all of them which is a pet hate. |
This might be due to refactoring of imports done by @mumichae yesterday. |
Phew, sometimes I wonder how the code has ever run properly. The issue is unrelated, actually. It's rather bad coding style, because I |
did you change the variable name |
Ok, I'm glad I hadn't completely messed something up. After fixing the multiprocessing thing I still get the original error: Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
malformed matrix line 83 2661 1.700000000000000e-308
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1263, in compute_simpson_index_graph
if stat(input_path + '_indices_'+ str(chunk_no) + '.txt').st_size == 0:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lisi_tmp1602242516/_indices_9.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "scripts/metrics.py", line 248, in <module>
trajectory_=trajectory_
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1889, in metrics
multiprocessing = True, verbose=verbose)
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1505, in lisi_graph
multiprocessing = multiprocessing, nodes = nodes, verbose=verbose)
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1438, in lisi_graph_py
count))
File "/Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "/Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lisi_tmp1602242516/_indices_9.txt' The value is exactly |
Reasonable. Changed it to |
Still no luck 😿 Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
Trying to set attribute `.obs` of view, copying.
malformed matrix line 83 2661 2.000000000000000e-308
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/Users/luke.zappia/miniconda3/envs/scIB-python/lib/python3.7/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/Users/luke.zappia/Documents/Code/GitHub-theislab/scib/scIB/metrics.py", line 1263, in compute_simpson_index_graph
if stat(input_path + '_indices_'+ str(chunk_no) + '.txt').st_size == 0:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/lisi_tmp1602244590/_indices_7.txt'
""" |
I got this working with setting small values to 3e-308. I also added a print message about values being replaced. I'm wondering if we should also change the threshold for small values to 3e-308 instead of 1.7e-308? |
Awesome! Thank you for taking the time and looking into this!
Yes, that would be way more consistent. |
Is this ready to be merged? |
Yes, I think so. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to 0 connectivity values in this replacement?
I replace some values in the connectivities matrix by the minimum value allowed as double in C++ (i.e.
1.7e-308
). Code is tested on two examples, however, I was unable to reproduce Luke's error message, when I set some values in the connectivities matrix to values smaller than1.7e-308
.