8000 "ValueError: generator already executing" when running dannce-predict with 3 cameras · Issue #62 · spoonsso/dannce · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

"ValueError: generator already executing" when running dannce-predict with 3 cameras #62


Closed
xmikezheng20 opened this issue Aug 3, 2021 · 13 comments

Comments

@xmikezheng20

Hi,
We're using DANNCE with a 3-camera setup. We'd like to use the pre-trained 6-camera network (weights.rat.AVG.6cam.hdf5) and finetune the model. According to the wiki, DANNCE should duplicate the 3 views to feed the 6 views the model expects. However, while dannce-train works well with the default n_rand_views, dannce-predict runs into the error shown in the log below.
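
For reference, here are the settings most relevant to this setup, excerpted from the parameter dump in the log (values exactly as printed there):

    # relevant excerpt of our config
    net_type: AVG
    train_mode: finetune
    n_views: 6                                   # pretrained 6-camera network
    dannce_finetune_weights: .\DANNCE\weights\   # contains weights.rat.AVG.6cam.hdf5
    n_rand_views: 0                              # the default

The full dannce-predict output is: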

    2021-08-03 16:45:36.202237: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
    io_config not found in io.yaml file, falling back to main config
    new_n_channels_out not found in io.yaml file, falling back to main config
    batch_size not found in io.yaml file, falling back to main config
    epochs not found in io.yaml file, falling back to main config
    net_type not found in io.yaml file, falling back to main config
    train_mode not found in io.yaml file, falling back to main config
    num_validation_per_exp not found in io.yaml file, falling back to main config
    vol_size not found in io.yaml file, falling back to main config
    nvox not found in io.yaml file, falling back to main config
    max_num_samples not found in io.yaml file, falling back to main config
    dannce_finetune_weights not found in io.yaml file, falling back to main config
    com_train_dir set to: .\COM\train_results\
    com_predict_dir set to: .\COM\predict_results\
    dannce_train_dir set to: .\DANNCE\train_results\
    dannce_predict_dir set to: .\DANNCE\predict_results\
    exp set to: [{'label3d_file': './20210803_142811_Label3D_dannce.mat'}]
    io_config set to: io.yaml
    new_n_channels_out set to: 16
    batch_size set to: 1
    epochs set to: 100
    net_type set to: AVG
    train_mode set to: finetune
    num_validation_per_exp set to: 4
    vol_size set to: 120
    nvox set to: 64
    max_num_samples set to: 100
    dannce_finetune_weights set to: .\DANNCE\weights\
    base_config set to: C:\Users\banerjeelab\Projects\dannce\configs\dannce_mouse_config.yaml
    viddir set to: videos
    crop_height set to: None
    crop_width set to: None
    camnames set to: None
    n_channels_out set to: 20
    sigma set to: 10
    verbose set to: 1
    net set to: None
    gpu_id set to: 0
    immode set to: vid
    mono set to: False
    mirror set to: False
    start_batch set to: 0
    start_sample set to: None
    com_fromlabels set to: False
    medfilt_window set to: None
    com_file set to: None
    new_last_kernel_size set to: [3, 3, 3]
    n_layers_locked set to: 2
    vmin set to: None
    vmax set to: None
    interp set to: nearest
    depth set to: False
    comthresh set to: 0
    weighted set to: False
    com_method set to: median
    cthresh set to: None
    channel_combo set to: None
    predict_mode set to: torch
    n_views set to: 6
    dannce_predict_model set to: None
    expval set to: None
    from_weights set to: None
    write_npy set to: None
    loss set to: mask_nan_keep_loss
    n_channels_in set to: None
    extension set to: None
    vid_dir_flag set to: None
    num_train_per_exp set to: None
    chunks set to: None
    lockfirst set to: None
    load_valid set to: None
    augment_hue set to: False
    augment_brightness set to: False
    augment_hue_val set to: 0.05
    augment_bright_val set to: 0.05
    augment_rotation_val set to: 5
    drop_landmark set to: None
    raw_im_h set to: None
    raw_im_w set to: None
    n_instances set to: 1
    use_npy set to: False
    data_split_seed set to: None
    valid_exp set to: None
    metric set to: ['euclidean_distance_3D']
    lr set to: 0.001
    rotate set to: True
    augment_continuous_rotation set to: False
    com_thresh set to: None
    cam3_train set to: None
    debug_volume_tifdir set to: None
    downfac set to: None
    dannce_predict_vol_tifdir set to: None
    n_rand_views set to: 0
    rand_view_replace set to: True
    multi_gpu_train set to: False
    heatmap_reg set to: False
    heatmap_reg_coeff set to: 0.01
    save_pred_targets set to: False
    Using the following *dannce.mat files: .\20210803_142811_Label3D_dannce.mat
    Setting vid_dir_flag to True.
    Setting extension to .mp4.
    Setting chunks to {'Camera0': array([0]), 'Camera1': array([0]), 'Camera2': array([0])}.
    Setting n_channels_in to 3.
    Setting raw_im_h to 1024.
    Setting raw_im_w to 1152.
    Setting expval to True.
    Setting net to finetune_AVG.
    Setting crop_height to [0, 1024].
    Setting crop_width to [0, 1152].
    Setting maxbatch to 100.
    Setting start_batch to 0.
    Setting vmin to -60.0.
    Setting vmax to 60.0.
    Using the following *dannce.mat files: .\20210803_142811_Label3D_dannce.mat
    Using torch predict mode
    Using camnames: ['Camera0', 'Camera1', 'Camera2']
    Experiment 0 using com3d: .\20210803_142811_Label3D_dannce.mat
    Removed 909 samples from the dataset because they either had COM positions over cthresh, or did not have matching sampleIDs in the COM file
    Saving 3D COM to .\DANNCE\predict_results\com3d_used.mat
    None
    2021-08-03 16:45:38.577213: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2021-08-03 16:45:38.583397: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x29558745c20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2021-08-03 16:45:38.583437: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
    2021-08-03 16:45:38.585008: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
    2021-08-03 16:45:38.608816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: Quadro RTX 4000 computeCapability: 7.5
    coreClock: 1.545GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 387.49GiB/s
    2021-08-03 16:45:38.608947: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
    2021-08-03 16:45:38.609489: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
    2021-08-03 16:45:38.609579: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
    2021-08-03 16:45:38.609668: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
    2021-08-03 16:45:38.609769: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
    2021-08-03 16:45:38.609861: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
    2021-08-03 16:45:38.609963: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
    2021-08-03 16:45:38.610081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
    2021-08-03 16:45:39.020185: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-08-03 16:45:39.020271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0
    2021-08-03 16:45:39.020294: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N
    2021-08-03 16:45:39.021020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3686 MB memory) -> physical GPU (device: 0, name: Quadro RTX 4000, pci bus id: 0000:01:00.0, compute capability: 7.5)
    2021-08-03 16:45:39.023501: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x29503d12ec0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
    2021-08-03 16:45:39.023585: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Quadro RTX 4000, Compute Capability 7.5
    Init took 0.44759035110473633 sec.
    Initializing Network...
    Loading model from .\DANNCE\train_results\weights.97-21.50072.hdf5
    Predicting on batch 0
    c:\users\banerjeelab\projects\dannce\dannce\engine\generator.py:1221: UserWarning: Note: ignoring dimension mismatch in 3D labels
      warnings.warn(msg)
    Loading new video: videos\Camera1\0.mp4 for 0_Camera1
    Loading new video: videos\Camera0\0.mp4 for 0_Camera0
    Loading new video: videos\Camera1\0.mp4 for 0_Camera1
    Loading new video: videos\Camera2\0.mp4 for 0_Camera2
    Loading new video: videos\Camera0\0.mp4 for 0_Camera0
    Loading new video: videos\Camera2\0.mp4 for 0_Camera2
    Traceback (most recent call last):
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\Scripts\dannce-predict-script.py", line 33, in <module>
        sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-predict')())
      File "c:\users\banerjeelab\projects\dannce\dannce\cli.py", line 54, in dannce_predict_cli
        dannce_predict(params)
      File "c:\users\banerjeelab\projects\dannce\dannce\interface.py", line 1596, in dannce_predict
        n_chn,
      File "c:\users\banerjeelab\projects\dannce\dannce\engine\inference.py", line 696, in infer_dannce
        ims = generator.__getitem__(i)
      File "c:\users\banerjeelab\projects\dannce\dannce\engine\generator.py", line 966, in __getitem__
        X, y = self.__data_generation(list_IDs_temp)
      File "c:\users\banerjeelab\projects\dannce\dannce\engine\generator.py", line 1258, in __data_generation
        result = self.threadpool.starmap(self.project_grid, arglist)
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 276, in starmap
        return self._map_async(func, iterable, starmapstar, chunksize).get()
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 657, in get
        raise self._value
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 121, in worker
        result = (True, func(*args, **kwds))
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 47, in starmapstar
        return list(itertools.starmap(args[0], args[1]))
      File "c:\users\banerjeelab\projects\dannce\dannce\engine\generator.py", line 1028, in project_grid
        extension=self.extension,
      File "c:\users\banerjeelab\projects\dannce\dannce\engine\video.py", line 231, in load_vid_frame
        self.currvideo[camname].close() if self.predict_flag else \
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\lib\site-packages\imageio\core\format.py", line 259, in close
        self._close()
      File "C:\Users\banerjeelab\anaconda3\envs\dannce\lib\site-packages\imageio\plugins\ffmpeg.py", line 343, in _close
        self._read_gen.close()
    ValueError: generator already executing

Thank you in advance for your help!!

@spoonsso
Owner
spoonsso commented Aug 5, 2021

It looks like it is complaining about I/O on the same videos from multiple parallel processes. I can look into it, but in any case it is probably easier to just try training/predicting from a pretrained 3-camera model. Here are some links to weights:

weights.rat.MAX.3cam: https://www.dropbox.com/s/2fama0q45sdzwfj/weights_multigpu.30-0.00002.hdf5?dl=0
weights.rat.AVG.3cam: https://www.dropbox.com/s/mb05fqmsbugvhf6/weights_multigpu.30-8.19468_singleGPU.hdf5?dl=0

I would try finetuning both -- you can make an AVG network starting from either set of weights -- and comparing performance. We have found recently that, for finetuning on mouse data, starting from pretrained MAX weights actually works better.
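
As a rough sketch, the finetune side of the config would then look something like this (paths here are placeholders for wherever you put the downloaded weights):

    # sketch: finetuning from the pretrained 3-camera weights
    n_views: 3
    net_type: AVG                                # an AVG net can be finetuned from either set of weights
    train_mode: finetune
    dannce_finetune_weights: ./DANNCE/weights/   # placeholder folder holding the downloaded .hdf5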

Please LMK how it goes!

@xmikezheng20
Author

Thank you for sharing the weights of the pretrained 3-camera models! I'm still aggregating training frames, but with my current ~150 frames, finetuning both networks already works pretty well.

@spoonsso
Owner

Cool, you're welcome. Please let me know if you have any other questions.

@verpeutlab
verpeutlab commented Aug 14, 2021

I substituted the weights with the ones linked above in this issue, and I was able to resolve the indexing problem. However, now I am getting the following error when I run DANNCE with three cameras.

ValueError: Error when checking input: expected input_3 to have shape (64, 64, 64, 18) but got array with shape (64, 64, 64, 9)

Since 18 is twice 9 (six views × 3 color channels vs. three views × 3 channels), it seems that part of the program is still expecting six cameras rather than three. Do you know of a way to address this issue? I am trying to run DANNCE with three cameras without retraining the network. Are there additional weights or training files I need so that DANNCE works effectively with three cameras?

@spoonsso
Owner

Hi @verpeutlab. Likely you are either (1) omitting n_views from your config files or as a command-line arg (n_views defaults to 6 if omitted) or (2) still have n_views: 6 in one of your configs.

The solution is to make sure you have n_views: 3 in your io.yaml.
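
For example, a minimal io.yaml for prediction might contain something like the following (the model path is a placeholder for your own finetuned weights file):

    # minimal io.yaml sketch for dannce-predict with 3 cameras
    n_views: 3
    com_file: ./COM/predict_results/com3d.mat
    dannce_predict_dir: ./DANNCE/predict_results/
    dannce_predict_model: ./DANNCE/train_results/AVG/weights.xxxx.hdf5   # placeholder file name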

If that doesn't work, please send all of the terminal text printed out before the error message.

@verpeutlab
verpeutlab commented Aug 17, 2021

I still got the same error after I placed n_views: 3 in each .yaml file, and here is the text before the error message:

Initializing Network...
Loading model from .\DANNCE\train_results\AVG\weights.1200-12.77642.hdf5
max
25
Predicting on batch 0
Loading new video: videos\Camera1\0.mp4 for 0_Camera1
Loading new video: videos\Camera3\0.mp4 for 0_Camera3
Loading new video: videos\Camera2\0.mp4 for 0_Camera2
Traceback (most recent call last):
File "C:\Users\verpeutlab\anaconda3\envs\dannce\Scripts\dannce-predict-script.py", line 33, in
sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-predict')())
File "c:\users\verpeutlab\desktop\dannce\dannce\cli.py", line 46, in dannce_predict_cli
dannce_predict(params)
File "c:\users\verpeutlab\desktop\dannce\dannce\interface.py", line 1611, in dannce_predict
evaluate_ondemand(start_batch, max_eval_batch, valid_generator)
File "c:\users\verpeutlab\desktop\dannce\dannce\interface.py", line 1493, in evaluate_ondemand
pred = model.predict(ims[0])
File "C:\Users\verpeutlab\anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\engine\training_v1.py", line 992, in predict
use_multiprocessing=use_multiprocessing)
File "C:\Users\verpeutlab\anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\engine\training_arrays.py", line 707, in predict
x, check_steps=True, steps_name='steps', steps=steps)
File "C:\Users\verpeutlab\anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\engine\training_v1.py", line 2334, in _standardize_user_data
batch_size=batch_size)
File "C:\Users\verpeutlab\anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\engine\training_v1.py", line 2361, in _standardize_tensors
exception_prefix='input')
File "C:\Users\verpeutlab\anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\engine\training_utils.py", line 582, in standardize_input_data
str(data_shape))
ValueError: Error when checking input: expected input_3 to have shape (64, 64, 64, 18) but got array with shape (64, 64, 64, 9)

I have also included the parameters listed in the command window, which are printed right after the program begins to run:
com_train_dir set to: .\COM\train_results
com_predict_dir set to: .\COM\predict_results
com_file set to: .\COM\predict_results\com3d.mat
n_views set to: 3
dannce_train_dir set to: .\DANNCE\train_results\AVG
dannce_predict_dir set to: .\DANNCE\predict_results
dannce_predict_model set to: .\DANNCE\train_results\AVG\weights.1200-12.77642.hdf5
exp set to: [{'label3d_file': './label3d_demo.mat'}, {'label3d_file': '../markerless_mouse_2/label3d_demo.mat'}]
io_config set to: io.yaml
new_n_channels_out set to: 22
batch_size set to: 4
epochs set to: 1200
net_type set to: AVG
train_mode set to: finetune
num_validation_per_exp set to: 4
vol_size set to: 120
nvox set to: 64
max_num_samples set to: max
dannce_finetune_weights set to: .\DANNCE\weights
base_config set to: ....\configs\dannce_mouse_config.yaml
viddir set to: videos
crop_height set to: None
crop_width set to: None
camnames set to: None
n_channels_out set to: 20
sigma set to: 10
verbose set to: 1
net set to: None
gpu_id set to: 0
immode set to: vid
mono set to: False
mirror set to: False
start_batch set to: 0
start_sample set to: None
com_fromlabels set to: False
medfilt_window set to: None
new_last_kernel_size set to: [3, 3, 3]
n_layers_locked set to: 2
vmin set to: None
vmax set to: None
interp set to: nearest
depth set to: False
comthresh set to: 0
weighted set to: False
com_method set to: median
cthresh set to: None
channel_combo set to: None
predict_mode set to: torch
expval set to: None
from_weights set to: None
loss set to: mask_nan_keep_loss
n_channels_in set to: None
extension set to: None
vid_dir_flag set to: None
chunks set to: None
lockfirst set to: None
load_valid set to: None
augment_hue set to: False
augment_brightness set to: False
augment_hue_val set to: 0.05
augment_bright_val set to: 0.05
augment_rotation_val set to: 5
drop_landmark set to: None
raw_im_h set to: None
raw_im_w set to: None
metric set to: ['euclidean_distance_3D']
lr set to: 0.001
rotate set to: True
augment_continuous_rotation set to: False
com_thresh set to: None
cam3_train set to: None
debug_volume_tifdir set to: None
downfac set to: None
dannce_predict_vol_tifdir set to: None
Using the following *dannce.mat files: .\label3d_dannce.mat
Setting vid_dir_flag to True.
Setting extension to .mp4.
Setting chunks to 3000.
Setting n_channels_in to 3.
Setting raw_im_h to 1024.
Setting raw_im_w to 1280.
Setting expval to True.
Setting net to finetune_AVG.
Setting crop_height to [0, 1024].
Setting crop_width to [0, 1280].
Setting maxbatch to max.
Setting start_batch to 0.
Setting vmin to -60.0.
Setting vmax to 60.0.
n views: 3
Using the following *dannce.mat files: .\label3d_dannce.mat
Using torch predict mode
Using camnames: ['Camera1', 'Camera2', 'Camera3']
Experiment 0 using com3d: .\COM\predict_results\com3d.mat

Below these parameters, the following line was printed:
Removed 8900 samples from the dataset because they either had COM positions over cthresh, or did not have matching sampleIDs in the COM file
Do you think this has to do with the error I am getting, and what could potentially be causing it?

@spoonsso spoonsso reopened this Aug 17, 2021
@spoonsso
Owner

It looks like your dannce_predict_model is not set to one of the 3-camera network weights (the file name does not match either of the ones I linked above).

@verpeutlab

After working with a copy of DANNCE that I had pulled from this GitHub repository and had been modifying for a while, I decided to stash my changes and re-pull DANNCE from the master branch.
I was able to successfully run markerless_mouse_1 with six cameras using this fresh copy of DANNCE. However, when I set n_views to 3 in io.yaml and the other config file and substituted the weights file in dannce_predict_model with either of the two weights you linked above, I got an error when running markerless_mouse_1. My error is copied below.

Traceback (most recent call last):
File "C:\Users\verpeutlab\anaconda3\envs\dannce\Scripts\dannce-predict-script.py", line 33, in
sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-predict')())
File "c:\users\verpeutlab\desktop\danncegood\dannce\cli.py", line 53, in dannce_predict_cli
params = build_clarg_params(args, dannce_net=True, prediction=True)
File "c:\users\verpeutlab\desktop\danncegood\dannce\cli.py", line 87, in build_clarg_params
params = infer_params(params, dannce_net, prediction)
File "c:\users\verpeutlab\desktop\danncegood\dannce\engine\processing.py", line 119, in infer_params
video_files = os.listdir(camdir)
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'videos\Camera4'

Could you tell me what is causing this error and how it can be fixed?

@verpeutlab
verpeutlab commented Aug 24, 2021

I also wanted to let you know that I deleted the folders named Camera4, Camera5, and Camera6 in markerless_mouse_1, so I only have three cameras available to pass to DANNCE. I also changed camnames to ['Camera1' 'Camera2' 'Camera3'] in the config file dannce_rig_com_config.yaml.

@Spartan859

I suspect your *_dannce.mat file has not been changed to 3 cameras. If you open it in MATLAB and navigate to the camera data, you will see that there are still 6 camnames, 6 syncs, and 6 params, one entry per camera. Try deleting the last 3 of each.

@harshk95

Hi,
I have the same error. I used the 6-cam model and finetuned it. I use a five-camera setup, so during training I was prompted to duplicate one camera and did so. However, when the same is done for dannce-predict, I get the following:

2021-08-24 08:09:17.477757: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
io_config not found in io.yaml file, falling back to main config
n_views not found in io.yaml file, falling back to main config
n_channels_out not found in io.yaml file, falling back to main config
batch_size not found in io.yaml file, falling back to main config
epochs not found in io.yaml file, falling back to main config
net_type not found in io.yaml file, falling back to main config
train_mode not found in io.yaml file, falling back to main config
num_validation_per_exp not found in io.yaml file, falling back to main config
vol_size not found in io.yaml file, falling back to main config
nvox not found in io.yaml file, falling back to main config
max_num_samples not found in io.yaml file, falling back to main config
dannce_finetune_weights not found in io.yaml file, falling back to main config
mono not found in io.yaml file, falling back to main config
com_train_dir set to: .\COM\train_results
com_predict_dir set to: .\COM\predict_results
com_file set to: E:\DANNCE_test_210608\COM\predict_results\com3d.mat
dannce_train_dir set to: .\DANNCE\train_results\AVG
dannce_predict_dir set to: .\DANNCE\predict_results
exp set to: [{'label3d_file': 'E:/DANNCE_test_210608/20210610_091000_Label3D_dannce.mat', 'com_file': 'E:/DANNCE_test_210608/COM/predict_results/com3d.mat'}]
io_config set to: io.yaml
n_views set to: 6
n_channels_out set to: 22
batch_size set to: 4
epochs set to: 1200
net_type set to: AVG
train_mode set to: finetune
num_validation_per_exp set to: 4
vol_size set to: 100
nvox set to: 64
max_num_samples set to: max
dannce_finetune_weights set to: C:\Users\realtime\dannce\demo\markerless_mouse_1\DANNCE\train_results\AVG
mono set to: True
base_config set to: C:\Users\realtime\dannce\configs\dannce_mouse_config.yaml
viddir set to: videos
crop_height set to: None
crop_width set to: None
camnames set to: None
sigma set to: 10
verbose set to: 1
net set to: None
gpu_id set to: 0
immode set to: vid
mirror set to: False
start_batch set to: 0
start_sample set to: None
com_fromlabels set to: False
medfilt_window set to: None
new_last_kernel_size set to: [3, 3, 3]
new_n_channels_out set to: None
n_layers_locked set to: 2
vmin set to: None
vmax set to: None
interp set to: nearest
depth set to: False
comthresh set to: 0
weighted set to: False
com_method set to: median
cthresh set to: None
channel_combo set to: None
predict_mode set to: torch
dannce_predict_model set to: None
expval set to: None
from_weights set to: None
write_npy set to: None
loss set to: mask_nan_keep_loss
n_channels_in set to: None
extension set to: None
vid_dir_flag set to: None
num_train_per_exp set to: None
chunks set to: None
lockfirst set to: None
load_valid set to: None
augment_hue set to: False
augment_brightness set to: False
augment_hue_val set to: 0.05
augment_bright_val set to: 0.05
augment_rotation_val set to: 5
drop_landmark set to: None
raw_im_h set to: None
raw_im_w set to: None
n_instances set to: 1
use_npy set to: False
data_split_seed set to: None
valid_exp set to: None
metric set to: ['euclidean_distance_3D']
lr set to: 0.001
rotate set to: True
augment_continuous_rotation set to: False
com_thresh set to: None
cam3_train set to: None
debug_volume_tifdir set to: None
downfac set to: None
dannce_predict_vol_tifdir set to: None
n_rand_views set to: 0
rand_view_replace set to: True
multi_gpu_train set to: False
Using the following *dannce.mat files: .\20210610_091000_Label3D_dannce.mat
Setting vid_dir_flag to True.
Setting extension to .avi.
Setting chunks to {'Camera1': array([0]), 'Camera2': array([0]), 'Camera3': array([0]), 'Camera4': array([0]), 'Camera5': array([0])}.
Setting n_channels_in to 3.
Setting raw_im_h to 600.
Setting raw_im_w to 960.
Setting expval to True.
Setting net to finetune_AVG.
Setting crop_height to [0, 600].
Setting crop_width to [0, 960].
Setting maxbatch to max.
Setting start_batch to 0.
Setting vmin to -50.0.
Setting vmax to 50.0.
Using the following *dannce.mat files: .\20210610_091000_Label3D_dannce.mat
Using torch predict mode
Using camnames: ['Camera1', 'Camera2', 'Camera3', 'Camera4', 'Camera5']
The length of the camnames list must divide evenly into 6. Duplicate a subset of the views starting from the first camera (y/n)?y
Duping camnames. Changed from ['Camera1', 'Camera2', 'Camera3', 'Camera4', 'Camera5'] to ['Camera1', 'Camera2', 'Camera3', 'Camera4', 'Camera5', 'Camera1']
Experiment 0 using com3d: E:\DANNCE_test_210608\COM\predict_results\com3d.mat
Removed 0 samples from the dataset because they either had COM positions over cthresh, or did not have matching sampleIDs in the COM file
Saving 3D COM to .\DANNCE\predict_results\com3d_used.mat
None
2021-08-24 08:09:30.244549: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-24 08:09:30.265855: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x20fc559dd10 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-24 08:09:30.265962: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-08-24 08:09:30.270627: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2021-08-24 08:09:30.321214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:65:00.0 name: TITAN RTX computeCapability: 7.5
coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 625.94GiB/s
2021-08-24 08:09:30.321488: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-08-24 08:09:30.322502: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2021-08-24 08:09:30.323052: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2021-08-24 08:09:30.323567: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2021-08-24 08:09:30.324076: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2021-08-24 08:09:30.324680: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2021-08-24 08:09:30.325197: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2021-08-24 08:09:30.325810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-08-24 08:09:30.861918: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-24 08:09:30.862148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2021-08-24 08:09:30.863211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2021-08-24 08:09:30.864124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11059 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:65:00.0, compute capability: 7.5)
2021-08-24 08:09:30.868395: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x210144022c0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-08-24 08:09:30.868489: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5
Init took 11.387974739074707 sec.
Initializing Network...
Loading model from .\DANNCE\train_results\AVG\weights.1056-11.55619.hdf5
Predicting on batch 0
Loading new video: videos\Camera1\0.avi for 0_Camera1
Loading new video: videos\Camera1\0.avi for 0_Camera1
Loading new video: videos\Camera2\0.avi for 0_Camera2
Loading new video: videos\Camera3\0.avi for 0_Camera3
Loading new video: videos\Camera5\0.avi for 0_Camera5
Loading new video: videos\Camera4\0.avi for 0_Camera4
Traceback (most recent call last):
File "C:\Users\realtime\anaconda3\envs\dannce\Scripts\dannce-predict-script.py", line 33, in
sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-predict')())
File "c:\users\realtime\dannce\dannce\cli.py", line 54, in dannce_predict_cli
dannce_predict(params)
File "c:\users\realtime\dannce\dannce\interface.py", line 1577, in dannce_predict
n_chn,
File "c:\users\realtime\dannce\dannce\engine\inference.py", line 696, in infer_dannce
ims = generator.__getitem__(i)
File "c:\users\realtime\dannce\dannce\engine\generator.py", line 966, in __getitem__
X, y = self.__data_generation(list_IDs_temp)
File "c:\users\realtime\dannce\dannce\engine\generator.py", line 1258, in __data_generation
result = self.threadpool.starmap(self.project_grid, arglist)
File "C:\Users\realtime\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\realtime\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 657, in get
raise self._value
File "C:\Users\realtime\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\realtime\anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "c:\users\realtime\dannce\dannce\engine\generator.py", line 1028, in project_grid
extension=self.extension,
File "c:\users\realtime\dannce\dannce\engine\video.py", line 231, in load_vid_frame
self.currvideo[camname].close() if self.predict_flag else \
File "C:\Users\realtime\anaconda3\envs\dannce\lib\site-packages\imageio\core\format.py", line 259, in close
self._close()
File "C:\Users\realtime\anaconda3\envs\dannce\lib\site-packages\imageio\plugins\ffmpeg.py", line 343, in _close
self._read_gen.close()
ValueError: generator already executing

So I suppose one fix could be to get pretrained weights for 5 cameras, or to fix this error in the processing. Let me know how best I can proceed. When I set n_views = 5 in the config file, I of course get the dimension mismatch

ValueError: Error when checking input: expected input_3 to have shape (64, 64, 64, 6) but got array with shape (64, 64, 64, 5)

since the model was trained on pseudo 6-camera data.

Thanks and looking forward to your response!

@spoonsso
Owner

Sorry, this seems to be a common issue -- it seems to stem from when we transitioned to loading videos across multiple threads.

@harshk95 if you want to try predicting on the model you've already trained, the workaround would be to manually duplicate the Camera1 videos, put them in a new Camera6 folder inside videos, and update your label3d_dannce.mat camnames to be {'Camera1','Camera2','Camera3','Camera4','Camera5','Camera6'}.

If you want to do a new finetune, here are some pretrained 5 cam mono weights:
pre-trained AVG: https://www.dropbox.com/s/c4o7nd9wy7191la/weights_multigpu-v9.11-11.99217_singleGPU.hdf5?dl=0
pre-trained MAX: https://www.dropbox.com/s/j5hgq2241247yo3/weights_multigpu.30-0.00003.hdf5?dl=0

Note that for some recent 5-camera mono experiments we have been running on mice, the best settings are the following (a config sketch follows the list):

  • AVG finetune starting from pretrained MAX weights.
  • Using mask_nan_l1_loss
  • Setting augment_brightness: True
  • Setting n_rand_views: None
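
A minimal sketch of those settings in config form (the weights folder is a placeholder for wherever you save the pretrained MAX file):

    # sketch: recommended settings for a 5-camera mono finetune
    net_type: AVG
    train_mode: finetune
    mono: True
    loss: mask_nan_l1_loss
    augment_brightness: True
    n_rand_views: None
    dannce_finetune_weights: ./DANNCE/weights/5cam_MAX/   # placeholder folder holding the MAX .hdf5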

We are wrapping up a more complete grid search over parameters for this 5-camera mono case and will keep you posted!

@harshk95

Hi, if I simply duplicate the camera folder and camnames, I get the following:

2021-08-27 08:22:15.577775: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
io_config not found in io.yaml file, falling back to main config
n_views not found in io.yaml file, falling back to main config
n_channels_out not found in io.yaml file, falling back to main config
batch_size not found in io.yaml file, falling back to main config
epochs not found in io.yaml file, falling back to main config
net_type not found in io.yaml file, falling back to main config
train_mode not found in io.yaml file, falling back to main config
num_validation_per_exp not found in io.yaml file, falling back to main config
vol_size not found in io.yaml file, falling back to main config
nvox not found in io.yaml file, falling back to main config
max_num_samples not found in io.yaml file, falling back to main config
dannce_finetune_weights not found in io.yaml file, falling back to main config
mono not found in io.yaml file, falling back to main config
com_train_dir set to: .\COM\train_results
com_predict_dir set to: .\COM\predict_results
com_file set to: E:\DANNCE_test_210608\COM\predict_results\com3d.mat
dannce_train_dir set to: .\DANNCE\train_results\AVG
dannce_predict_dir set to: .\DANNCE\predict_results
exp set to: [{'label3d_file': 'E:/DANNCE_test_210608/6_cam_20210610_091000_Label3D_dannce.mat', 'com_file': 'E:/DANNCE_test_210608/COM/predict_results/com3d.mat'}]
io_config set to: io.yaml
n_views set to: 6
n_channels_out set to: 22
batch_size set to: 4
epochs set to: 1200
net_type set to: AVG
train_mode set to: finetune
num_validation_per_exp set to: 4
vol_size set to: 100
nvox set to: 64
max_num_samples set to: max
dannce_finetune_weights set to: C:\Users\realtime\dannce\demo\markerless_mouse_1\DANNCE\train_results\AVG
mono set to: True
base_config set to: C:\Users\realtime\dannce\configs\dannce_mouse_config.yaml
viddir set to: videos
crop_height set to: None
crop_width set to: None
camnames set to: None
sigma set to: 10
verbose set to: 1
net set to: None
gpu_id set to: 0
immode set to: vid
mirror set to: False
start_batch set to: 0
start_sample set to: None
com_fromlabels set to: False
medfilt_window set to: None
new_last_kernel_size set to: [3, 3, 3]
new_n_channels_out set to: None
n_layers_locked set to: 2
vmin set to: None
vmax set to: None
interp set to: nearest
depth set to: False
comthresh set to: 0
weighted set to: False
com_method set to: median
cthresh set to: None
channel_combo set to: None
predict_mode set to: torch
dannce_predict_model set to: None
expval set to: None
from_weights set to: None
write_npy set to: None
loss set to: mask_nan_keep_loss
n_channels_in set to: None
extension set to: None
vid_dir_flag set to: None
num_train_per_exp set to: None
chunks set to: None
lockfirst set to: None
load_valid set to: None
augment_hue set to: False
augment_brightness set to: False
augment_hue_val set to: 0.05
augment_bright_val set to: 0.05
augment_rotation_val set to: 5
drop_landmark set to: None
raw_im_h set to: None
raw_im_w set to: None
n_instances set to: 1
use_npy set to: False
data_split_seed set to: None
valid_exp set to: None
metric set to: ['euclidean_distance_3D']
lr set to: 0.001
rotate set to: True
augment_continuous_rotation set to: False
com_thresh set to: None
cam3_train set to: None
debug_volume_tifdir set to: None
downfac set to: None
dannce_predict_vol_tifdir set to: None
n_rand_views set to: 0
rand_view_replace set to: True
multi_gpu_train set to: False
Using the following *dannce.mat files: .\6_cam_20210610_091000_Label3D_dannce.mat
Setting vid_dir_flag to True.
Setting extension to .avi.
Setting chunks to {'Camera1': array([0]), 'Camera2': array([0]), 'Camera3': array([0]), 'Camera4': array([0]), 'Camera5': array([0]), 'Camera6': array([0])}.
Setting n_channels_in to 3.
Setting raw_im_h to 600.
Setting raw_im_w to 960.
Setting expval to True.
Setting net to finetune_AVG.
Setting crop_height to [0, 600].
Setting crop_width to [0, 960].
Setting maxbatch to max.
Setting start_batch to 0.
Setting vmin to -50.0.
Setting vmax to 50.0.
Using the following *dannce.mat files: .\6_cam_20210610_091000_Label3D_dannce.mat
Using torch predict mode
Using camnames: ['Camera1', 'Camera2', 'Camera3', 'Camera4', 'Camera5', 'Camera6']
Traceback (most recent call last):
File "C:\Users\realtime\anaconda3\envs\dannce\Scripts\dannce-predict-script.py", line 33, in
sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-predict')())
File "c:\users\realtime\dannce\dannce\cli.py", line 54, in dannce_predict_cli
dannce_predict(params)
File "c:\users\realtime\dannce\dannce\interface.py", line 1319, in dannce_predict
training=False,
File "c:\users\realtime\dannce\dannce\interface.py", line 1642, in do_COM_load
exp, prediction=False if training else True, nanflag=False
File "c:\users\realtime\dannce\dannce\engine\serve_data_DANNCE.py", line 48, in prepare_data
cameras = {name: params[i] for i, name in enumerate(CONFIG_PARAMS["camnames"])}
File "c:\users\realtime\dannce\dannce\engine\serve_data_DANNCE.py", line 48, in
cameras = {name: params[i] for i, name in enumerate(CONFIG_PARAMS["camnames"])}
IndexError: list index out of range

So I guess I have to duplicate all the parameters, etc., too. In any case, I will use the new weights you provided; this was just so I could get an idea.

Thanks!
