TypeError: zip argument #1 must support iteration · Issue #87 · lzccccc/SMOKE · GitHub
TypeError: zip argument #1 must support iteration #87
Open
@LordonCN

Hello,
When I run python tools/plain_train_net.py --config-file configs/train_val_bs16_normal_conv.yaml, training works fine, but when I try to train on multiple GPUs with the command below, the following error occurs:

python tools/plain_train_net.py --config-file configs/train_val_bs16_normal_conv.yaml --num-gpus 2 --num-machines 1

`
-02 20:15:24,729] smoke.data.datasets.kitti INFO: Initializing KITTI train set with 3712 files loaded
[2023-03-02 20:15:24,775] smoke.trainer INFO: Start training
Traceback (most recent call last):
File "tools/plain_train_net.py", line 107, in <module>
args=(args,),
File "/home/wangguojun//test/SMOKE/smoke/engine/launch.py", line 53, in launch
daemon=False,
File "/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/wangguojun//test/SMOKE/smoke/engine/launch.py", line 88, in _distributed_worker
main_func(*args)
File "/home/wangguojun//test/SMOKE/tools/plain_train_net.py", line 95, in main
train(cfg, model, device, distributed)
File "/home/wangguojun//test/SMOKE/tools/plain_train_net.py", line 57, in train
tb_log
File "/home/wangguojun//test/SMOKE/smoke/engine/trainer.py", line 73, in do_train
for data, iteration in zip(data_loader, range(start_iter, max_iter)):

TypeError: zip argument #1 must support iteration

(smoke) wangguojun@pc:~//test/SMOKE$ Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/multiprocessing/spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/multiprocessing/spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
_pickle.UnpicklingError: pickle data was truncated
/home/wangguojun/miniconda3/envs/smoke/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 14 leaked semaphores to clean up at shutdown
len(cache))
`
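
The last frame of the worker traceback shows zip() rejecting its first argument, so whatever object reached do_train as data_loader was not iterable in the spawned process. The following is only a minimal diagnostic sketch, not code from the SMOKE repository: is_iterable and iterate_with_check are hypothetical helper names, and the loop header is copied from the traceback. It assumes nothing about why the loader is not iterable; it only reports what the object actually is before the loop starts.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds, which is what zip() requires."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


def iterate_with_check(data_loader, start_iter, max_iter):
    """Same loop header as in the traceback, with an explicit check first.

    Hypothetical helper: raises a TypeError that names the offending
    object instead of the generic zip() message.
    """
    if not is_iterable(data_loader):
        raise TypeError(
            "data_loader is not iterable: {!r} (type {})".format(
                data_loader, type(data_loader).__name__
            )
        )
    for data, iteration in zip(data_loader, range(start_iter, max_iter)):
        yield data, iteration


if __name__ == "__main__":
    # A list stands in for a real data loader here.
    for batch, step in iterate_with_check(["batch0", "batch1", "batch2"], 0, 2):
        print(step, batch)
```

Printing type(data_loader) in each worker just before the loop gives the same information with less code; the helper form is only convenient if the check should stay in place.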
