model_deploy CVAddressAssign.cpp Assertion op_infos.find(cur_info) != op_infos.end() failed · Issue #205 · sophgo/tpu-mlir · GitHub
model_deploy CVAddressAssign.cpp Assertion op_infos.find(cur_info) != op_infos.end() failed #205
Open
@JChunX

Description

Hi, when I try to convert a model to cvimodel, the address-assignment step fails. Can someone please take a look?
I have narrowed it down to my model's preprocessing: if there is a multiplication before a sin or cos op, conversion fails.

(minimal reproduction)

import torch
import torch.nn as nn

class CosNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.cycle_time = 0.5

    def forward(self, t):
        # Scale the input (multiply, then divide) before taking the cosine.
        cos_t = torch.cos(2 * torch.pi * t / self.cycle_time)
        return cos_t

model = CosNet()
dummy_input = torch.randn(1)

# Export with constant folding disabled so the scaling ops stay in the graph.
torch.onnx.export(model,
                  dummy_input,
                  "output.onnx",
                  export_params=True,
                  opset_version=11,
                  do_constant_folding=False,
                  input_names=['t'],
                  output_names=['cos'])
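
For checking outputs once conversion works, the expression in the reproduction can be written as a plain-Python reference. This is only a sketch: `cos_reference` and `cos_folded` are hypothetical helper names, not part of tpu-mlir. `cos_folded` just shows that the constant factors can be merged into one multiplication. Whether that form avoids the assertion has not been tested here.

```python
import math

CYCLE_TIME = 0.5  # same value as CosNet.cycle_time in the reproduction


def cos_reference(t, cycle_time=CYCLE_TIME):
    """Same arithmetic as CosNet.forward: multiply, divide, then cos."""
    return math.cos(2 * math.pi * t / cycle_time)


def cos_folded(t, cycle_time=CYCLE_TIME):
    """Equivalent form with the constants merged into a single scale factor."""
    scale = 2 * math.pi / cycle_time  # hypothetical pre-computed constant
    return math.cos(scale * t)


if __name__ == "__main__":
    # Print both forms side by side for a few sample inputs.
    for t in (0.0, 0.1, 0.25, 0.4):
        a, b = cos_reference(t), cos_folded(t)
        print(f"t={t}: reference={a:.6f} folded={b:.6f}")
```

The two functions should agree to within floating-point rounding, so either can serve as a baseline when comparing against the converted model's output.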

===========================================

root@6252375c5a67:/workspace/artifacts/zbot2_stand# model_transform.py --model_name zbot2_standing_preproc --model_def kinfer_policy_zbot_standing_isaacgym_ep_3001_preproc.onnx --mlir zbot2_standing_preproc.mlir
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
2025/01/06 09:49:01 - INFO : 
         _____________________________________________________ 
        | preprocess:                                           |
        |   (x - mean) * scale                                  |
        '-------------------------------------------------------'
  config Preprocess args : 
        resize_dims           : same to net input dims
        keep_aspect_ratio     : False
        keep_ratio_mode       : letterbox
        pad_value             : 0
        pad_type              : center
        --------------------------
        mean                  : [0.0, 0.0, 0.0]
        scale                 : [1.0, 1.0, 1.0]
        --------------------------
        pixel_format          : bgr
        channel_format        : nchw

Input_shape assigned
ConstantFolding finished
skip_fuse_bn: False
Onnxsim opt finished
ConstantFolding finished
Save mlir file: zbot2_standing_preproc_origin.mlir
[Running]: tpuc-opt zbot2_standing_preproc_origin.mlir --shape-infer --canonicalize --extra-optimize -o zbot2_standing_preproc.mlir 
[Success]: tpuc-opt zbot2_standing_preproc_origin.mlir --shape-infer --canonicalize --extra-optimize -o zbot2_standing_preproc.mlir 
Mlir file generated:zbot2_standing_preproc.mlir


root@6252375c5a67:/workspace/artifacts/zbot2_stand# model_deploy.py  --mlir zbot2_standing_preproc.mlir  --quantize BF16  --chip cv181x --model zbot2_standing_preproc.cvimodel
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
[Running]: tpuc-opt zbot2_standing_preproc.mlir --chip-assign="chip=cv181x" --chip-top-optimize --convert-top-to-tpu="mode=BF16  asymmetric=False linear_quant_mode=NORMAL doWinograd=False ignore_f16_overflow=False" --canonicalize -o zbot2_standing_preproc_cv181x_bf16_tpu.mlir 
[Success]: tpuc-opt zbot2_standing_preproc.mlir --chip-assign="chip=cv181x" --chip-top-optimize --convert-top-to-tpu="mode=BF16  asymmetric=False linear_quant_mode=NORMAL doWinograd=False ignore_f16_overflow=False" --canonicalize -o zbot2_standing_preproc_cv181x_bf16_tpu.mlir 
[Running]: tpuc-opt zbot2_standing_preproc_cv181x_bf16_tpu.mlir --mlir-disable-threading --strip-io-quant="quant_input=False quant_output=False" --chip-tpu-optimize --distribute='num_device=1' --weight-reorder  --subnet-divide="dynamic=False" --op-reorder --layer-group="opt=2" --parallel='num_core=1' --address-assign -o zbot2_standing_preproc_cv181x_bf16_final.mlir 
==---------------------------==
Run LayerGroupSearchPass : 
    Searching the optimal layer groups
==---------------------------==

=======================================================
***** Dynamic Programming layer group with cluster ****
=======================================================
total num of base_group is 4
clusters idx(size): 0(1), 
process base group 0, layer_num=1, cluster_num=1
clusters idx(size): 0(1), 1(1), 
process base group 1, layer_num=2, cluster_num=2
Searching best group slices...
[                                                  ] 0%
clusters idx(size): 0(1), 
process base group 2, layer_num=1, cluster_num=1
clusters idx(size): 0(1), 
process base group 3, layer_num=1, cluster_num=1
-------------------------------------------------------
Consider redundant computation and gdma cost
-------------------------------------------------------
-------------------------------------------------------
Merge cut idx to reduce gdma cost
-------------------------------------------------------
==---------------------------==
Run GroupPostTransformPass : 
    Some transform after layer groups is determined
==---------------------------==
==---------------------------==
Run TimeStepAssignmentPass : 
    Assign timestep task for each group.
==---------------------------==
==---------------------------==
Run LocalMemoryAllocationPass : 
    Allocate local memory for all layer groups
==---------------------------==
==---------------------------==
Run TimeStepCombinePass : 
    Combine time step for better parallel balance
==---------------------------==
==---------------------------==
Run GroupDataMoveOverlapPass : 
    Overlap data move between two layer group
==---------------------------==
tpuc-opt: /home/jenkins/workspace/tpu-mlir/lib/Dialect/Tpu/Transforms/AddressAssign/CVAddressAssign.cpp:377: void tpu_mlir::tpu::CVAddressAssign::updateLiveRange(mlir::Operation *, std::map<Operation *, uint32_t> &, std::map<ValueInfo, OpElement> &, std::vector<ValueInfo> &, std::vector<mlir::Value> &, int64_t): Assertion `op_infos.find(cur_info) != op_infos.end()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: tpuc-opt zbot2_standing_preproc_cv181x_bf16_tpu.mlir --init --mlir-disable-threading "--strip-io-quant=quant_input=False quant_output=False" --chip-tpu-optimize --distribute=num_device=1 --weight-reorder --subnet-divide=dynamic=False --op-reorder --layer-group=opt=2 --parallel=num_core=1 --address-assign --deinit --mlir-print-debuginfo -o zbot2_standing_preproc_cv181x_bf16_final.mlir
 #0 0x0000555555b39c37 (/workspace/tpu-mlir/bin/tpuc-opt+0x5e5c37)
 #1 0x0000555555b3795e (/workspace/tpu-mlir/bin/tpuc-opt+0x5e395e)
 #2 0x0000555555b3a5ba (/workspace/tpu-mlir/bin/tpuc-opt+0x5e65ba)
 #3 0x00007ffffbbbd520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007ffffbc11a7c pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x96a7c)
 #5 0x00007ffffbbbd476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #6 0x00007ffffbba37f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #7 0x00007ffffbba371b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
 #8 0x00007ffffbbb4e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
 #9 0x00005555561b56ed (/workspace/tpu-mlir/bin/tpuc-opt+0xc616ed)
#10 0x00005555561b4351 (/workspace/tpu-mlir/bin/tpuc-opt+0xc60351)
#11 0x00005555561a184d (/workspace/tpu-mlir/bin/tpuc-opt+0xc4d84d)
#12 0x00005555568561b4 (/workspace/tpu-mlir/bin/tpuc-opt+0x13021b4)
#13 0x00005555568567e1 (/workspace/tpu-mlir/bin/tpuc-opt+0x13027e1)
#14 0x0000555556858c88 (/workspace/tpu-mlir/bin/tpuc-opt+0x1304c88)
#15 0x0000555555b2b47b (/workspace/tpu-mlir/bin/tpuc-opt+0x5d747b)
#16 0x0000555555b2a844 (/workspace/tpu-mlir/bin/tpuc-opt+0x5d6844)
#17 0x0000555556a5ca08 (/workspace/tpu-mlir/bin/tpuc-opt+0x1508a08)
#18 0x0000555555b24b4a (/workspace/tpu-mlir/bin/tpuc-opt+0x5d0b4a)
#19 0x0000555555b25014 (/workspace/tpu-mlir/bin/tpuc-opt+0x5d1014)
#20 0x0000555555b23d99 (/workspace/tpu-mlir/bin/tpuc-opt+0x5cfd99)
#21 0x00007ffffbba4d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#22 0x00007ffffbba4e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#23 0x0000555555b239f5 (/workspace/tpu-mlir/bin/tpuc-opt+0x5cf9f5)
Aborted
Traceback (most recent call last):
  File "/workspace/tpu-mlir/python/tools/model_deploy.py", line 320, in <module>
    tool.build_model()
  File "/workspace/tpu-mlir/python/tools/model_deploy.py", line 219, in build_model
    mlir_to_model(self.tpu_mlir, self.model, self.final_mlir, self.dynamic,
  File "/workspace/tpu-mlir/python/utils/mlir_shell.py", line 153, in mlir_to_model
    _os_system(cmd)
  File "/workspace/tpu-mlir/python/utils/mlir_shell.py", line 50, in _os_system
    raise RuntimeError("[!Error]: {}".format(cmd_str))
RuntimeError: [!Error]: tpuc-opt zbot2_standing_preproc_cv181x_bf16_tpu.mlir --mlir-disable-threading --strip-io-quant="quant_input=False quant_output=False" --chip-tpu-optimize --distribute='num_device=1' --weight-reorder  --subnet-divide="dynamic=False" --op-reorder --layer-group="opt=2" --parallel='num_core=1' --address-assign -o zbot2_standing_preproc_cv181x_bf16_final.mlir
