LoraQKV Model stage: inference fails with AttributeError: 'NoneType' object has no attribute 'T' · Issue #19 · fxmeng/TransMLA · GitHub
LoraQKV Model stage: inference fails with AttributeError: 'NoneType' object has no attribute 'T' #19
Open
@zimei11

Description

ds_config_zero3.json LICENSE main.py README.md scripts src train.py
(transmla) lixishi@node01:~/TransMLA$ python main.py --model-path /home/lixishi/llm_model/Llama-2-7b-hf/ --ppl-eval-batch-size 8 --dim2head 4 --qk-mqa-dim 128 --q-lora-rank 512 --kv-lora-rank 896
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████| 2/2 [00:10<00:00, 5.32s/it]
100%|██████████████████████████████████████████████████████████████████████████████| 16/16 [00:20<00:00, 1.28s/it]
++++++++++Original Model:++++++++++
100%|██████████████████████████████████████████████████████████████████████████████| 21/21 [01:16<00:00, 3.64s/it]
Original ppl: 5.4734
++++++++++RemoveRope Model:++++++++++
100%|██████████████████████████████████████████████████████████████████████████████| 16/16 [00:21<00:00, 1.37s/it]
100%|██████████████████████████████████████████████████████████████████████████████| 21/21 [02:32<00:00, 7.27s/it]
Remove RoPE ppl: 8.9670
++++++++++LoraQKV Model:++++++++++
Traceback (most recent call last):
  File "/home/lixishi/TransMLA/main.py", line 111, in <module>
    main(args)
  File "/home/lixishi/TransMLA/main.py", line 86, in main
    setattr(layer, "self_attn",LoraQKV(
                               ^^^^^^^^
  File "/home/lixishi/TransMLA/src/lora_qkv.py", line 103, in __init__
    self.init_deepseek(self_attn, R_q, R_kv)
  File "/home/lixishi/TransMLA/src/lora_qkv.py", line 108, in init_deepseek
    q_a_weight = (R_q.T@self_attn.q_proj.weight.data.to(torch.float64))[:self.q_lora_rank].to(self.dtype)
                  ^^^^^
AttributeError: 'NoneType' object has no attribute 'T'
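
The last frame of the traceback indicates that `R_q` was `None` when `init_deepseek` evaluated `R_q.T`. The log does not show why it was `None`. The snippet below is only a minimal sketch of that failure pattern and of a fail-fast check that would report it earlier. The `Matrix` class and both `project_*` functions are hypothetical stand-ins written for illustration and are not TransMLA code.

```python
class Matrix:
    """Hypothetical stand-in for a tensor: just enough to expose `.T`."""

    def __init__(self, rows):
        self.rows = rows

    @property
    def T(self):
        # Transpose a list-of-lists matrix.
        return Matrix([list(col) for col in zip(*self.rows)])


def project_unchecked(R_q):
    # Same pattern as the failing line: attribute access on an
    # argument that may be None.
    return R_q.T


def project_checked(R_q):
    # Assumed guard (not from the repository): fail early and name
    # the missing input instead of surfacing a bare AttributeError.
    if R_q is None:
        raise ValueError("R_q is None: the matrix was never computed or passed in")
    return R_q.T


if __name__ == "__main__":
    try:
        project_unchecked(None)
    except AttributeError as exc:
        print("unchecked:", exc)

    try:
        project_checked(None)
    except ValueError as exc:
        print("checked:", exc)
```

A check of this kind placed where `R_q` is produced, rather than where it is consumed, would point more directly at the step that returned `None`.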
