Export of Llama2 fails

I'm unable to export the meta-llama/Llama-2-7b-chat-hf model with exporters.

Here is my command

python -m exporters.coreml --model=meta-llama/Llama-2-7b-chat-hf models/llama2.mlpackage

And here is the output

 % python -m exporters.coreml --model=meta-llama/Llama-2-7b-chat-hf models/llama2.mlpackage
Torch version 2.3.0 has not been tested with coremltools. You may run into unexpected errors. Torch 2.2.0 is the most recent version that has been tested.
/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:30<00:00, 15.44s/it]
Using framework PyTorch: 2.3.0
Overriding 1 configuration item(s)
	- use_cache -> False
/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/transformers/modeling_utils.py:4371: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
  warnings.warn(
The cos_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class
The sin_cached attribute will be removed in 4.39. Bear in mind that its contents changed in v4.38. Use the forward method of RoPE from now on instead. It is not used in the `LlamaAttention` class
/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1094: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if sequence_length != 1:
Skipping token_type_ids input
Converting PyTorch Frontend ==> MIL Ops:   0%|                                                                                                                                             | 0/3690 [00:00<?, ? ops/s]Saving value type of int64 into a builtin type of int32, might lose precision!


ERROR - converting 'full' op (located at: 'model'):

Converting PyTorch Frontend ==> MIL Ops:   1%|▉                                                                                                                                 | 28/3690 [00:00<00:00, 5249.21 ops/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/user/LLAMA2/exporters/src/exporters/coreml/__main__.py", line 178, in <module>
    main()
  File "/Users/user/LLAMA2/exporters/src/exporters/coreml/__main__.py", line 166, in main
    convert_model(
  File "/Users/user/LLAMA2/exporters/src/exporters/coreml/__main__.py", line 45, in convert_model
    mlmodel = export(
              ^^^^^^^
  File "/Users/user/LLAMA2/exporters/src/exporters/coreml/convert.py", line 660, in export
    return export_pytorch(preprocessor, model, config, quantize, compute_units)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/LLAMA2/exporters/src/exporters/coreml/convert.py", line 553, in export_pytorch
    mlmodel = ct.convert(
              ^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/_converters_entry.py", line 581, in convert
    mlmodel = mil_convert(
              ^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 188, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 212, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 288, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 108, in __call__
    return load(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 82, in load
    return _perform_torch_convert(converter, debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 116, in _perform_torch_convert
    prog = converter.convert()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 581, in convert
    convert_nodes(self.context, self.graph)
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 86, in convert_nodes
    raise e     # re-raise exception
    ^^^^^^^
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 81, in convert_nodes
    convert_single_node(context, node)
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 134, in convert_single_node
    add_op(context, node)
  File "/Users/user/anaconda3/envs/hf-exporters/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 4211, in full
    else NUM_TO_NUMPY_DTYPE[TORCH_DTYPE_TO_NUM[inputs[2].val]]
                            ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 6

I was able to generate an mlpackage for distilbert-base-uncased-finetuned-sst-2-english with this command:

python -m exporters.coreml --model=distilbert-base-uncased-finetuned-sst-2-english --feature=sequence-classification models/defaults.mlpackage

So I have some confidence that the environment is correct and working.
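Since the log warns that Torch 2.3.0 has not been tested with coremltools, the exact package versions in the environment may be relevant. Below is a minimal sketch for collecting them with the standard library's importlib.metadata. The package list and the helper name installed_versions are my own choices for illustration, and the distribution names are assumed to match what pip installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; adjust if the environment uses different ones.
PACKAGES = ("torch", "coremltools", "transformers", "huggingface_hub")


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the active environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this inside the hf-exporters environment prints one line per package, which can be pasted alongside the traceback.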
