Only convert inputs to FP16 when FP16 stage is used by danielholanda · Pull Request #335 · groq/mlagility


Merged
merged 12 commits into main on Jul 4, 2023

Conversation

@danielholanda (Contributor) commented on Jun 23, 2023

Closes #336

Description

This PR ensures that

  • inputs are ONLY converted to FP16 if we are running the FP16 stage.
  • expected_input_dtypes are updated when the FP16 stage is used (see the sketch below).
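
A minimal sketch of this gating (the helper name and the assumption that inputs arrive as a dict of numpy arrays are illustrative, not the actual onnxflow code): the saved inputs, and the recorded expected_input_dtypes, are only downcast when a ConvertOnnxToFp16 stage is part of the sequence.

import numpy as np

# Hypothetical helper; assumes inputs is a dict of numpy arrays as saved to inputs.npy
def downcast_inputs_if_needed(inputs, sequence_stages):
    """Return (inputs, expected_input_dtypes), downcasting only for FP16 builds."""
    runs_fp16_stage = any(
        type(s).__name__ == "ConvertOnnxToFp16" for s in sequence_stages
    )
    if runs_fp16_stage:
        inputs = {
            name: value.astype(np.float16) if value.dtype == np.float32 else value
            for name, value in inputs.items()
        }
    expected_input_dtypes = {name: value.dtype for name, value in inputs.items()}
    return inputs, expected_input_dtypes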

Testing

A test was added to build_model.py. You can also test the changes using the script below:

# labels: name::linear author::selftest test_group::selftest
import torch
import numpy as np
import os
from groqflow import groqit
import onnxflow.justbuildit.export as export
import onnxflow.justbuildit.stage as stage
from onnxflow.justbuildit.buildit import build_model

torch.manual_seed(0)
cache_location = ".cache/onnxflow_test_cache"


custom_sequence_fp32 = stage.Sequence(
    "custom_sequence_fp32",
    "Building Pytorch Model without fp16 conversion",
    [
        export.ExportPytorchModel(),
        export.OptimizeOnnxModel(),
    ],
    enable_model_validation=True,
)

custom_sequence_fp16 = stage.Sequence(
    "custom_sequence_fp16",
    "Building Pytorch Model with fp16 conversion",
    [
        export.ExportPytorchModel(),
        export.OptimizeOnnxModel(),
        export.ConvertOnnxToFp16(),
    ],
    enable_model_validation=True,
)


# Define model class
class SmallModel(torch.nn.Module):
    def __init__(self, input_size, output_size):
        super(SmallModel, self).__init__()
        self.fc = torch.nn.Linear(input_size, output_size)

    def forward(self, x):
        output = self.fc(x)
        return output


# Instantiate model and generate inputs
input_size = 10
output_size = 5
pytorch_model = SmallModel(input_size, output_size)
inputs = {"x": torch.rand(input_size)}

# Build model using fp32 inputs
build_name = "custom_sequence_fp32"
omodel = build_model(
    pytorch_model,
    inputs,
    build_name=build_name,
    rebuild="always",
    monitor=False,
    cache_dir=cache_location,
    sequence=custom_sequence_fp32,
)

# The fp32 build should leave the cached inputs in their original dtype,
# since no FP16 conversion stage was run
inputs_path = os.path.join(cache_location, build_name, "inputs.npy")
assert np.load(inputs_path, allow_pickle=True)[0]["x"].dtype == np.float32

# Build model using fp16 inputs
build_name = "custom_sequence_fp16"
omodel = build_model(
    pytorch_model,
    inputs,
    build_name="custom_sequence_fp16",
    rebuild="always",
    monitor=False,
    cache_dir=cache_location,
    sequence=custom_sequence_fp16,
)

# With ConvertOnnxToFp16 in the sequence, the cached inputs should now be fp16
inputs_path = os.path.join(cache_location, build_name, "inputs.npy")
assert np.load(inputs_path, allow_pickle=True)[0]["x"].dtype == np.float16
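
Both builds reuse the same pytorch_model and inputs; the asserts then check the dtype of the inputs saved to each build's cache directory, which is where the behavior change from this PR is visible.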

@danielholanda self-assigned this on Jun 23, 2023
@danielholanda marked this pull request as ready for review on Jun 23, 2023, 22:27
@danielholanda changed the title from "Only inputs to FP16 when FP16 stage is used" to "Only convert inputs to FP16 when FP16 stage is used" on Jun 23, 2023
@danielholanda added the bug (Something isn't working) label on Jun 23, 2023
@jeremyfowers (Contributor) left a comment


This is great! It's funny that we already had the downcast arg but just never used it.

I approve, but I suggest that you ask @vgodsoe-groq to test the changes within GroqFlow on a GroqNode before merging (just to avoid potential back-and-forth with additional PRs).

@danielholanda (Contributor, Author) commented

I have a decision to make about this PR and my Magic 8 ball seems to be broken. Which solution is the most sound here?

Issue:
We are now ONLY converting the inputs to FP16 when the FP16 stage is run. This means that models that use the torch-eager/torch-compiled runtimes will now run in FP32 instead of FP16.
A consequence of this is that benchmarking a model using both ort and torch-eager (assuming the default sequences) will cause results to be overwritten due to the mismatched data types.

Potential solutions:
(1) Also convert the inputs to FP16 when the torch-eager/torch-compiled runtimes are used
(2) Only check input shapes, but not data types, before building a model. Note: *_state.yaml will only keep track of the expected_input_dtype of the last model built unless the code is meaningfully refactored.
(3) Change the hashing logic to also include input dtypes (see the sketch below)
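
A sketch of option (3), assuming a standalone hashing helper (hypothetical names, not the actual mlagility hashing code): the input dtypes are folded into the fingerprint so fp32 and fp16 builds of the same model get distinct cache entries instead of overwriting each other.

import hashlib

# Hypothetical: combine model identity, input shapes, AND input dtypes into the hash
def build_hash(model_name, input_shapes, input_dtypes):
    fingerprint = "|".join(
        [model_name]
        + [f"{name}:{shape}" for name, shape in sorted(input_shapes.items())]
        + [f"{name}:{dtype}" for name, dtype in sorted(input_dtypes.items())]
    )
    return hashlib.sha256(fingerprint.encode()).hexdigest()[:16]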

@danielholanda merged commit d7b9eb0 into main on Jul 4, 2023