Only convert inputs to FP16 when FP16 stage is used by danielholanda · Pull Request #335 · groq/mlagility


Merged
merged 12 commits into main on Jul 4, 2023

Conversation

@danielholanda (Contributor) commented on Jun 23, 2023

Closes #336

Description

This PR ensures that

  • inputs are ONLY converted to FP16 if we are running the FP16 stage.
  • expected_input_dtypes are updated when the FP16 stage is used (see the sketch below).
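
A minimal sketch of this gating (the helper name and the assumption that inputs arrive as a dict of numpy arrays are illustrative, not the actual onnxflow code): the saved inputs, and the recorded expected_input_dtypes, are only downcast when a ConvertOnnxToFp16 stage is part of the sequence.

import numpy as np

# Hypothetical helper; assumes inputs is a dict of numpy arrays as saved to inputs.npy
def downcast_inputs_if_needed(inputs, sequence_stages):
    """Return (inputs, expected_input_dtypes), downcasting only for FP16 builds."""
    runs_fp16_stage = any(
        type(s).__name__ == "ConvertOnnxToFp16" for s in sequence_stages
    )
    if runs_fp16_stage:
        inputs = {
            name: value.astype(np.float16) if value.dtype == np.float32 else value
            for name, value in inputs.items()
        }
    expected_input_dtypes = {name: value.dtype for name, value in inputs.items()}
    return inputs, expected_input_dtypes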

Testing

A test was added to build_model.py. You can also test the changes using the script below:

# labels: name::linear author::selftest test_group::selftest
import torch
import numpy as np
import os
from groqflow import groqit
import onnxflow.justbuildit.export as export
import onnxflow.justbuildit.stage as stage
from onnxflow.justbuildit.buildit import build_model

torch.manual_seed(0)
cache_location = ".cache/onnxflow_test_cache"


custom_sequence_fp32 = stage.Sequence(
    "custom_sequence_fp32",
    "Building Pytorch Model without fp16 conversion",
    [
        export.ExportPytorchModel(),
        export.OptimizeOnnxModel(),
    ],
    enable_model_validation=True,
)

custom_sequence_fp16 = stage.Sequence(
    "custom_sequence_fp16",
    "Building Pytorch Model with fp16 conversion",
    [
        export.ExportPytorchModel(),
        export.OptimizeOnnxModel(),
        export.ConvertOnnxToFp16(),
    ],
    enable_model_validation=True,
)


# Define model class
class SmallModel(torch.nn.Module):
    def __init__(self, input_size, output_size):
        super(SmallModel, self).__init__()
        self.fc = torch.nn.Linear(input_size, output_size)

    def forward(self, x):
        output = self.fc(x)
        return output


# Instantiate model and generate inputs
input_size = 10
output_size = 5
pytorch_model = SmallModel(input_size, output_size)
inputs = {"x": torch.rand(input_size)}

# Build model using fp32 inputs
build_name = "custom_sequence_fp32"
omodel = build_model(
    pytorch_model,
    inputs,
    build_name=build_name,
    rebuild="always",
    monitor=False,
    cache_dir=cache_location,
    sequence=custom_sequence_fp32,
)

# The fp32 build should leave the cached inputs in their original dtype,
# since no FP16 conversion stage was run
inputs_path = os.path.join(cache_location, build_name, "inputs.npy")
assert np.load(inputs_path, allow_pickle=True)[0]["x"].dtype == np.float32

# Build model using fp16 inputs
build_name = "custom_sequence_fp16"
omodel = build_model(
    pytorch_model,
    inputs,
    build_name="custom_sequence_fp16",
    rebuild="always",
    monitor=False,
    cache_dir=cache_location,
    sequence=custom_sequence_fp16,
)

# With ConvertOnnxToFp16 in the sequence, the cached inputs should now be fp16
inputs_path = os.path.join(cache_location, build_name, "inputs.npy")
assert np.load(inputs_path, allow_pickle=True)[0]["x"].dtype == np.float16
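
Both builds reuse the same pytorch_model and inputs; the asserts then check the dtype of the inputs saved to each build's cache directory, which is where the behavior change from this PR is visible.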

@danielholanda self-assigned this on Jun 23, 2023
@danielholanda marked this pull request as ready for review on Jun 23, 2023, 22:27
@danielholanda changed the title from "Only inputs to FP16 when FP16 stage is used" to "Only convert inputs to FP16 when FP16 stage is used" on Jun 23, 2023
@danielholanda added the bug (Something isn't working) label on Jun 23, 2023
@jeremyfowers (Contributor) left a comment


This is great! It's funny that we already had the downcast arg but just never used it.

I approve, but I suggest that you ask @vgodsoe-groq to test the changes within GroqFlow on a GroqNode before merging (just to avoid potential back-and-forth with additional PRs).

@danielholanda (Contributor, Author) commented

I have a decision to make about this PR and my Magic 8 ball seems to be broken. Which solution is the most sound here?

Issue:
We are now ONLY converting the inputs to FP16 when the FP16 stage is run. This means that models that use the torch-eager/torch-compiled runtimes will now run in FP32 instead of FP16.
A consequence of this is that benchmarking a model using both ort and torch-eager (assuming the default sequences) will cause results to be overwritten due to the mismatched data types.

Potential solutions:
(1) Also convert the inputs to FP16 when the torch-eager/torch-compiled runtimes are used
(2) Only check input shapes, but not data types, before building a model. Note: *_state.yaml will only keep track of the expected_input_dtype of the last model built unless the code is meaningfully refactored.
(3) Change the hashing logic to also include input dtypes (see the sketch below)
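
A sketch of option (3), assuming a standalone hashing helper (hypothetical names, not the actual mlagility hashing code): the input dtypes are folded into the fingerprint so fp32 and fp16 builds of the same model get distinct cache entries instead of overwriting each other.

import hashlib

# Hypothetical: combine model identity, input shapes, AND input dtypes into the hash
def build_hash(model_name, input_shapes, input_dtypes):
    fingerprint = "|".join(
        [model_name]
        + [f"{name}:{shape}" for name, shape in sorted(input_shapes.items())]
        + [f"{name}:{dtype}" for name, dtype in sorted(input_dtypes.items())]
    )
    return hashlib.sha256(fingerprint.encode()).hexdigest()[:16]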

@danielholanda merged commit d7b9eb0 into main on Jul 4, 2023