Description
Based on https://github.com/NVIDIA-AI-Blueprints/llm-router, where the request flow is described here: https://github.com/smarunich/llm-router/blob/main/llm-router-request-flow.md
How can Step 2 be achieved? (https://github.com/smarunich/llm-router/blob/main/llm-router-request-flow.md#2-router-controller-to-router-server-triton)
I.e., should a transformation plugin be used with Envoy AI Gateway, or what is the recommended framework for creating a custom transformation plugin for the following?
- Client sends OpenAI-compatible request to the gateway
- The gateway needs to transform this request into a format accepted by Triton Inference Server
- Specifically, we need to extract the last user message from the request and format it as:
```json
{
  "inputs": [
    {
      "name": "INPUT",
      "datatype": "BYTES",
      "shape": [1, 1],
      "data": [["User message content"]]
    }
  ]
}
```
Or should the EPP (https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/README.md) or the InferencePool chart (https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/config/charts/inferencepool) be used together with Envoy AI Gateway to accomplish this?
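For reference, and independent of which mechanism ends up being the right one, the transformation itself is roughly the following. This is only a minimal plain-Python sketch of the mapping from an OpenAI chat-completions body to the Triton payload shown above; the function name and error handling are mine, not part of the blueprint or any gateway API:

```python
import json

def openai_to_triton(openai_body: bytes) -> bytes:
    """Extract the last "user" message from an OpenAI chat-completions
    request body and wrap it in the Triton payload shown above."""
    request = json.loads(openai_body)

    # Walk the messages list in reverse to find the most recent user turn.
    last_user = next(
        (m["content"] for m in reversed(request.get("messages", []))
         if m.get("role") == "user"),
        None,
    )
    if last_user is None:
        raise ValueError("request contains no user message")

    # Shape [1, 1] with nested data matches the target format above.
    triton_payload = {
        "inputs": [
            {
                "name": "INPUT",
                "datatype": "BYTES",
                "shape": [1, 1],
                "data": [[last_user]],
            }
        ]
    }
    return json.dumps(triton_payload).encode("utf-8")
```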
I am trying to understand the functionality of Envoy AI Gateway and how it can be consumed; if this capability already exists, more concrete examples would help.
The https://github.com/NVIDIA-AI-Blueprints/llm-router project uses a single Triton server, so a test example will probably be simpler than a production setup.
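For a single-server test, something like the following is what I would expect the end-to-end path to look like, bypassing the gateway entirely. It assumes Triton's standard KServe-v2 HTTP inference endpoint on the default port; the host and model name are placeholders, not values from the blueprint, and it reuses the `openai_to_triton` sketch above:

```python
import json
import urllib.request

# Placeholder values for a local test -- the real host/port and model
# name depend on the llm-router deployment.
TRITON_URL = "http://localhost:8000"
MODEL_NAME = "router-model"

# An example OpenAI-compatible request as the client would send it.
openai_request = json.dumps({
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "User message content"},
    ],
}).encode("utf-8")

# Reuse the transformation sketch from above.
triton_body = openai_to_triton(openai_request)

# POST to Triton's standard KServe-v2 HTTP inference endpoint.
req = urllib.request.Request(
    f"{TRITON_URL}/v2/models/{MODEL_NAME}/infer",
    data=triton_body,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```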