Content-length header being removed, force-switching the request to chunked transfer encoding · Issue #721 · envoyproxy/ai-gateway · GitHub
Content-length header being removed, force-switching the request to chunked transfer encoding #721
Open
@sukumargaonkar

Description:

What issue is being seen? Describe what should be happening instead of
the bug, for example: Envoy should not crash, the expected value isn't
returned, etc.

ProcessRequestHeaders() in the UpstreamFilter extproc explicitly sets the response status to CONTINUE_AND_REPLACE. As a result, Envoy drops the content-length header, which switches the request to chunked transfer encoding (refer to the Envoy code).
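To make the switch concrete, here is a minimal sketch of how the same body is framed on the wire in each mode under HTTP/1.1. The function names are illustrative and are not taken from Envoy or envoy-ai-gateway; only the framing-relevant header is shown, and the request line and other headers are omitted.

```python
def frame_with_content_length(body: bytes) -> bytes:
    """Frame a body when the sender knows its length up front."""
    head = b"content-length: " + str(len(body)).encode("ascii") + b"\r\n\r\n"
    return head + body


def frame_chunked(body: bytes, chunk_size: int = 16) -> bytes:
    """Frame the same body using HTTP/1.1 chunked transfer coding."""
    out = [b"transfer-encoding: chunked\r\n\r\n"]
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk is "<size in hex>\r\n<data>\r\n".
        size_line = format(len(chunk), "x").encode("ascii") + b"\r\n"
        out.append(size_line + chunk + b"\r\n")
    # A zero-length chunk followed by an empty line terminates the body.
    out.append(b"0\r\n\r\n")
    return b"".join(out)
```

A receiver that only understands content-length framing has no length to read in the second form, which is consistent with the upstream rejecting the request once the header is dropped.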

Anthropic models hosted on GCP do not support chunked transfer encoding and return the error below:

{
  "error": {
    "code": 400,
    "message": "Prediction on deployed model (endpoint_id: <redacted-endpoint-id>, deployed_model_id: <redacted-model-id>) failed with error: \"Bad Request\".",
    "status": "INVALID_ARGUMENT"
  }
}

Expected Behavior:
The content-length header should not be unilaterally removed by Envoy and envoy-ai-gateway.

Repro steps:

Include sample requests, environment, etc. All data and inputs
required to reproduce the bug.

The gcp-anthropic model returns INVALID_ARGUMENT when the transfer-encoding: chunked header is set
Note: the transfer-encoding: chunked header is set

curl --request POST \
  --url https://us-east5-aiplatform.googleapis.com/v1/projects/<PROJECT-NAME>/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku@20241022:rawPredict \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --header 'transfer-encoding: chunked' \
  --data '{"anthropic_version": "vertex-2023-10-16","messages": [{"role": "user","content": [{"type": "text","text": "What are you doing?"}]}],"max_tokens": 256,"stream": false}'

Output

{
  "error": {
    "code": 400,
    "message": "Prediction on deployed model (endpoint_id: <redacted-endpoint-id>, deployed_model_id: <redacted-model-id>) failed with error: \"Bad Request\".",
    "status": "INVALID_ARGUMENT"
  }
}

Valid response when the content-length header is set (curl adds it automatically for --data)
Note: the transfer-encoding header is NOT set

curl --request POST \
  --url https://us-east5-aiplatform.googleapis.com/v1/projects/<PROJECT-NAME>/locations/us-east5/publishers/anthropic/models/claude-3-5-haiku@20241022:rawPredict \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --header 'user-agent: vscode-restclient' \
  --data '{"anthropic_version": "vertex-2023-10-16","messages": [{"role": "user","content": [{"type": "text","text": "What are you doing?"}]}],"max_tokens": 256,"stream": false}'

Output

{
  "id": "<redacted-response-id>",
  "type": "message",
  "role": "assistant",
  "model": "claude-3-5-haiku-20241022",
  "content": [
    {
      "type": "text",
      "text": "I want to be direct with you. I aim to help you by listening and responding to whatever task or conversation you would like to have. How can I assist you today?"
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "output_tokens": 38
  }
}
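The two requests above can also be compared without GCP credentials. The following is a rough local sketch, not the real endpoint: `StrictLengthHandler` is a hypothetical stub that rejects chunked request bodies with a 400, imitating the behavior reported above, and `start_stub` and `post` are illustrative helper names. It relies on `http.client` sending content-length for a bytes body and falling back to chunked encoding for an iterator body.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StrictLengthHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a backend that rejects chunked bodies."""

    def _drain_chunked(self):
        # Consume the chunked body so the connection closes cleanly.
        while True:
            size = int(self.rfile.readline().strip().split(b";")[0], 16)
            if size == 0:
                while self.rfile.readline().strip():
                    pass  # skip optional trailers up to the blank line
                return
            self.rfile.read(size + 2)  # chunk data plus trailing CRLF

    def do_POST(self):
        if "chunked" in self.headers.get("Transfer-Encoding", "").lower():
            self._drain_chunked()
            status = 400
            payload = {"error": {"code": 400, "status": "INVALID_ARGUMENT"}}
        else:
            self.rfile.read(int(self.headers.get("Content-Length", "0")))
            status, payload = 200, {"type": "message"}
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_stub() -> HTTPServer:
    """Start the stub on an ephemeral local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), StrictLengthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post(port: int, payload: bytes, chunked: bool) -> int:
    """POST the payload and return the HTTP status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        # A bytes body gets a content-length header; an iterator has no
        # known length, so http.client uses transfer-encoding: chunked.
        body = iter([payload]) if chunked else payload
        conn.request("POST", "/", body=body,
                     headers={"Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `post(port, payload, chunked=False)` and then `post(port, payload, chunked=True)` against the stub, with `port` taken from `start_stub().server_address[1]`, exercises both framings. Pointing the same client at a gateway listener instead would show which framing actually reaches the upstream.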

Note: If there are privacy concerns, sanitize the data prior to
sharing.

Environment:

Include the environment like gateway version, envoy version and so on.

Logs:

Include the access logs and the Envoy logs.

Thanks @cmaddalozzo for helping investigate

Labels: bug (Something isn't working)