8000 enhancement(http_client source): allow VRL in query parameters by benjamin-awd · Pull Request #22706 · vectordotdev/vector · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

enhancement(http_client source): allow VRL in query parameters #22706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

benjamin-awd
Copy link
Contributor
@benjamin-awd benjamin-awd commented Mar 21, 2025

Summary

This PR adds support for VRL in query parameters within the http client source.

This is useful for making GET requests with dynamic parameters e.g. a start_time parameter which can help to filter extraneous data (e.g. only get "active" events based on the current timestamp, or events from the last hour)

e.g.

sources:
  http:
    type: http_client
    endpoint: https://endpoint.com
    method: GET
    query:
      timestamp: "{{ now() }}"

Closes #14621

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Tests have been added, but otherwise we've been running this in a Kubernetes environment with Helm.

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)

References

@benjamin-awd benjamin-awd requested a review from a team as a code owner March 21, 2025 16:32
@github-actions github-actions bot added the domain: sources Anything related to the Vector's sources label Mar 21, 2025
@benjamin-awd benjamin-awd marked this pull request as draft March 27, 2025 02:09
@benjamin-awd benjamin-awd force-pushed the http-client-source-vrl branch from cb79433 to 834d166 Compare March 27, 2025 14:56
@benjamin-awd benjamin-awd marked this pull request as ready for review March 27, 2025 14:58
@benjamin-awd benjamin-awd force-pushed the http-client-source-vrl branch 2 times, most recently from 5b913e2 to ff535bb Compare March 27, 2025 15:50
@benjamin-awd benjamin-awd force-pushed the http-client-source-vrl branch from ff535bb to 4407fff Compare March 27, 2025 15:52
@benjamin-awd benjamin-awd requested a review from a team as a code owner April 16, 2025 15:32
@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Apr 16, 2025
@pront pront requested a review from Copilot April 16, 2025 15:51
Copy link
Contributor
@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • website/cue/reference/components/sources/base/http_client.cue: Language not supported
Comments suppressed due to low confidence (3)

src/sources/http_client/client.rs:398

  • [nitpick] Consider renaming 'process_query_value' to 'evaluate_query_parameter' to more clearly indicate that the function evaluates dynamic VRL expressions.
fn process_query_value(value: &QueryParameterValue) -> QueryParameterValue {

src/sources/http_client/client.rs:241

  • If VRL expression compilation fails, the warning is logged and the parameter is silently returned unevaluated. Consider whether it is preferable to fallback to the original input or signal a compilation failure to prevent unexpected behavior.
if let Ok(value) = Runtime::default().resolve(&mut target, &program, &timezone) {

src/sources/http_client/tests.rs:231

  • Consider adding a test case for scenarios where VRL expression compilation fails to ensure the system gracefully handles such errors.
/// VRL query parameters should be parsed correctly

prelude::TypeState,
};

static RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\{\{(?P<key>.+)\}\}").unwrap());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, this is used to look for strings like {{ now() }} and capture now(). Do I understand this correctly?

This can only change if the config changes. We should be able to compile once during build and evaluate multiple times.

Copy link
Contributor Author
@benjamin-awd benjamin-awd Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's correct - I realized that I also was unnecessarily compiling the program on each run so I've made some modifications so that both the regex and VRL program are compiled before runtime

@pront pront added this pull request to the merge queue Apr 28, 2025
@pront pront removed this pull request from the merge queue due to a manual request Apr 28, 2025
@pront pront added the source: http_client Anything `http_client` sour 8000 ce related label May 1, 2025
///
/// VRL functions are supported within query parameters. You can
/// use functions like `now()` to dynamically modify query
/// parameter values. The VRL function must be wrapped in `{{ ... }}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is starting to look great. One issue that needs to be addressed is this configuration setting. Relying on this arbitrary syntax {{ ... }} is a bit unorthodox. We can draw inspiration from https://vector.dev/docs/reference/configuration/transforms/filter/#condition. In this case, the QueryParameterValue can be associated with a type string or vrl and I believe this can be done in a backwards compatible way. This way we won't have to use regexes and custom syntax.

Copy link
Contributor Author
@benjamin-awd benjamin-awd May 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pront would something like this work?

sources:
  http:
    type: http_client
    endpoint: https://endpoint.com
    method: GET
    vrl_query_keys: ["timestamp"]
    query: 
      timestamp: "now()"
      foo: "bar"

or otherwise modify the keys in the query to support some kind of additional configuration (but not sure if it's possible to maintain backwards compatibility this way)

sources:
  http:
    type: http_client
    endpoint: https://endpoint.com
    method: GET
    query: 
      timestamp: 
         value: "now()"
         type: "vrl"
      foo:
         value: "bar"
         type: "string"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter is what I had in mind. It should be possible to maintain backwards compatibility. We can make "string" the default and skip serialization if the type isn't specified. This way existing configs should keep working. There should be examples of this in other configs. But happy to help if it's not clear how to do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @pront, sorry for the late reply on this -- unfortunately I wasn't able to figure it out

Not sure if this is the correct approach:

#[derive(Clone, Debug, Eq, PartialEq, serde::Deserialize, serde::Serialize)]
#[serde(untagged)]
pub enum QueryParameterValue {
    /// Query parameter with single string
    SingleParam(String),

    /// Query parameter with multiple values
    MultiParams(Vec<String>),

    /// Structured query parameter with type and single value
    TypedSingleParam {
        value: String,
        #[serde(default = "default_type", skip_serializing_if = "is_default_type")]
        r#type: String,
    },

    /// Structured query parameter with type and multiple values
    TypedMultiParam {
        value: Vec<String>,
        #[serde(default = "default_type", skip_serializing_if = "is_default_type")]
        r#type: String,
    },
}

fn default_type() -> String {
    "string".to_string()
}

fn is_default_type(ty: &str) -> bool {
    ty == "string"
}

Otherwise some kind of custom deserializer?

@pront pront added the meta: awaiting author Pull requests that are awaiting their author. label May 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: external docs Anything related to Vector's external, public documentation domain: sources Anything related to the Vector's sources meta: awaiting author Pull requests that are awaiting their author. source: http_client Anything `http_client` source related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Templating query arguments for http_scrape
3 participants
0