-
Notifications
You must be signed in to change notification settings - Fork 1.8k
enhancement(http_client source): allow VRL in query parameters #22706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
enhancement(http_client source): allow VRL in query parameters #22706
Conversation
cb79433
to
834d166
Compare
5b913e2
to
ff535bb
Compare
ff535bb
to
4407fff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.
Files not reviewed (1)
- website/cue/reference/components/sources/base/http_client.cue: Language not supported
Comments suppressed due to low confidence (3)
src/sources/http_client/client.rs:398
- [nitpick] Consider renaming 'process_query_value' to 'evaluate_query_parameter' to more clearly indicate that the function evaluates dynamic VRL expressions.
fn process_query_value(value: &QueryParameterValue) -> QueryParameterValue {
src/sources/http_client/client.rs:241
- If VRL expression compilation fails, the warning is logged and the parameter is silently returned unevaluated. Consider whether it is preferable to fallback to the original input or signal a compilation failure to prevent unexpected behavior.
if let Ok(value) = Runtime::default().resolve(&mut target, &program, &timezone) {
src/sources/http_client/tests.rs:231
- Consider adding a test case for scenarios where VRL expression compilation fails to ensure the system gracefully handles such errors.
/// VRL query parameters should be parsed correctly
src/sources/http_client/client.rs
Outdated
prelude::TypeState, | ||
}; | ||
|
||
static RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\{\{(?P<key>.+)\}\}").unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, this is used to look for strings like {{ now() }}
and capture now()
. Do I understand this correctly?
This can only change if the config changes. We should be able to compile once during build and evaluate multiple times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's correct - I realized that I also was unnecessarily compiling the program on each run so I've made some modifications so that both the regex and VRL program are compiled before runtime
/// | ||
/// VRL functions are supported within query parameters. You can | ||
/// use functions like `now()` to dynamically modify query | ||
/// parameter values. The VRL function must be wrapped in `{{ ... }}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is starting to look great. One issue that needs to be addressed is this configuration setting. Relying on this arbitrary syntax {{ ... }}
is a bit unorthodox. We can draw inspiration from https://vector.dev/docs/reference/configuration/transforms/filter/#condition. In this case, the QueryParameterValue
can be associated with a type string
or vrl
and I believe this can be done in a backwards compatible way. This way we won't have to use regexes and custom syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pront would something like this work?
sources:
http:
type: http_client
endpoint: https://endpoint.com
method: GET
vrl_query_keys: ["timestamp"]
query:
timestamp: "now()"
foo: "bar"
or otherwise modify the keys in the query to support some kind of additional configuration (but not sure if it's possible to maintain backwards compatibility this way)
sources:
http:
type: http_client
endpoint: https://endpoint.com
method: GET
query:
timestamp:
value: "now()"
type: "vrl"
foo:
value: "bar"
type: "string"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter is what I had in mind. It should be possible to maintain backwards compatibility. We can make "string" the default and skip serialization if the type
isn't specified. This way existing configs should keep working. There should be examples of this in other configs. But happy to help if it's not clear how to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @pront, sorry for the late reply on this -- unfortunately I wasn't able to figure it out
Not sure if this is the correct approach:
#[derive(Clone, Debug, Eq, PartialEq, serde::Deserialize, serde::Serialize)]
#[serde(untagged)]
pub enum QueryParameterValue {
/// Query parameter with single string
SingleParam(String),
/// Query parameter with multiple values
MultiParams(Vec<String>),
/// Structured query parameter with type and single value
TypedSingleParam {
value: String,
#[serde(default = "default_type", skip_serializing_if = "is_default_type")]
r#type: String,
},
/// Structured query parameter with type and multiple values
TypedMultiParam {
value: Vec<String>,
#[serde(default = "default_type", skip_serializing_if = "is_default_type")]
r#type: String,
},
}
fn default_type() -> String {
"string".to_string()
}
fn is_default_type(ty: &str) -> bool {
ty == "string"
}
Otherwise some kind of custom deserializer?
Summary
This PR adds support for VRL in query parameters within the http client source.
This is useful for making GET requests with dynamic parameters e.g. a
start_time
parameter which can help to filter extraneous data (e.g. only get "active" events based on the current timestamp, or events from the last hour)e.g.
Closes #14621
Change Type
Is this a breaking change?
How did you test this PR?
Tests have been added, but otherwise we've been running this in a Kubernetes environment with Helm.
Does this PR include user facing changes?
Checklist
make check-all
is a good command to run locally. This check isdefined here. Some of these
checks might not be relevant to your PR. For Rust changes, at the very least you should run:
cargo fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo nextest run --workspace
(alternatively, you can runcargo test --all
)References