-
Notifications
You must be signed in to change notification settings - Fork 6
SPICE-0012: URL standard library #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
It seems that "search params" are the correct technical wording. However I think if we would go with just Params it would be better understandable... 🤔 Maybe at least introducing an typealias for this? 🤔 Or are search params different to queries? Just some random thoughts 🤷♂️ I've never seen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I Like the design. Left some questions in the comments.
/// The username component. | ||
/// | ||
/// If the URL does not require a username, set to the empty string. | ||
username: AsciiString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason to make the empty string the "null" value instead of proper null
like port
, path
, etc.?
I'd rather go with AsciiString?
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. According to the spec, this field (and several other ones) do not admit null:
https://url.spec.whatwg.org/#url-representation
There are some fields that can be null, but this is not one of them.
Practically, I don't know if this makes any difference.
/// The password component. | ||
/// | ||
/// If the URL does not require a password, set to the empty string. | ||
password: AsciiString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above.
/// Parses [source] into a URL. | ||
/// | ||
/// Throws if [source] is an invalid URL. | ||
external function parse(source: String): Url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with having one version that throws on failure, for when users know their URL is correct. But we should have parseOrNull(source: String): Url?
for when you want to recover from errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our other parsers don't have a parseOrNull
, so, this follows the same precedent (see yaml.Parser
and json.Parser
).
If you need to recover from errors, you can use test.catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After chatting a little bit more on this, I'm convinced that we should just have a parseOrNull
here, updated.
The rationale is: we also have similar "OrNull" methods in other places. It makes it much easier for users; e.g. Uri.parseOrNull(input) ?? Uri.parse("https://example.com")
We should have the YAML and JSON parsers conform to this as a future task.
else null | ||
|
||
/// Creates a [SearchParams] from the given form encoded string. | ||
const function SearchParams(input: String): SearchParams = // etc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible input
is not a valid query string? If so, what happens then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question; I don't think so.
Here is the parsing algorithm: https://url.spec.whatwg.org/#urlencoded-parsing
None of these steps involve failure.
/// ``` | ||
/// encode("https://example.com/some path/") == "https://example.com/some%20path" | ||
/// ``` | ||
const external function encode(value: String): String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const external function encode(value: String): String | |
const external function encodeUri(value: String): String |
encode
is a bit too generic IMO, and it was hard to understand the difference between it and percentEncode
at first glance.
Could also be called encodeForUri
, encodeUriString
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The module's name is already Url
, so, usage looks like Url.encode
. I think Url.encodeUri
would be a little too wordy.
By the way, one of the suggestions of this SPICE is to prefer "URL" over "URI".
---- | ||
import "pkl:Url" | ||
|
||
myUrl: Url = new { // <1> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parse
will throw (or return null
) on an invalid URL. What happens when you create an invalid URL with new
? Will it throw?
We could have a UrlBuilder
that allows you to create a URL with new and call toUrl(orNull)
to validate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can ensure that you cannot construct invalid Urls through property types and constraints.
@StefMa: this API is taken from the URL spec, see: https://url.spec.whatwg.org/#urlsearchparams By the way, Java's stdlib doesn't have a library for parsing the query. However, some other languages do, and call them different things:
Technically, the query string portion of a URL is just a string (no structure). Representing structured data using |
Documenting something here that we spoke about: will merging this into |
From an ergonomic standpoint, I would hope that Pkl's Url type produces language/stdlib-native URL types in each target language. Pretty sure would require changes in both the code generators and the Unmarshal/Decoder implementations. |
Good point; I need to fill in some details about language bindings. And, I agree, it should probably turn into the target language's URL type. |
I wonder if it makes sense to expand |
Co-authored-by: Islon Scherer <islonscherer@gmail.com>
Co-authored-by: Islon Scherer <islonscherer@gmail.com>
We possibly want to make |
.pkl.Url | ||
[source,pkl] | ||
---- | ||
module pkl.Url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would a URI like env:USER
be represented by this module?
In go, this is parsed as
url.URL{
Scheme: "env",
Opaque: "USER",
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be parsed as new URL { scheme = "env"; path = "USER" }
.
See https://url.spec.whatwg.org/#example-url-components
Actually, I wonder how we can better support opaque URLs. The env URL env://foo/bar
represents an env var with value //foo/bar
, and doesn't mean "the host is foo and the path is /bar".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to proffer Go's URL implementation as a carrot and Swift's as the stick:
Go
Go's url.Parse
function does (roughly) this:
- Split off the fragment, if present, and decode/store it
- Split off the scheme, if present
- Split off the query, if present, and store it
- If a scheme is present and the character immediately following the
:
is not/
, store the remainder inOpaque
and return - Otherwise, proceed to parse the authority and path
url.URL.String
produces a string from a URL struct. When Opaque
is non-empty, the result has form scheme:opaque?query#fragment
, otherwise it's scheme://userinfo@host/path?query#fragment
.
In practice, I've found this API to be extremely usable and flexible enough for working with the varied URIs found in the Pkl ecosystem. This may be a good design to reference that departs somewhat from the literal spec in the name of usability.
Swift
Having worked with pkl-swift, I've found Foundation URL type's lack of similar support makes working with opaque URIs like env:USER
significantly less ergonomic. Parsing just the "USER"
out requires error-prone mangling of the full URL string, eg. https://github.com/apple/pkl-swift/blob/c280a0e6324097c429bfc19a2cfe5761ab12bbe9/docs/modules/ROOT/pages/external-readers.adoc?plain=1#L33) to get the equivalent of Go's url.URL.Opaque
.
Pkl
Given that opaque URIs are already commonplace in the Pkl ecosystem (and used for several of the runtime's built-in resources), I think having first-class support for them in this proposal is necessary. A similar model to Go's could be adopted, but with added type constraints so ensure that opaque
is never non-null at the same time as hostname/
port/
path`.
A case like env://foo/bar
(or env:/foo/bar
) is still tricky (and go parses them "wrong"). I wonder if it would be reasonable to do something like this:
opaque: String = "<rendered authority/path>"
Url.toString()
would then use this field and ignore the authority/path field. This would allow direct construction of the Url
to directly set opaque
while still maintaining the same behavior as non-opaque URLs. Similarly, during parsing, if an authority/path are found (same criteria as Go) then those fields are populated, otherwise opaque
is set directly.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You actually need to do wrangling too with Go's url
API for opaque URLs. For example, to turn a URL like env://foo%20/bar
into //foo /bar
, you'd need:
func getSchemeSpecificPart(u *url.URL) (string, error) {
return url.PathUnescape(strings.Split(u.String(), ":")[1])
}
Whether a URL is opaque or not is really up to the scheme, and a simple rule like "if the char after the colon is /
, it's opaque" doesn't feel good enough.
No description provided.