-
Notifications
You must be signed in to change notification settings - Fork 343
fix: Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 #406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 #406
Conversation
8a68a51
to
ed438c3
Compare
…ned as Unreserved Characters in RFC 3986
ed438c3
to
672a6a1
Compare
@@ -80,7 +80,7 @@ public function validate($string, $config, $context) | |||
// as per RFC 3696, the top label need only not be all numeric. | |||
// The productions describing this are: | |||
$a = '[a-z]'; // alpha | |||
$an = '[a-z0-9]'; // alphanum | |||
$an = "[a-z0-9$underscore]"; // alphanum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you update the comment above too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be ok to update the comment as follows?
$an = "[a-z0-9$underscore]"; // alphanum | "_"
$and = "[a-z0-9-$underscore]"; // alphanum | "-" | "_"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, I meant
// This doesn't match I18N domain names, but we don't have proper IRI support,
// so force users to insert Punycode.
// There is not a good sense in which underscores should be
// allowed, since it's technically not! (And if you go as
// far to allow everything as specified by the DNS spec...
// well, that's literally everything, modulo some space limits
// for the components and the overall name (which, by the way,
// we are NOT checking!). So we (arbitrarily) decide this:
// let's allow underscores wherever we would have allowed
// hyphens, if they are enabled. This is a pretty good match
// for browser behavior, for example, a large number of browsers
// cannot handle foo_.example.com, but foo_bar.example.com is
// fairly well supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I got it.
I have updated the comment. Please check it.
// Underscores defined as Unreserved Characters in RFC 3986 are
// allowed in a URI. There are cases where we want to consider a
// URI containing "_" such as "_dmarc.example.com".
// Underscores are not allowed in the default. If you want to
// allow it, set Core.AllowHostnameUnderscore to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your confirmation and merging!
0100c1c
to
c8b4404
Compare
c8b4404
to
dbe18a8
Compare
# [4.18.0](v4.17.0...v4.18.0) (2024-11-01) ### Bug Fixes * Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 ([#406](#406)) ([d9fbef8](d9fbef8)) * Avoid a deprecated error when the attribute name is numeric and DirectLex is used ([#412](#412)) ([f0fbf51](f0fbf51)) * checking that node has property name ([#399](#399)) ([9ca5a36](9ca5a36)) * Ignore conditional comments ([#401](#401)) ([4828fdf](4828fdf)) * Support PHP 8.4 ([#396](#396)) ([92da247](92da247)) * undefined array key warning ([#419](#419)) ([01be377](01be377)) ### Features * Add allowfullscreen attr for iframe ([#411](#411)) ([70754a2](70754a2)) * add directive for removing blank nodes ([#404](#404)) ([c9d60c9](c9d60c9)) * Add support for CSS aspect-ratio ([#408](#408)) ([93bee73](93bee73)) * Allow universal CSS values for all properties ([#410](#410)) ([9723267](9723267))
🎉 This PR is included in version 4.18.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
Hello,
Thank you so much for developing and maintaining HTML Purifier.
I have created a pull request for #405.
Recently, we have seen cases where we want to consider URIs that contain "_", such as "_dmarc.example.com".
I think this PR adjusts Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986.
Thank you.