8000 fix: Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 by matsuo · Pull Request #406 · ezyang/htmlpurifier · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix: Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 #406

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 19, 2024

Conversation

matsuo
Copy link
Contributor
@matsuo matsuo commented Apr 17, 2024

Hello,
Thank you so much for developing and maintaining HTML Purifier.

I have created a pull request for #405.
Recently, we have seen cases where we want to consider URIs that contain "_", such as "_dmarc.example.com".

I think this PR adjusts Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986.

Thank you.

@matsuo matsuo force-pushed the adjust-core-allow-hostname-underscore branch from 8a68a51 to ed438c3 Compare April 17, 2024 16:10
@matsuo matsuo changed the title Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 Fix Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 Apr 17, 2024
@matsuo matsuo force-pushed the adjust-core-allow-hostname-underscore branch from ed438c3 to 672a6a1 Compare April 17, 2024 16:12
@matsuo matsuo changed the title Fix Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 fix: Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 Apr 17, 2024
@@ -80,7 +80,7 @@ public function validate($string, $config, $context)
// as per RFC 3696, the top label need only not be all numeric.
// The productions describing this are:
$a = '[a-z]'; // alpha
$an = '[a-z0-9]'; // alphanum
$an = "[a-z0-9$underscore]"; // alphanum
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update the comment above too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be ok to update the comment as follows?

        $an  = "[a-z0-9$underscore]";  // alphanum | "_"
        $and = "[a-z0-9-$underscore]"; // alphanum | "-" | "_"

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, I meant

        // This doesn't match I18N domain names, but we don't have proper IRI support,
        // so force users to insert Punycode.
        // There is not a good sense in which underscores should be
        // allowed, since it's technically not! (And if you go as
        // far to allow everything as specified by the DNS spec...
        // well, that's literally everything, modulo some space limits
        // for the components and the overall name (which, by the way,
        // we are NOT checking!).  So we (arbitrarily) decide this:
        // let's allow underscores wherever we would have allowed
        // hyphens, if they are enabled.  This is a pretty good match
        // for browser behavior, for example, a large number of browsers
        // cannot handle foo_.example.com, but foo_bar.example.com is
        // fairly well supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I got it.
I have updated the comment. Please check it.

        // Underscores defined as Unreserved Characters in RFC 3986 are
        // allowed in a URI. There are cases where we want to consider a
        // URI containing "_" such as "_dmarc.example.com".
        // Underscores are not allowed in the default. If you want to
        // allow it, set Core.AllowHostnameUnderscore to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your confirmation and merging!

matsuo added a commit to matsuo/htmlpurifier that referenced this pull request Apr 18, 2024
matsuo added a commit to matsuo/htmlpurifier that referenced this pull request Apr 18, 2024
@matsuo matsuo force-pushed the adjust-core-allow-hostname-underscore branch from 0100c1c to c8b4404 Compare April 18, 2024 20:54
@matsuo matsuo force-pushed the adjust-core-allow-hostname-underscore branch from c8b4404 to dbe18a8 Compare April 18, 2024 20:56
ezyang
@ezyang ezyang merged commit d9fbef8 into ezyang:master Apr 19, 2024
@matsuo matsuo deleted the adjust-core-allow-hostname-underscore branch April 19, 2024 07:53
github-actions bot pushed a commit that referenced this pull request Nov 1, 2024
# [4.18.0](v4.17.0...v4.18.0) (2024-11-01)

### Bug Fixes

* Adjust Core.AllowHostnameUnderscore to consider that "_" is defined as Unreserved Characters in RFC 3986 ([#406](#406)) ([d9fbef8](d9fbef8))
* Avoid a deprecated error when the attribute name is numeric and DirectLex is used ([#412](#412)) ([f0fbf51](f0fbf51))
* checking that node has property name ([#399](#399)) ([9ca5a36](9ca5a36))
* Ignore conditional comments ([#401](#401)) ([4828fdf](4828fdf))
* Support PHP 8.4 ([#396](#396)) ([92da247](92da247))
* undefined array key warning ([#419](#419)) ([01be377](01be377))

### Features

* Add allowfullscreen attr for iframe ([#411](#411)) ([70754a2](70754a2))
* add directive for removing blank nodes ([#404](#404)) ([c9d60c9](c9d60c9))
* Add support for CSS aspect-ratio ([#408](#408)) ([93bee73](93bee73))
* Allow universal CSS values for all properties ([#410](#410)) ([9723267](9723267))
Copy link
github-actions bot commented Nov 1, 2024

🎉 This PR is included in version 4.18.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0