Description
The allowed charsets in regex-assembly/include/allowed-charsets.ra, which are kept in sync with tx.allowed_request_content_type_charset, appear to be overly restrictive. I couldn't find any documentation or discussion explaining why only these 4 charsets are allowed:
iso-8859-1
iso-8859-15
utf-8
windows-1252
For example, utf-16 is not allowed, even though it is a valid and relatively common charset.
I am wondering what the justification is for including only these 4 charsets. The IANA registry lists many more valid charsets: https://www.iana.org/assignments/character-sets/character-sets.xhtml
Reproduction
Accept header charset: rule 920600 is triggered by the valid charset utf-16.
utf-8 does not trigger the rule.
utf-16 triggers the rule.
Content-Type charset: rule 920480 is triggered by the valid charset utf-16.
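To illustrate how I understand the check to behave, here is a rough Python sketch of an allow-list test against the four charsets listed above. This is not the actual CRS implementation (the real rules use regular expressions and the tx variable). The function names are my own, and treating a missing charset parameter as allowed is an assumption on my part.

```python
from email.message import Message
from typing import Optional

# The four charsets currently in allowed-charsets.ra (taken from this report).
ALLOWED_CHARSETS = {"iso-8859-1", "iso-8859-15", "utf-8", "windows-1252"}


def charset_of(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    # get_param may return a tuple for RFC 2231 encoded values; ignore those here.
    return charset.lower() if isinstance(charset, str) else None


def is_allowed(content_type: str) -> bool:
    """Hypothetical helper: True if the charset is absent or on the allow-list."""
    charset = charset_of(content_type)
    # Assumption: a header without a charset parameter is not rejected.
    return charset is None or charset in ALLOWED_CHARSETS


if __name__ == "__main__":
    for header in ("application/json; charset=utf-8",
                   "application/json; charset=utf-16"):
        print(header, "->", "allowed" if is_allowed(header) else "blocked")
```

With this model, any registered charset outside the four entries (utf-16 included) is rejected regardless of whether it is valid, which matches what I see in the sandbox.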
Your Environment
The behavior shown above was reproduced in the Coraza WAF Sandbox with CRS. It is also triggered in a production environment using CRS 4.3.0 in coraza-proxy-wasm.
Confirmation
[x] I have removed any personal data (email addresses, IP addresses, passwords, domain names) from any logs posted.