Adds sanitize_html, a whitelist based HTML sanitizer. #171
I am unsure how to fix this.
* * attribute_whitelist_json: a json_encode()'d list of HTML attributes to allow in the final string.
* * tag_whitelist_json: a json_encode()'d list of HTML tags to allow in the final string.
*/
#define rustg_sanitize_html(text, attribute_whitelist_json, tag_whitelist_json) RUSTG_CALL(RUST_G, "sanitize_html")(text, attribute_whitelist_json, tag_whitelist_json)
Semicolon?
I don't understand, am I missing something here?
* * attribute_whitelist_json: a json_encode()'d list of HTML attributes to allow in the final string.
* * tag_whitelist_json: a json_encode()'d list of HTML tags to allow in the final string.
The interface should take a list and json_encode it itself.
I didn't do this so that you can store pre-encoded global lists to save on perf.
Wouldn't that mean it's encoding on every call? The thing is that this will likely be called many times with only one or a few lists, so this introduces extra overhead.
.link_rel(Some("noopener")) // https://mathiasbynens.github.io/rel-noopener/
.url_schemes(prune_url_schemes)
.generic_attributes(attribute_whitelist)
.tags(tag_whitelist)
Wouldn't it make sense to keep this around rather than build it anew on every invocation?
I'd have to hash the arguments and such and that's out of my skill set presently.
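For reference, keeping the built sanitizer around does not require hashing the arguments by hand: a `HashMap` keyed on the two raw JSON strings does the hashing itself. The following is only a minimal std-only sketch of that idea, not the code in this PR. `Policy`, `policy_for`, and `parse_list` are hypothetical names, `Policy` stands in for the configured ammonia builder, and `parse_list` is a simplified placeholder for real JSON parsing.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for the configured sanitizer. In the PR this would
/// be the ammonia builder assembled from the two whitelists.
#[derive(Debug)]
pub struct Policy {
    pub attributes: HashSet<String>,
    pub tags: HashSet<String>,
}

/// Cache key: the two raw JSON strings exactly as the caller passed them.
type Key = (String, String);

/// Process-wide cache, created lazily on first use.
static POLICIES: OnceLock<Mutex<HashMap<Key, Arc<Policy>>>> = OnceLock::new();

/// Simplified placeholder parser for a flat JSON array of plain strings such
/// as `["b","i"]`. It does not handle escapes or commas inside strings;
/// real code would use a JSON library.
fn parse_list(json: &str) -> HashSet<String> {
    json.trim()
        .trim_start_matches('[')
        .trim_end_matches(']')
        .split(',')
        .map(|item| item.trim().trim_matches('"').to_owned())
        .filter(|item| !item.is_empty())
        .collect()
}

/// Returns the policy for this pair of whitelists, building it only the
/// first time the pair is seen. Later calls with identical strings reuse it.
pub fn policy_for(attribute_whitelist_json: &str, tag_whitelist_json: &str) -> Arc<Policy> {
    let cache = POLICIES.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = cache.lock().unwrap();
    let key = (
        attribute_whitelist_json.to_owned(),
        tag_whitelist_json.to_owned(),
    );
    guard
        .entry(key)
        .or_insert_with(|| {
            Arc::new(Policy {
                attributes: parse_list(attribute_whitelist_json),
                tags: parse_list(tag_whitelist_json),
            })
        })
        .clone()
}
```

Because the key is the encoded string itself, this pairs naturally with storing pre-encoded global lists on the DM side: the same few strings arrive on every call and hit the cache. The trade-off is that the map grows by one entry per distinct whitelist pair and every call takes the lock, which is likely acceptable when only one or a few lists are in use.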
Looks about right.
looks about right :+2:
mods? mergies? @ZeWaka
Adds a customizable HTML sanitizer function using the Ammonia crate. Out of the box, it will:
By providing JSON-encoded lists, you can whitelist specific attributes or tags so that they are not pruned. I have included a curated tag list in the DM source file for this module that will whitelist most safe CSS attributes.
It occurred to me that a lot of servers run things like old papercode, which does not sanitize input on the server side before it is viewable by a client. Sanitizing strings with DM would be an absolute performance nuke, assuming you could even make it bulletproof in the first place.
Here is a recommended default tag whitelist