roostorg/awesome-safety-tools

awesome-safety-tools

A curated collection of open source tools for online safety

Inspired by prior work like Awesome Redteaming and Awesome Phishing.

Contribute by opening a pull request to add more resources and tools!

Hash Matching

Classification

  • OSmod by Jigsaw
    • toolkit of machine learning (ML) tools, models, and APIs that platforms can use to moderate content
  • Perspective API by Jigsaw
    • machine learning-powered tool that helps platforms detect and assess the toxicity of online conversations
  • Presidio by Microsoft
    • toolset for detecting Personally Identifiable Information (PII) and other sensitive data in images and text
  • Llama Guard by Meta
    • AI-powered content moderation model to detect harm in text-based interactions
  • Llama Prompt Guard 2 by Meta
    • detects prompt injection and jailbreaking attacks in LLM inputs
  • Purple Llama by Meta
    • set of tools to assess and improve LLM security. Includes Llama Guard, CyberSec Eval, and Code Shield
  • ShieldGemma by Google DeepMind
    • AI safety toolkit designed to help detect and mitigate harmful or unsafe outputs in LLM applications
  • Roblox Voice Safety Classifier
    • machine learning model that detects and moderates harmful content in real-time voice chat on Roblox. Focuses on spoken language detection.
  • Detoxify by Unitary AI
    • detects and mitigates generalized toxic language (including hate speech, harassment, bullying) in text
  • Toxic Prompt RoBERTa by Intel
    • a RoBERTa-based model for detecting toxic content in prompts to language models
  • NSFW Filtering
    • browser extension to block explicit images from online platforms. User facing.
  • NSFW Keras Model
    • convolutional neural network (CNN) based ML model for detecting explicit images
  • Guardrails AI
    • a Python framework that helps build safe AI applications by checking inputs and outputs for predefined risks
  • Private Detector by Bumble
    • a pretrained model for detecting lewd images

Privacy Protection

Core Infrastructure

  • Mjolnir by Matrix
    • moderation bot for the Matrix protocol that automatically enforces content policies
  • AbuseIO
    • abuse management platform designed to help organizations handle and track abuse complaints related to online content, infrastructure, or services
  • Ozone by Bluesky
    • labeling tool designed for Bluesky. Includes moderation features to act on abuse flags, policy enforcement tools, and investigation features
  • Open Truss by GitHub
    • framework designed to help users create internal tools without needing to write code
  • Access by Discord
    • a centralized portal for managing access to internal systems within any organization

Redteaming Tools

Clustering

  • SpamAssassin by Apache
    • anti-spam platform that uses a variety of techniques, including text analysis, Bayesian filtering, and DNS blocklists, to classify and block unsolicited email
  • scikit-learn
    • Python library that includes clustering algorithms such as K-Means, DBSCAN, and hierarchical clustering

Rules Engines

  • RulesEngine by Microsoft
    • a library for abstracting business logic, rules, and policies from a system via JSON for .NET language families
  • Marble
    • a real-time fraud detection and compliance engine tailored for fintech companies and financial institutions
  • Automod by Bluesky
    • a tool for automating content moderation processes for the Bluesky social network and other apps on the AT Protocol
  • Wikimedia Smite Spam
    • an extension for MediaWiki that helps identify and manage spam content on a wiki
  • Druid by Apache
    • a high performance real-time analytics database

Review

  • RabbitMQ
    • a message broker that enables applications to communicate with each other by sending messages through queues
  • BullMQ
    • message queue and batch processing for Node.js and Python, based on Redis
  • Owlculus
    • an OSINT (Open-Source Intelligence) toolkit and case management platform
  • NCMEC Reporting by ello
    • a Ruby client library for reporting incidents to the National Center for Missing & Exploited Children (NCMEC) CyberTipline

Investigation

Safety Datasets

Red Teaming Datasets

Fediverse

  • FediCheck
    • domain moderation tool to assist ActivityPub service providers, such as Mastodon servers, now open-sourced.
  • Fediverse Spam Filtering
    • a spam filter for Fediverse social media platforms. The current version is only a proof of concept.
  • FIRES
    • reference server + protocol for the exchange of moderation advisories and recommendations

User Safety Tools

  • Uli by Tattle
    • software and resources for mitigating online gender-based violence in India
