Persian Swear Dataset: a dataset of inappropriate and offensive Persian words that you can use in production to filter unwanted content.
Updated Sep 30, 2024 - C#
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African Languages: https://afrisenti-semeval.github.io/
A list of Romanian NLP Datasets
A metadata-enriched dataset of German parliamentary debates covering 74 years of plenary protocols.
Dataset for web-scale information extraction.
Get a pragmatic assessment of how understandable a German text is.
Persian Slang Words (dataset)
Persian SMS dataset
Persian News Dataset
This repo is the dataset for the paper "A New Dataset and Methodology for Malicious URL Classification"
Repository for the LREC-COLING 2024 Paper: Persona-Based Corpus in the Diabetes Mellitus Domain – Applying a Human-Centered Approach to a Low-Resource Context
Dataset with annotation of Russian-language poems
Parallel Literary Corpora: Fiction and Poetry Translations
A novel Romanian-language dataset for offensive message detection, with comments from a local Romanian news website (Stiri de Cluj) manually annotated into five classes
RO-Offense: A Novel Romanian Dataset for Offensive Language in Online Comments