This repository contains code for the paper "Detect All Abuse! Toward Universal Abusive Language Detection Models":
Wang, K., Lu, D., Han, S. C., Long, S., & Poon, J. (2020).
Detect All Abuse! Toward Universal Abusive Language Detection Models.
In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020), pp. 6366-6376.
The code is mainly adapted from https://github.com/kamalkraj/Named-Entity-Recognition-with-Bidirectional-LSTM-CNNs
Download the following three files from https://github.com/kamalkraj/Named-Entity-Recognition-with-Bidirectional-LSTM-CNNs/tree/master/data (a download sketch follows the list):
- train.txt
- test.txt
- val.txt
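
A minimal sketch for fetching these files programmatically, assuming they are still served from the raw GitHub path of that repository:

```python
import urllib.request

# Raw GitHub path of the data folder in the source repository (assumption).
BASE = ("https://raw.githubusercontent.com/kamalkraj/"
        "Named-Entity-Recognition-with-Bidirectional-LSTM-CNNs/master/data/")

for name in ["train.txt", "test.txt", "val.txt"]:
    urllib.request.urlretrieve(BASE + name, name)
    print("downloaded", name)
```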
Download the GloVe embeddings from http://nlp.stanford.edu/data/glove.6B.zip and unzip the archive; the model uses (a loading sketch follows):
- glove.6B.100d.txt
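
A minimal sketch for loading the GloVe vectors into a word-to-vector dictionary; the file is assumed to sit in the working directory:

```python
import numpy as np

def load_glove(path="glove.6B.100d.txt"):
    """Read GloVe text format: one word followed by its vector per line."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return embeddings

# glove = load_glove()
# print(glove["the"].shape)  # (100,)
```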
Please note:
- Use Direct Abuse Embedding to generate the D embedding.
- Change MAX_LEN to the maximum sentence length of your target dataset.
- The "sent_text" variable should be a list of the original sentences (see the sketch after this list).
Download the files from:
- https://drive.google.com/file/d/152264axxTfmuYfb_7oWYQJFCggt06CEC/view?usp=sharing
- https://drive.google.com/file/d/1059cRocqijTNzrl0UOXkFnngqpEZ54c1/view?usp=sharing
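
These are Google Drive share links; one way to fetch them from Python is the third-party gdown package. This is only a sketch, and the output filenames below are placeholders, not the actual file names:

```python
import gdown  # pip install gdown

# File IDs taken from the share links above; output names are placeholders.
gdown.download("https://drive.google.com/uc?id=152264axxTfmuYfb_7oWYQJFCggt06CEC",
               "drive_file_1", quiet=False)
gdown.download("https://drive.google.com/uc?id=1059cRocqijTNzrl0UOXkFnngqpEZ54c1",
               "drive_file_2", quiet=False)
```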
Use Sarcasm Embedding Input to generate the Implicit Input.
After running Sarcasm Embedding, copy and paste the resulting embedding into a text file named "sarcasm_embedding.txt" (or save and load it programmatically as sketched below).
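
The copy-and-paste step can also be done with NumPy; a sketch assuming the Sarcasm Embedding output is a 2-D array with one row per sentence:

```python
import numpy as np

# Placeholder: replace with the array produced by Sarcasm Embedding
# (rows = sentences, columns = embedding dimensions).
sarcasm_embedding = np.zeros((3, 100), dtype="float32")

# Write it to the text file the later steps expect ...
np.savetxt("sarcasm_embedding.txt", sarcasm_embedding)

# ... and read it back when building the Implicit Input.
loaded = np.loadtxt("sarcasm_embedding.txt")
print(loaded.shape)  # (3, 100)
```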
- Use Linguistic Behavior to generate the User Linguistic Behavior Embedding.
- Change the sentences and raw labels to those of your target dataset.
- Use Final Model for the final model training and prediction.
- Fill in "sentence_list" and "label_list" with your target dataset.
- Set "seq_length" to the sequence length of your choice.
- Set "use_gcn" according to whether you want to use the User Linguistic Behavior Embedding (a configuration sketch follows this list).