- Founta et al. Hate and Abusive Speech on Twitter
Dataset of tweets collected from 30th March 2017 to 9th April 2017 with a boosted random sampling technique, by using text analysis and preliminary crowdsourcing rounds to...
- Davidson et al. Crowd-sourced Hate Speech On Twitter Dataset
Dataset of hateful tweets sampled from Twitter using keywords. Labelled via CrowdFlower, with at least three annotators per tweet. Labels were assigned by majority decision, with 92% annotator agreement.
- Breitfeller et al. Microaggressions Dataset
Dataset of self-reported microaggressions from microaggressions.com. 2,934 posts were collected, targeting gender (1,314 posts), race (1,278 posts), sexuality (461 posts),...
- Fortuna et al. A Hierarchically-Labeled Portuguese Hate Speech Dataset From Tw...
Dataset of hate speech in Portuguese sampled from Twitter, with 81 categories. The dataset is manually annotated for hate speech using a hierarchical structure of classes....
- Bretschneider and Peters Prejudice on Facebook Dataset
Dataset of Facebook posts and comments published in response to them from the Facebook pages “Pegida” (dataset 1), “Ich bin Patriot, aber kein Nazi” (“I’m a patriot, not a...
- Albadi et al. Arabic Religious Hate on Twitter
Dataset of Arabic religious hate tweets sampled using neutral religious names as keywords. Annotation was crowdsourced using CrowdFlower, with a minimum of 3 annotations per...
- Waseem Racism and Sexism on Twitter Dataset
Dataset of racist and sexist tweets sampled from Twitter and labelled first by experts (including feminist and anti-racist activists), and then by CrowdFlower (CF) amateur annotators who...
- Qian et al. Dataset for Learning to Intervene in Online Hate Speech from Gab and...
Dataset of hateful/not hateful posts in the context of conversations from Reddit and Gab. The data is annotated through crowdsourcing on Amazon Mechanical Turk with...