-
Waseem Racism and Sexism on Twitter Dataset
Dataset of racist and sexist tweets sampled from Twitter and labelled first by experts (including feminist and anti-racist activists), and then by CF amateur annotators who... -
Gao and Huang Hate Speech on Fox News Dataset
Dataset of 1528 annotated comments from Fox News website, taken from 10 news articles. Comments were labelled by experts in two stages, with an annotator agreement of Cohen's... -
Ross et al. Hate Speech Against Refugees
Annotated German corpus of tweets regarding refugees in Germany. Tweets were sampled using 10 hateful hashtags and labelled by experts with 2 annotators per tweet.... -
Sanguinetti et al. Italian Corpus of Hate Speech against Immigrants
Dataset of hateful tweets against immigrants, Roma and Muslims, sampled using keywords. 3,154 tweets were annotated by experts (2 per tweet) and then 2,855 annotated by CF (3+... -
de Pelle and Moreira Offensive Comments in the Brazilian Web from News Platform
Dataset of offensive comments collected on the Brazilian Web from g1.globo.com, the most accessed news site in Brazil. Sampled by comments about politics and sports... -
CONAN: Multilingual Dataset of Responses to Fight Online Hate Speech
Dataset of pairs of Islamophobic hate speech and counter-responses with 3 types of metadata: expert demographics, hate speech sub-topic, counter-narrative type. The dataset is... -
Qian et al. Dataset for Learning to Intervene in Online Hate Speech from Gab and...
Dataset of hateful/not hateful posts in the context of conversations from Reddit and Gab. The data is annotated through crowd-sourcing with Amazon Mechanical Turk with... -
Waseem and Hovy Racism and Sexism on Twitter Dataset
Dataset of racist and sexist tweets sampled from Twitter and labelled by a mix of expert annotators and activists. Tweets were sampled in 2016 over 2 months using keywords.... -
HASOC 2019: Hate Speech and Offensive Content Identification in Indo-European...
Hate Speech Dataset for Hindi, German and English. Three datasets from Twitter and Facebook, sampled by topics, hashtags, other keywords and the timeline of users (last... -
de Gibert et al. Hate Speech from a White Supremacy Forum Dataset
Hate speech dataset composed of thousands of sentences extracted from Stormfront, a white supremacist forum, manually labelled by experts. Annotator agreement for the 1st round... -
Ousidhoum et al. Multilingual and Multi-Aspect Hate Speech Analysis on Twitter
Dataset of multi-aspect hate speech posts sampled from Twitter and labelled through crowd-sourcing (Amazon Mechanical Turk). Tweets were sampled by common slurs and demeaning... -
ElSherief et al. Hate Speech Instigators and Their Targets Dataset from Twitter
Dataset of hate speech and targets from Twitter collected through a multi-step classification process and annotated through CrowdFlower. 92.8% agreement among the annotators for... -
Parikh multi-label sexism
10 annotators initially annotated 20,000 entries from the Everyday Sexism Project. At least two annotators labelled every entry. Average Cohen's kappa for the per-category pairs... -
DKhate: Danish Hate Speech & Abusive Language data
Task description: Branching structure of tasks: Binary (Offensive, Not), Within Offensive (Target, Not), Within Target (Individual, Group, Other) Details of task:... -
Turkish OffensEval
The Turkish dataset used in OffensEval 2020
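
Several entries above report annotator agreement as Cohen's kappa. As a reading aid, the following is a minimal sketch of how that statistic is computed for two annotators who each label the same items. The function name `cohens_kappa` and the toy labels are illustrative assumptions and are not taken from any of the listed datasets.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal rates of using that label, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example (made-up labels, for illustration only).
annotator_1 = ["hate", "hate", "none", "none"]
annotator_2 = ["hate", "none", "none", "none"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 3 of 4 items (0.75) and would agree by chance half the time (0.5), giving a kappa of 0.5. Agreement figures quoted in the entries above may have been computed with different tooling or averaging schemes.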