-
Wulczyn et al. Personal Attacks on Wikipedia Dataset
Dataset of hateful Wikipedia comments. The data was sampled by combining random sampling with oversampling of banned comments. Annotation was crowdsourced, and each comment was...
-
Caselli et al. Implicit/Explicit Expansion on OLID
This dataset extends OLID/OffensEval (Offensive Language Identification Dataset; Zampieri et al., 2019a) by annotating the explicitness of each message. The OLID data was...
-
Jha and Mamidi Sexism on Twitter Dataset
Dataset of sexist tweets, sampled using benevolent-sexism key phrases. The authors manually selected 712 of the sampled tweets, which were then validated by three non-activist...