Description: Ten annotators initially annotated 20,000 entries from the Everyday Sexism Project. Every entry was labelled by at least two annotators. The average Cohen's kappa for the per-category pairs is 0.58. Cases where annotators disagreed were then sent for further review. After cleaning, the final dataset comprises 13,023 entries. 23 categories were labelled, which were merged into 14 categories for machine learning. Entries with fewer than 7 words were excluded from the dataset.
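For readers unfamiliar with the agreement figure quoted in the description, the following is a minimal sketch of how Cohen's kappa can be computed for one pair of annotators on one category, together with a word-count filter like the one the description mentions. The description does not say how the authors computed either, so the binary per-category labels, the whitespace tokenisation, and the function names `cohen_kappa` and `has_min_words` are all assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


def has_min_words(text, min_words=7):
    """Assumed filter: keep entries with at least `min_words` whitespace-separated words."""
    return len(text.split()) >= min_words


# Hypothetical labels for one category (1 = category applies, 0 = it does not).
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_b = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(annotator_a, annotator_b)  # 0.5 for this toy pair
```

Averaging such per-category values over annotator pairs would give a single summary figure of the kind reported, though the exact averaging scheme used for the 0.58 value is not stated in the description.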