---
_id: '46312'
abstract:
- lang: eng
  text: Abuse and hate are penetrating social media and many comment sections of news
    media companies. These platform providers invest considerable efforts to moderate
    user-generated contributions to prevent losing readers who get appalled
    by inappropriate texts. This is further enforced by legislative actions, which
    make non-clearance of these comments a punishable action. While (semi-)automated
    solutions using Natural Language Processing and advanced Machine Learning techniques
    are getting increasingly sophisticated, the domain of abusive language detection
    still struggles as large non-English and well-curated datasets are scarce or not
    publicly available. With this work, we publish and analyse the largest annotated
    German abusive language comment datasets to date. In contrast to existing datasets,
    we achieve a high labelling standard by conducting a thorough crowd-based annotation
    study that complements professional moderators’ decisions, which are
    also included in the dataset. We compare and cross-evaluate the performance of
    baseline algorithms and state-of-the-art transformer-based language models, which
    are fine-tuned on our datasets and an existing alternative, showing the usefulness
    for the community.
author:
- first_name: Dennis
  full_name: Assenmacher, Dennis
  last_name: Assenmacher
- first_name: Marco
  full_name: Niemann, Marco
  last_name: Niemann
- first_name: Kilian
  full_name: Müller, Kilian
  last_name: Müller
- first_name: Moritz
  full_name: Seiler, Moritz
  id: '105520'
  last_name: Seiler
- first_name: Dennis M.
  full_name: Riehle, Dennis M.
  last_name: Riehle
- first_name: Heike
  full_name: Trautmann, Heike
  id: '100740'
  last_name: Trautmann
  orcid: 0000-0002-9788-8282
citation:
  ama: 'Assenmacher D, Niemann M, Müller K, Seiler M, Riehle DM, Trautmann H. RP-Mod
    &#38; RP-Crowd: Moderator- and Crowd-Annotated German News Comment Datasets. In:
    <i>Proceedings of the Neural Information Processing Systems Track on Datasets
    and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021)</i>. ; 2021:1–14.'
  apa: 'Assenmacher, D., Niemann, M., Müller, K., Seiler, M., Riehle, D. M., &#38;
    Trautmann, H. (2021). RP-Mod &#38; RP-Crowd: Moderator- and Crowd-Annotated German
    News Comment Datasets. <i>Proceedings of the Neural Information Processing Systems
    Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021)</i>,
    1–14.'
  bibtex: '@inproceedings{Assenmacher_Niemann_Müller_Seiler_Riehle_Trautmann_2021,
    place={Virtual Event}, title={RP-Mod &#38; RP-Crowd: Moderator- and Crowd-Annotated
    German News Comment Datasets}, booktitle={Proceedings of the Neural Information
    Processing Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks
    2021)}, author={Assenmacher, Dennis and Niemann, Marco and Müller, Kilian and
    Seiler, Moritz and Riehle, Dennis M. and Trautmann, Heike}, year={2021}, pages={1–14}
    }'
  chicago: 'Assenmacher, Dennis, Marco Niemann, Kilian Müller, Moritz Seiler, Dennis
    M. Riehle, and Heike Trautmann. “RP-Mod &#38; RP-Crowd: Moderator- and Crowd-Annotated
    German News Comment Datasets.” In <i>Proceedings of the Neural Information Processing
    Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021)</i>,
    1–14. Virtual Event, 2021.'
  ieee: 'D. Assenmacher, M. Niemann, K. Müller, M. Seiler, D. M. Riehle, and H. Trautmann,
    “RP-Mod &#38; RP-Crowd: Moderator- and Crowd-Annotated German News Comment Datasets,”
    in <i>Proceedings of the Neural Information Processing Systems Track on Datasets
    and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021)</i>, 2021, pp. 1–14.'
  mla: 'Assenmacher, Dennis, et al. “RP-Mod &#38; RP-Crowd: Moderator- and Crowd-Annotated
    German News Comment Datasets.” <i>Proceedings of the Neural Information Processing
    Systems Track on Datasets and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021)</i>,
    2021, pp. 1–14.'
  short: 'D. Assenmacher, M. Niemann, K. Müller, M. Seiler, D.M. Riehle, H. Trautmann,
    in: Proceedings of the Neural Information Processing Systems Track on Datasets
    and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021), Virtual Event, 2021,
    pp. 1–14.'
date_created: 2023-08-04T07:22:59Z
date_updated: 2024-06-07T07:13:04Z
department:
- _id: '34'
- _id: '819'
language:
- iso: eng
page: 1–14
place: Virtual Event
publication: Proceedings of the Neural Information Processing Systems Track on Datasets
  and Benchmarks 1 (NeurIPS Datasets and Benchmarks 2021)
status: public
title: 'RP-Mod & RP-Crowd: Moderator- and Crowd-Annotated German News Comment Datasets'
type: conference
user_id: '15504'
year: '2021'
...
