No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media
M. Spliethöver, M. Keiff, H. Wachsmuth, in: Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Association for Computational Linguistics, 2022.
Conference Paper | English
Author
Spliethöver, Maximilian;
Keiff, Maximilian;
Wachsmuth, Henning
Abstract
News articles both shape and reflect public opinion across the political
spectrum. Analyzing them for social bias can thus provide valuable insights,
such as prevailing stereotypes in society and the media, which are often
adopted by NLP models trained on respective data. Recent work has relied on
word embedding bias measures, such as WEAT. However, several representation
issues of embeddings can harm the measures' accuracy, including low-resource
settings and token frequency differences. In this work, we study what kind of
embedding algorithm serves best to accurately measure types of social bias
known to exist in US online news articles. To cover the whole spectrum of
political bias in the US, we collect 500k articles and review psychology
literature with respect to expected social bias. We then quantify social bias
using WEAT along with embedding algorithms that account for the aforementioned
issues. We compare how models trained with the algorithms on news articles
represent the expected social bias. Our results suggest that the standard way
to quantify bias does not align well with knowledge from psychology. While the
proposed algorithms reduce the gap, they still do not fully match the
literature.
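
For readers unfamiliar with WEAT, the following is a minimal sketch of the effect size as it is commonly defined: the difference in mean association of two target word sets X and Y with two attribute word sets A and B, normalized by the standard deviation over all target words. The toy two-dimensional vectors, the function names, and the choice of the sample standard deviation are assumptions made for illustration; they are not taken from the paper and do not reproduce its embedding algorithms or data.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    return mean([cosine(w, a) for a in A]) - mean([cosine(w, b) for b in B])


def weat_effect_size(X, Y, A, B):
    """WEAT effect size for target sets X, Y and attribute sets A, B.

    Assumption: the sample standard deviation is used for normalization;
    implementations differ on this detail.
    """
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy embeddings: X leans towards A, Y leans towards B.
X = [[1.0, 0.0], [0.9, 0.1]]
Y = [[0.0, 1.0], [0.1, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]

effect = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size: {effect:.3f}")
```

A positive value indicates that X is more strongly associated with A than Y is; swapping X and Y flips the sign. In practice the vectors would come from an embedding model trained on the corpus under study, which is where the representation issues named in the abstract enter.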
Publishing Year
2022
Proceedings Title
Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
Conference
The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
Conference Location
Abu Dhabi
Conference Date
2022-12-07 – 2022-12-11
Cite this
Spliethöver M, Keiff M, Wachsmuth H. No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media. In: Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). Association for Computational Linguistics; 2022.
Spliethöver, M., Keiff, M., & Wachsmuth, H. (2022). No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media. Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi.
@inproceedings{Spliethöver_Keiff_Wachsmuth_2022, title={No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media}, booktitle={Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)}, publisher={Association for Computational Linguistics}, author={Spliethöver, Maximilian and Keiff, Maximilian and Wachsmuth, Henning}, year={2022} }
Spliethöver, Maximilian, Maximilian Keiff, and Henning Wachsmuth. “No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media.” In Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). Association for Computational Linguistics, 2022.
M. Spliethöver, M. Keiff, and H. Wachsmuth, “No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media,” presented at The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, 2022.
Spliethöver, Maximilian, et al. “No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media.” Proceedings of The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Association for Computational Linguistics, 2022.