Vandalism Detection in Wikidata

S. Heindorf, M. Potthast, B. Stein, G. Engels, in: Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), 2016, pp. 327--336.

Conference Paper | English
Author
Heindorf, Stefan; Potthast, Matthias; Stein, Benno; Engels, Gregor
Abstract
Wikidata is the new, large-scale knowledge base of the Wikimedia Foundation. Its knowledge is increasingly used within Wikipedia itself and various other kinds of information systems, imposing high demands on its integrity. Wikidata can be edited by anyone and, unfortunately, it frequently gets vandalized, exposing all information systems using it to the risk of spreading vandalized and falsified information. In this paper, we present a new machine learning-based approach to detect vandalism in Wikidata. We propose a set of 47 features that exploit both content and context information, and we report on 4 classifiers of increasing effectiveness tailored to this learning task. Our approach is evaluated on the recently published Wikidata Vandalism Corpus WDVC-2015 and it achieves an area under curve value of the receiver operating characteristic, ROC-AUC, of 0.991. It significantly outperforms the state of the art represented by the rule-based Wikidata Abuse Filter (0.865 ROC-AUC) and a prototypical vandalism detector recently introduced by Wikimedia within the Objective Revision Evaluation Service (0.859 ROC-AUC).
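The abstract reports its results as ROC-AUC values. As a reading aid, the following is a minimal sketch of how that metric can be computed from classifier scores, using the rank interpretation of ROC-AUC (the probability that a randomly chosen vandalism revision receives a higher score than a randomly chosen regular revision). The function name roc_auc and the toy labels and scores are assumptions made for illustration; they are not taken from the paper or from WDVC-2015.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a (hypothetical) vandalism revision, 0 for a regular one.
    scores: classifier scores, higher meaning "more likely vandalism".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [0, 0, 0, 0, 1, 0, 1, 0]
scores = [0.05, 0.10, 0.20, 0.30, 0.25, 0.02, 0.90, 0.15]
auc = roc_auc(labels, scores)
print(f"ROC-AUC on toy data: {auc:.3f}")
```

The pairwise loop is quadratic in the number of revisions, which is fine for a toy example; on a corpus-sized dataset a sort-based computation or an established library routine would be the usual choice.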
Publishing Year
2016
Proceedings Title
Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016)
Page
327--336
LibreCat-ID
137

Cite this

Heindorf S, Potthast M, Stein B, Engels G. Vandalism Detection in Wikidata. In: Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016). 2016:327--336. doi:10.1145/2983323.2983740
Heindorf, S., Potthast, M., Stein, B., & Engels, G. (2016). Vandalism Detection in Wikidata. Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), 327--336. https://doi.org/10.1145/2983323.2983740
@inproceedings{Heindorf_Potthast_Stein_Engels_2016, title={Vandalism Detection in Wikidata}, DOI={10.1145/2983323.2983740}, booktitle={Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016)}, author={Heindorf, Stefan and Potthast, Matthias and Stein, Benno and Engels, Gregor}, year={2016}, pages={327--336} }
Heindorf, Stefan, Matthias Potthast, Benno Stein, and Gregor Engels. “Vandalism Detection in Wikidata.” In Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), 327--336, 2016. https://doi.org/10.1145/2983323.2983740.
S. Heindorf, M. Potthast, B. Stein, and G. Engels, “Vandalism Detection in Wikidata,” in Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), 2016, pp. 327--336, doi: 10.1145/2983323.2983740.
Heindorf, Stefan, et al. “Vandalism Detection in Wikidata.” Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), 2016, pp. 327--336, doi:10.1145/2983323.2983740.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
File Name
137-p327-heindorf.pdf 1.84 MB
Access Level
Restricted Closed Access
Last Uploaded
2018-03-21T13:01:43Z

