---
res:
  bibo_abstract:
  - 'In recent years, there has been a surge in natural language processing research
    focused on low-resource languages (LrLs), underscoring the growing recognition
    that LrLs deserve the same attention as high-resource languages (HrLs). This shift
    is crucial for ensuring linguistic diversity and inclusivity in the digital age.
    Despite Indonesian ranking as the 11th most spoken language globally, it remains
    under-resourced in terms of computational tools and datasets. Within the semantic
    web domain, Entity Linking (EL) is pivotal, linking textual entity mentions to
    their corresponding entries in knowledge bases. This process is foundational for
    advanced information extraction tasks, including relation extraction and event
    detection. To bolster EL research in Indonesian, we introduce IndEL, the first
    benchmark dataset tailored for both general and specific domains. IndEL was manually
    curated using Wikidata, adhering to a rigorous set of annotation guidelines. We
    used two Named Entity Recognition (NER) benchmark datasets for entity extraction:
    NER UI for the general domain and IndQNER for the specific domain. IndQNER focuses
    on entities from the Indonesian translation of the Quran. IndEL comprises 4765
    entities in the general domain and 2453 in the specific domain. Using the GERBIL
    framework, we use IndEL to evaluate the performance of various EL systems, such
    as Babelfy, DBpedia Spotlight, MAG, OpenTapioca, and WAT. Our further investigation
    reveals that within Wikidata, a significant number of NIL entities remain unlinked
    due to the limited number of Indonesian labels and the use of acronyms. In the
    specific domain in particular, the transliteration and translation processes used
    to produce the Indonesian translation of the Quran yield entities that appear in
    descriptive form and as synonyms.@eng'
  bibo_authorlist:
  - foaf_Person:
      foaf_givenName: Ria Hari
      foaf_name: Gusmita, Ria Hari
      foaf_surname: Gusmita
      foaf_workInfoHomepage: http://www.librecat.org/personId=71039
  - foaf_Person:
      foaf_givenName: Muhammad Faruq Amiral
      foaf_name: Abshar, Muhammad Faruq Amiral
      foaf_surname: Abshar
  - foaf_Person:
      foaf_givenName: Diego
      foaf_name: Moussallem, Diego
      foaf_surname: Moussallem
      foaf_workInfoHomepage: http://www.librecat.org/personId=71635
  - foaf_Person:
      foaf_givenName: Axel-Cyrille
      foaf_name: Ngonga Ngomo, Axel-Cyrille
      foaf_surname: Ngonga Ngomo
      foaf_workInfoHomepage: http://www.librecat.org/personId=65716
  bibo_doi: 10.1007/978-3-031-70239-6_34
  dct_date: 2024^xs_gYear
  dct_isPartOf:
  - http://id.crossref.org/issn/0302-9743
  - http://id.crossref.org/issn/1611-3349
  - http://id.crossref.org/issn/9783031702389
  - http://id.crossref.org/issn/9783031702396
  dct_language: eng
  dct_publisher: Springer Nature Switzerland@
  dct_subject:
  - entity linking benchmark dataset
  - Indonesian
  - general and specific domains
  dct_title: 'IndEL: Indonesian Entity Linking Benchmark Dataset for General and Specific
    Domains@'
...
