---
_id: '60990'
abstract:
- lang: eng
  text: 'Large Language Models (LLMs) have demonstrated remarkable performance across
    a wide range of natural language processing tasks. However, their effectiveness
    in low-resource languages remains underexplored, particularly in complex tasks
    such as end-to-end Entity Linking (EL), which requires both mention detection
    and disambiguation against a knowledge base (KB). In earlier work, we introduced
    IndEL — the first end-to-end EL benchmark dataset for the Indonesian language
    — covering both a general domain (news) and a specific domain (religious text
    from the Indonesian translation of the Quran), and evaluated four traditional
    end-to-end EL systems on this dataset. In this study, we propose ELEVATE-ID, a
    comprehensive evaluation framework for assessing LLM performance on end-to-end
    EL in Indonesian. The framework evaluates LLMs under both zero-shot and fine-tuned
    conditions, using multilingual and Indonesian monolingual models, with Wikidata
    as the target KB. Our experiments include performance benchmarking, generalization
    analysis across domains, and systematic error analysis. Results show that GPT-4
    and GPT-3.5 achieve the highest accuracy in zero-shot and fine-tuned settings,
    respectively. However, even fine-tuned GPT-3.5 underperforms compared to DBpedia
    Spotlight — the weakest of the traditional model baselines — in the general domain.
    Interestingly, GPT-3.5 outperforms Babelfy in the specific domain. Generalization
    analysis indicates that fine-tuned GPT-3.5 adapts more effectively to cross-domain
    and mixed-domain scenarios. Error analysis uncovers persistent challenges that
    hinder LLM performance: difficulties with non-complete mentions, acronym disambiguation,
    and full-name recognition in formal contexts. These issues point to limitations
    in mention boundary detection and contextual grounding. Indonesian-pretrained
    LLMs, Komodo and Merak, reveal core weaknesses: template leakage and entity hallucination,
    respectively, underscoring architectural and training limitations in low-resource
    end-to-end EL. Code and dataset are available at https://github.com/dice-group/ELEVATE-ID.'
article_type: original
author:
- first_name: Ria Hari
  full_name: Gusmita, Ria Hari
  id: '71039'
  last_name: Gusmita
- first_name: Asep Fajar
  full_name: Firmansyah, Asep Fajar
  id: '76787'
  last_name: Firmansyah
- first_name: Hamada Mohamed Abdelsamee
  full_name: Zahera, Hamada Mohamed Abdelsamee
  id: '72768'
  last_name: Zahera
  orcid: 0000-0003-0215-1278
- first_name: Axel-Cyrille
  full_name: Ngonga Ngomo, Axel-Cyrille
  id: '65716'
  last_name: Ngonga Ngomo
citation:
  ama: 'Gusmita RH, Firmansyah AF, Zahera HMA, Ngonga Ngomo A-C. ELEVATE-ID: Extending
    Large Language Models for End-to-End Entity Linking Evaluation in Indonesian.
    <i>Data &#38; Knowledge Engineering</i>. 2026;161:102504. doi:<a href="https://doi.org/10.1016/j.datak.2025.102504">https://doi.org/10.1016/j.datak.2025.102504</a>'
  apa: 'Gusmita, R. H., Firmansyah, A. F., Zahera, H. M. A., &#38; Ngonga Ngomo, A.-C.
    (2026). ELEVATE-ID: Extending Large Language Models for End-to-End Entity Linking
    Evaluation in Indonesian. <i>Data &#38; Knowledge Engineering</i>, <i>161</i>,
    102504. <a href="https://doi.org/10.1016/j.datak.2025.102504">https://doi.org/10.1016/j.datak.2025.102504</a>'
  bibtex: '@article{Gusmita_Firmansyah_Zahera_Ngonga Ngomo_2026, title={ELEVATE-ID:
    Extending Large Language Models for End-to-End Entity Linking Evaluation in Indonesian},
    volume={161}, DOI={<a href="https://doi.org/10.1016/j.datak.2025.102504">https://doi.org/10.1016/j.datak.2025.102504</a>},
    journal={Data &#38; Knowledge Engineering}, author={Gusmita, Ria Hari and Firmansyah,
    Asep Fajar and Zahera, Hamada Mohamed Abdelsamee and Ngonga Ngomo, Axel-Cyrille},
    year={2026}, pages={102504} }'
  chicago: 'Gusmita, Ria Hari, Asep Fajar Firmansyah, Hamada Mohamed Abdelsamee Zahera,
    and Axel-Cyrille Ngonga Ngomo. “ELEVATE-ID: Extending Large Language Models for
    End-to-End Entity Linking Evaluation in Indonesian.” <i>Data &#38; Knowledge Engineering</i>
    161 (2026): 102504. <a href="https://doi.org/10.1016/j.datak.2025.102504">https://doi.org/10.1016/j.datak.2025.102504</a>.'
  ieee: 'R. H. Gusmita, A. F. Firmansyah, H. M. A. Zahera, and A.-C. Ngonga Ngomo,
    “ELEVATE-ID: Extending Large Language Models for End-to-End Entity Linking Evaluation
    in Indonesian,” <i>Data &#38; Knowledge Engineering</i>, vol. 161, p. 102504,
    2026, doi: <a href="https://doi.org/10.1016/j.datak.2025.102504">https://doi.org/10.1016/j.datak.2025.102504</a>.'
  mla: 'Gusmita, Ria Hari, et al. “ELEVATE-ID: Extending Large Language Models for
    End-to-End Entity Linking Evaluation in Indonesian.” <i>Data &#38; Knowledge Engineering</i>,
    vol. 161, 2026, p. 102504, doi:<a href="https://doi.org/10.1016/j.datak.2025.102504">https://doi.org/10.1016/j.datak.2025.102504</a>.'
  short: R.H. Gusmita, A.F. Firmansyah, H.M.A. Zahera, A.-C. Ngonga Ngomo, Data &#38;
    Knowledge Engineering 161 (2026) 102504.
date_created: 2025-08-24T11:38:51Z
date_updated: 2025-08-25T09:40:13Z
department:
- _id: '574'
doi: https://doi.org/10.1016/j.datak.2025.102504
intvolume: '       161'
keyword:
- LLMs
- Evaluation
- End-to-end EL
- Indonesian
language:
- iso: eng
main_file_link:
- url: https://www.sciencedirect.com/science/article/pii/S0169023X25000990?utm_campaign=STMJ_220042_AUTH_SERV_PA&utm_medium=email&utm_acid=78351008&SIS_ID=&dgcid=STMJ_220042_AUTH_SERV_PA&CMX_ID=&utm_in=DM591673&utm_source=AC_
page: '102504'
publication: Data & Knowledge Engineering
publication_identifier:
  issn:
  - 0169-023X
status: public
title: 'ELEVATE-ID: Extending Large Language Models for End-to-End Entity Linking
  Evaluation in Indonesian'
type: journal_article
user_id: '71039'
volume: 161
year: '2026'
...
