An evaluation framework for cross-lingual link discovery

Tang, Ling-Xiang, Geva, Shlomo, Trotman, Andrew, Xu, Yue, & Itakura, Kelly (2013) An evaluation framework for cross-lingual link discovery. Information Processing & Management, 50(1), pp. 1-23.


Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or if the topical coverage differs between languages, as is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR, where the CrossLink task has been running since NTCIR-9 in 2011. This paper presents the evaluation framework used to benchmark cross-lingual link discovery algorithms in the context of NTCIR-9.

This framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are divided into two sets: manual assessments performed by human assessors, and automatic assessments based on links extracted from Wikipedia itself. Using this framework we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.
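As a rough illustration of what scoring one run against the two assessment sets might look like, the sketch below computes precision-at-N and average precision for a ranked list of proposed link targets. The abstract does not name the metrics the framework uses, so these two measures are assumptions chosen for illustration; the function names and the sample document identifiers are likewise hypothetical and are not taken from the paper or its toolkit.

```python
from typing import List, Set


def precision_at_n(ranked: List[str], relevant: Set[str], n: int) -> float:
    """Fraction of the top-n proposed targets that are judged relevant."""
    if n <= 0:
        return 0.0
    return sum(1 for target in ranked[:n] if target in relevant) / n


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant target occurs.

    Assumes the ranked list contains no duplicate targets.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, target in enumerate(ranked, start=1):
        if target in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical data: one topic's ranked link targets and two assessment sets.
run = ["zh:A", "zh:B", "zh:C", "zh:D"]
automatic = {"zh:A", "zh:D"}        # e.g. links already present in Wikipedia
manual = {"zh:A", "zh:B", "zh:D"}   # e.g. links accepted by human assessors

for name, assessments in (("automatic", automatic), ("manual", manual)):
    p2 = precision_at_n(run, assessments, 2)
    ap = average_precision(run, assessments)
    print(f"{name}: P@2={p2:.2f} AP={ap:.2f}")
```

In this made-up example the same run scores differently under each assessment set because the manual set accepts a target that the automatic set lacks. It only shows how the two sets plug into the same scoring code and says nothing about the paper's actual results.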

ID Code: 61914
Item Type: Journal Article
Refereed: Yes
Keywords: Assessment, Cross-lingual link discovery, Evaluation framework, Evaluation metrics, Validation, Wikipedia
DOI: 10.1016/j.ipm.2013.07.003
ISSN: 0306-4573
Divisions: Current > Schools > School of Electrical Engineering & Computer Science; Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2013 Elsevier Ltd.
Copyright Statement: This is the author’s version of a work that was accepted for publication in Information Processing & Management. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information Processing & Management, [VOL 50, ISSUE 1, (2013)] DOI: 10.1016/j.ipm.2013.07.003
Deposited On: 19 Aug 2013 23:47
Last Modified: 14 Jul 2015 13:32
