Organizers
Eneko Agirre (e.agirre@ehu.es)
Bernardo Magnini (magnini@itc.it)
German Rigau (g.rigau@si.ehu.es)
Piek Vossen (Piek.Vossen@irion.nl)
The community has long noted the need to evaluate WSD in an application, in order to determine which WSD strategy works best and, more importantly, to try to show that WSD can make a difference in applications. The use of WSD in Machine Translation has been the subject of some recent papers, but less attention has been paid to Information Retrieval (IR).
With this proposal we want to make a first attempt at defining a task where WSD is evaluated with respect to an Information Retrieval and Cross-Lingual Information Retrieval (CLIR) exercise. From the WSD perspective, this task will evaluate all-words WSD systems indirectly on a real task. From the CLIR perspective, it will evaluate which WSD systems and strategies work best.
We are conscious that the number of possible configurations for such an exercise is very large (including the choice of sense inventory, the use of word sense induction instead of disambiguation, query expansion, WSD strategies, IR strategies, etc.), so we have focused this first edition as follows:
1) the IR/CLIR system is fixed;
2) the expansion/translation strategy is fixed;
3) the participants choose the best WSD strategy;
4) the IR system is used as the upper bound for the CLIR systems.
We think that it is important to start doing this kind of application-driven evaluation, which might shed light on the intricacies of the interaction between WSD and IR strategies. We see this as the first of a series of exercises, and one outcome of this task should be that the WSD and CLIR communities discuss future evaluation possibilities together.
In fact, CLEF-2008 will have a special track with the same data, where CLIR systems will have the opportunity to use the annotated data produced as a result of the SemEval-2007 task.
The results of the task will be presented both at the SemEval workshop (ACL 2007) and at the CLEF conference.
Description of the task
This is an application-driven task, where the application is a fixed cross-lingual information retrieval system. Participants disambiguate text by assigning WordNet synsets; the system then expands the text to other languages, indexes the expanded documents, and runs the retrieval for all languages in batch. The retrieval results are taken as a measure of the fitness of the disambiguation. The modules and rules for the expansion and the retrieval are exactly the same for all participants.
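To make this division of labour concrete, here is a toy end-to-end sketch in Python. All data and module names are invented stand-ins, not the actual task software; only the disambiguation step varies per participant, and a wrong sense choice changes what a Spanish query retrieves.

    # Toy sketch of the fixed pipeline; every name below is a hypothetical
    # stand-in. Only disambiguate() is the participant-supplied component.
    from collections import Counter

    LEXICON = {"bank.n.01": ["bank", "orilla"],   # sloping land (river bank)
               "bank.n.02": ["bank", "banco"]}    # financial institution

    def disambiguate(text):
        """Participant step: attach a synset id to each token (naive stub)."""
        sense = "bank.n.02" if "loan" in text else "bank.n.01"
        return [(tok, sense if tok == "bank" else None)
                for tok in text.lower().split()]

    def expand(tagged_tokens):
        """Fixed step: replace sense-tagged tokens by synonyms and translations."""
        out = []
        for tok, synset in tagged_tokens:
            out.extend(LEXICON.get(synset, [tok]))
        return out

    def retrieve(index, query_tokens):
        """Fixed step: rank documents by term overlap (stand-in for the engine)."""
        scores = {doc: sum(bag[t] for t in query_tokens) for doc, bag in index.items()}
        return sorted(scores, key=scores.get, reverse=True)

    docs = {"d1": "the bank of the river", "d2": "the bank approved the loan"}
    index = {doc: Counter(expand(disambiguate(text))) for doc, text in docs.items()}
    print(retrieve(index, ["banco"]))  # the Spanish query finds d2 via expansion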
We propose two specific tasks:
1. Participants disambiguate the corpus; the corpus is expanded to synonyms and translations, and we measure the effect on cross-lingual retrieval. Queries are not processed.
2. Participants disambiguate the queries for each language; we expand the queries to synonyms and translations, and we measure the effect on cross-lingual retrieval. Documents are not processed.
The corpora and queries will be obtained from the ad-hoc CLEF tasks. The scores can be compared not only among the SemEval participants but also with those of past CLEF participants.
Supported languages for queries: English, Spanish
Supported languages in documents: English
The English CLEF data from the years 2000-2003 covers 250 topics and 169,477 documents (579 MB). The relevance judgments will be taken from CLEF. These have the disadvantage of having been produced by pooling the results of CLEF participants, and might bias the results towards systems not using WSD, especially for monolingual English retrieval. A post-hoc analysis of the participants' results will examine this effect.
Expansion and retrieval
The retrieval engine will be an adaptation of the TwentyOne search system, which was developed during the 1990s by the TNO research institute in Delft (The Netherlands). It is now further developed by Irion Technologies as a cross-lingual retrieval system (Hiemstra & Kraaij, 1999; Vossen et al., 2006). It can be stripped down to a basic vector-space retrieval system.
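For readers unfamiliar with vector-space retrieval, here is a minimal, generic baseline of the kind the engine can be stripped down to, sketched with scikit-learn. This is an illustration only, not the TwentyOne/Irion system.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["the bank approved the loan",
            "they walked along the river bank"]
    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(docs)           # one TF-IDF vector per document

    query_vec = vectorizer.transform(["bank loan"])       # queries live in the same space
    scores = cosine_similarity(query_vec, doc_matrix)[0]  # cosine against every document
    print(scores.argsort()[::-1])                         # document ids, best match first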
For expansion and translation we will use the publicly available Multilingual Central Repository (MCR, Atserias et al. 2004) from the MEANING project. The MCR follows the EuroWordNet design, and currently includes English, Spanish, Italian, Basque and Catalan wordnets tightly connected through the Interlingual Index (based on WordNet 1.6, but linked to all other WordNet versions).
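The mechanism is that a synset identifier keys both same-language synonyms and translations. The MCR itself is not distributed with NLTK, and the task uses WordNet 1.6 sense tags, but the same idea can be sketched with NLTK's WordNet 3.0 plus the Open Multilingual WordNet:

    import nltk
    from nltk.corpus import wordnet as wn

    nltk.download("wordnet", quiet=True)
    nltk.download("omw-1.4", quiet=True)   # multilingual lemmas keyed by synset

    synset = wn.synset("bank.n.02")        # the financial-institution sense
    print(synset.lemma_names())            # English synonyms for expansion
    print(synset.lemma_names("spa"))       # Spanish translations via the same synset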
The expansion of the returned synsets to synonyms and translations, the indexing of the expanded documents, the retrieval runs, and the scoring will all be carried out by the organizers.
Instructions for participation
Please note the following points in order to participate:
- By participating in SemEval-2007, participants grant permission for future CLEF-2008 participants to use their automatically annotated data for research purposes.
- The results returned by participants need to conform to the DTDs provided by the organizers; otherwise we cannot guarantee that we will be able to score them. Software to validate the results is provided on the task website.
- Given the heavy workload of expanding and scoring the submissions, the deadline for uploading results is tighter than for other SemEval tasks (19 March).
- Given the amount of text to be tagged, participants have two weeks to submit results, counted from the time they download the test data.
General design
The participants will be provided with (see Full Description for more details):
1. the document collections (.nam+id files)
2. the topics (.nam+id files)
The participants need to return a single compressed file with the input files enriched with WordNet 1.6 sense tags:
1. for all the documents in the collection (.wsd files)
2. for all the topics (.wsd files)
All files are in XML. Input documents and topics (.nam+id files) follow the "docs.nam+id.dtd" and "topics.nam+id.dtd" DTDs, respectively. Output documents and topics (.wsd files) follow the "docs.wsd.dtd" and "topics.wsd.dtd" DTDs, respectively.
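Besides the validation software provided on the task website, a quick local check of an output file against the corresponding DTD is straightforward, for example with Python's lxml (the file paths below are hypothetical):

    from lxml import etree

    dtd = etree.DTD(open("docs.wsd.dtd", "rb"))     # DTD supplied by the organizers
    tree = etree.parse("output/docs/example.wsd")   # hypothetical participant output

    if dtd.validate(tree.getroot()):
        print("file conforms to docs.wsd.dtd")
    else:
        print(dtd.error_log.filter_from_errors())   # report the offending elements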
The result files must be organized in a directory structure that mirrors the directory structure of the input.
See the trial data release for further details and examples of files.
Note that all senses returned by participants will be expanded, regardless of their weight. The current CLIR system does not use the weight information.
Additional data
We will also provide some of the widely used WSD features in a word-to-word fashion (Agirre et al., 2006) in order to make participation easier. These features will be available for both topics and documents (the test data), as well as for all the words with frequency above 10 in SemCor 1.6 (which can be taken as the training data for supervised WSD systems). They will be posted on the task web page soon.
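On the training side, SemCor is also accessible programmatically. NLTK distributes a version tagged with WordNet 3.0 senses rather than the 1.6 tags used here, so a sense mapping would be needed, but the following sketch shows what the annotations look like:

    import nltk
    from nltk.corpus import semcor

    nltk.download("semcor", quiet=True)
    nltk.download("wordnet", quiet=True)     # needed to resolve the sense labels

    sentence = semcor.tagged_sents(tag="sem")[0]
    for chunk in sentence:
        if hasattr(chunk, "label"):          # sense-annotated chunks are Trees
            print(chunk.label(), chunk.leaves())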
Evaluation
For each submitted run, the organizers will expand the returned sense tags to all synonyms and translations in the target languages, index the expanded documents, and run the retrieval. The participant systems will then be scored according to standard IR/CLIR measures as implemented in the TREC evaluation package.
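The central such measure is mean average precision (MAP). The following minimal sketch (not the official TREC evaluation code) shows how it is computed from a ranked run and a set of relevance judgments:

    def average_precision(ranking, relevant):
        """AP for one topic: mean precision at the rank of each relevant hit."""
        hits, precisions = 0, []
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in relevant:
                hits += 1
                precisions.append(hits / rank)
        return sum(precisions) / len(relevant) if relevant else 0.0

    def mean_average_precision(run, qrels):
        """run: topic -> ranked doc ids; qrels: topic -> set of relevant doc ids."""
        return sum(average_precision(run[t], qrels.get(t, set())) for t in run) / len(run)

    print(mean_average_precision({"T1": ["d2", "d1", "d3"]}, {"T1": {"d2", "d3"}}))
    # -> 0.8333...: precision 1/1 at rank 1 and 2/3 at rank 3, averaged over 2 relevant docs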
Schedule
Jan. 3 Trial data available
Jan. 15 (approx.) End user agreement (EUA) available on the website
Feb. 26 Test data available for download (signed EUA required)
Mar. 26 Deadline for uploading results
June SemEval workshop
September CLEF conference
References
Agirre, E., O. Lopez de Lacalle Lekuona, and D. Martinez. "Exploring feature set combinations for WSD". In Proceedings of the Annual Meeting of the SEPLN, Spain, 2006.
Atserias, J., L. Villarejo, G. Rigau, E. Agirre, J. Carroll, B. Magnini, and P. Vossen. "The MEANING Multilingual Central Repository". In Proceedings of the Second International WordNet Conference (GWC 2004), pp. 23-30, Brno, Czech Republic, January 2004. ISBN 80-210-3302-9.
Hiemstra, D. and W. Kraaij. "Twenty-One at TREC-7: Ad Hoc and Cross Language Track". In E. M. Voorhees and D. K. Harman, editors, The Seventh Text REtrieval Conference (TREC-7), NIST Special Publication 500-242, 1999.
Vossen, P., G. Rigau, I. Alegria, E. Agirre, D. Farwell, and M. Fuentes. "Meaningful results for Information Retrieval in the MEANING project". In Proceedings of the Third Global WordNet Conference, Jeju Island, South Korea, January 22-26, 2006.