- Q: Are verbs and other PoS among the target words?
A: Yes, verbs, adjectives and adverbs are included (see the full description at http://nlp.cs.swarthmore.edu/semeval/tasks/task10/description.shtml).
There may be minor modifications to the description as we nail down a few measures.
- Q: Will each target word have at least one substitute?
A: Yes, all words will have at least one substitute. We are going to weed out
cases which are clearly problematic.
- Q: Is it true that the word types (lexelt items) included in the trial
data (e.g. bright, film, take, etc.) are not necessarily the words that you
will include in the test data?
A: You are correct that the words in the trial dataset are not those that
will be in the test set.
- Q: Will there be any training data? If so, will those lexelt items be
representative of the test data, or will they be similar to the trial data
(in that you make no claims about those words appearing in the test data)?
A: There is no training data (from the full description: "For this reason we will
not provide training data since this would mean we would need to specify
potential substitutes in advance."). There is no guarantee as to how the test
words will behave or what the synonyms will be. We are not certain yet how
much data we can get annotated in time. We should have 1,500 test sentences
at minimum (but hopefully more). Each word will have 10 sentences (one or two
sentences may be taken out if they are problematic in some way). Some words
are selected manually and some randomly from lists of potential candidates.
In the test data, approximately 20 words for each PoS will have their sentences
selected manually (rather than by the random process). This may mean that the
sense distribution in those sentences is not representative of the corpus as
a whole. As mentioned in our description, we will provide a breakdown of
scores for these two sentence sampling approaches.
- Q: I don't understand why score.pl uses 298 as the total number of items in the trial data when there are 300?
A: This is because it only uses items with 2 or more non-NIL, non-proper-name responses from the annotators (see task10documentation.pdf).
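For illustration, the filtering amounts to something like the sketch below (Python; the response lists and the proper-name check are invented for this example, and score.pl together with task10documentation.pdf remains authoritative):

    # Sketch only: an item is scorable if at least two annotator responses
    # remain after discarding NILs and proper names.  The capitalisation
    # test is a stand-in heuristic, not how score.pl detects proper names.
    def is_scorable(responses):
        kept = [r for r in responses if r != "NIL" and not r[:1].isupper()]
        return len(kept) >= 2

    # Invented responses for three hypothetical items:
    items = {
        "bright.a 1": ["colourful", "vivid", "brilliant"],
        "film.n 7":   ["NIL", "movie"],       # only one usable response
        "take.v 12":  ["Hoover", "NIL"],      # proper name plus NIL
    }
    print(sum(is_scorable(r) for r in items.values()))   # -> 1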
- Q: We noticed that some suggested synonyms in the trial data were spelled in British English (e.g. bright.a #3 lists 'colourful' as the first suggested replacement). Will spelling differences be accounted for in the scoring, or should we make every effort to report our answers with British spellings?
A: The annotators are all British, so I would advise that you provide
substitutes with British spellings. (We do mention on the task description
web page that all subjects are living in the UK.) Our subjects are free to use
American spellings, though, and we suspect that will happen some of the time.
We do not want to promise to allow for this in the scoring in case we add
rules which inadvertently cause errors.
- Q: I have a question about the examples you provided. The word to be
substituted may not be in its base form, for example "<head>takes</head>".
If we find "last" as a correct substitute for "take" in the given sentence,
do we have to change "last" to "lasts"? In other words, if we output "last",
will it be judged as correct?
A: We are expecting substitutes in lemmatised form, i.e. "last" rather than
"lasts". This is stated in our documentation (see task10documentation.pdf) and
is hopefully evident from the trial gold standard. The multiword
identification and detection subtask is perhaps more complicated: in some
cases it isn't obvious that the lemmatised form is the canonical one, and
in those cases we take the response from the annotators as is.
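For illustration, here is a minimal sketch of lemmatising substitutes before output, assuming NLTK and its WordNet data are installed (this is our own example, not part of the task software):

    # Sketch only: normalise candidate substitutes to their lemmatised
    # form (e.g. "lasts" -> "last") before writing an answer file.  NLTK
    # is an assumption; any lemmatiser returning the base form would do.
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()

    def lemmatise_substitute(word, pos):
        # The task's PoS letters (n, v, a, r) coincide with WordNet's
        # tags for noun, verb, adjective and adverb.
        return lemmatizer.lemmatize(word, pos=pos)

    print(lemmatise_substitute("lasts", "v"))   # -> "last"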
- Q: If we participate in the Best or OOT method, are we expected to come up with the multi-word synonyms?
Using your scorer (distributed with the trial data) it seems like we
are penalised for not guessing these, even though MW is evaluated as a
separate task. It would be useful (to us, at least) if there was a scoring method
that only judged us on our ability to guess single-word replacements.
A: The MW task is identification of "multiwords" in the original sentence
rather than scoring multiword substitutes. In response to your request, we
have given the scorer for the test run an option to score only the single-word
substitutes in the gold standard. This will work on the subset of the data that
has 2 or more single-word (non-proper-name) responses from the annotators.
We will provide this breakdown to participants, as well as the scores on the
full set of substitutes.
- Q: Is it true that in the "best" scoring a precision of 100% is not
always possible?
For instance, in the example given in Section 4, here is what the
various possible submissions would get for item 9999, if my
understanding is correct:
glad => 3/7
merry => 2/7
cheerful => 1/7
glad;merry;cheerful;jovial => (7/4)/7 = 1/4
A: Yes, that is right. The reason is that there is more uncertainty about the
correct answer for items with more variation. To get maximum credit you are
best off guessing the mode and giving only one answer. We want to favour
systems which provide the best answer, and we put more weight on items where
there is more agreement. For oot, 100% is possible provided |Hi| does not
exceed 10. (A worked sketch of this calculation is given after this exchange.)
Q: So is there a simple reason why the maximum score in the best evaluation
can only be achieved by giving a single answer?
A: The system should be trying to find the best substitute and not hedging
its bets. If a system really thought several were equally good then it
should provide these as best; this would be reflected by equal choices
from the annotators. The system needs to guess the favourite from the
annotators. The idea of scoring against all of the annotators' responses
is that there will be variation. It is not a black and white situation, and
we want to emphasise test items with better agreement and less variation.
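To make the arithmetic in the worked example above concrete, here is a minimal Python sketch (ours, not part of score.pl) of the per-item best credit, together with the oot credit as we understand it from the documentation; the gold frequencies are those of item 9999:

    # Sketch only: reproduces the per-item "best" credit from the worked
    # example above.  Function names are ours; score.pl is authoritative.

    def best_credit(guesses, gold_freqs):
        # Sum the annotator frequencies of the guesses, then divide by the
        # number of guesses and by |Hi| (total annotator responses).
        h_i = sum(gold_freqs.values())                    # |Hi| = 7 here
        found = sum(gold_freqs.get(g, 0) for g in guesses)
        return found / (len(guesses) * h_i)

    def oot_credit(guesses, gold_freqs):
        # Out-of-ten credit as we understand the documentation: up to ten
        # guesses allowed, with no division by the number of guesses.
        h_i = sum(gold_freqs.values())
        return sum(gold_freqs.get(g, 0) for g in guesses) / h_i

    gold = {"glad": 3, "merry": 2, "cheerful": 1, "jovial": 1}
    print(best_credit(["glad"], gold))                                 # 3/7
    print(best_credit(["merry"], gold))                                # 2/7
    print(best_credit(["glad", "merry", "cheerful", "jovial"], gold))  # 1/4
    print(oot_credit(["glad", "merry", "cheerful", "jovial"], gold))   # 1.0

Note that listing all four substitutes as best scores 1/4, whereas giving only the mode ("glad") scores 3/7, which is the point made above about not hedging.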