This page contains pointers to resources I have produced or helped to produce. Do send me an email if there is a resource that you need which is missing.
- Multiwords
- Data for the paper:
- Reddy, Siva, Diana McCarthy and Suresh Manandhar (2011) An Empirical Study on Compositionality in Compound Nouns. In Proceedings of the International Joint Conference on Natural Language Processing 2011 (IJCNLP-2011), Thailand. All data and guidelines are available here. Credit to Siva Reddy!
- Data for the paper:
- Diana McCarthy, Sriram Venkatapathy and Aravind K. Joshi (2007) Detecting Compositionality of Verb-Object Combinations using Selectional Preferences. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2007) pp 369-379. Data is available here.
Note that this is the 638-item subset of the verb-object data (i.e. excluding objects that are not common nouns) of the original data produced for
Sriram Venkatapathy and Aravind K. Joshi (2005) Relative compositionality of multi-word expressions: a study of Verb-Noun (V-N) collocations. In Proceedings of International Joint Conference on Natural Language Processing - 2005, Jeju Island, Korea.
and released here with kind permission from Sriram Venkatapathy, along with the annotator guidelines that they used.
- Data for the paper:
- Diana McCarthy, Bill Keller, and John Carroll (2003) Detecting a Continuum of Compositionality in Phrasal Verbs. In Proceedings of the ACL-SIGLEX Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan. Gold Standard Data available here
- Word senses
- Data for:
- Katrin Erk, Diana McCarthy and Nick Gaylord (2013) Measuring Word Meaning in Context. Computational Linguistics 39 (3) pp 511-554 DOI 10.1162/COLI_a_00142. Data available here
- Resources and Data for:
- Siva Reddy, Abilash Inumella, Diana McCarthy and Mark Stevenson (2010) IIITH: Domain Specific Word Sense Disambiguation. In Proceedings of SemEval-2010: 5th International Workshop on Semantic Evaluations ACL 2010, Uppsala, Sweden. Download here Credit to Siva Reddy and Abilash Inumella!
- Data for:
- Katrin Erk, Diana McCarthy and Nick Gaylord (2009) Investigations on Word Senses and Word Usages. In Proceedings of the Joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP), Singapore. Data and Annotator Instructions are available here
- Gold Standard Data for:
- Diana McCarthy (2006) Relating WordNet senses for word sense disambiguation. In Proceedings of the ACL Workshop on Making Sense of Sense: Bringing Psycholinguistics and Computational Linguistics Together, Trento, Italy pp 17-24. Gold Standard Data available here
- Gold Standard Data for the paper:
- Rob Koeling, Diana McCarthy, and John Carroll (2005) Domain-Specific Sense Distributions and Predominant Sense Acquisition. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. HLT/EMNLP 2005 pp 419-426. Gold Standard Data available here. Credit to Rob Koeling
- Gold Standard Data for the paper:
- Andrew Bennett, Timothy Baldwin, Jey Han Lau, Diana McCarthy and Francis Bond (2016) LexSemTM: A Semantic Dataset Based on All-words Unsupervised Sense Distribution Learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), Berlin, Germany. PDF available, and code and data available here. Credit to Andrew Bennett
- Lexical Substitution
- Data for the invited paper:
- Diana McCarthy (2011) Measuring similarity of word meaning in context with lexical substitutes and translations. In Computational Linguistics and Intelligent Text Processing 12th International Conference, CICLing 2011, Tokyo, Japan, February 20-26, 2011. Proceedings, Part I. Lecture Notes in Computer Science Volume 6608, 2011, DOI: 10.1007/978-3-642-19400-9 pp 238-252. Slides here, and data and R code download here
- Cross Lingual Substitution Data:
- Rada Mihalcea, Ravi Sinha and Diana McCarthy (2010) SemEval-2010 Task 2: Cross-Lingual Lexical Substitution In Proceedings of SemEval-2010: 5th International Workshop on Semantic Evaluations ACL 2010, Uppsala, Sweden. Data available to download here
- All Resources for the LEXSUB task:
- Diana McCarthy and Roberto Navigli (2007) SemEval-2007 Task 10: English Lexical Substitution Task. In Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic pp 48-53. All resources available to download here