Lexical substitution, i.e. generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word sense induction and disambiguation, lexical relation extraction, data augmentation, etc. In this paper, we present a large-scale comparative study of lexical substitution methods employing both rather old and most recent language and masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, RoBERTa, XLNet. We show that already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly. Several existing and new target word injection methods are compared for each LM/MLM using both intrinsic evaluation on lexical substitution datasets and extrinsic evaluation on word sense induction (WSI) datasets. On two WSI datasets we obtain new SOTA results. Besides, we analyze the types of semantic relations between target words and their substitutes generated by different models or given by annotators.

Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

POSTER5: Semantics 1

poster

COLING, the International Conference on Computational Linguistics, is one of the premier conferences for natural language processing and computational linguistics. Often grouped within the field of artificial intelligence, but actually pre-dating the development of artificial intelligence, advances in computational linguistics and natural language processing are now some of the major drivers behind the use of artificial intelligence for commercial and social applications – for example, on-line search, machine translation and with voice-assisted conversational devices.

First established in 1965, the biennial COLING conference is held in diverse parts of the globe and attracts participants from both top-ranked research centers and emerging countries. Today, the most important developments in our field are taking place not only in universities and academic research institutes, but also in industrial research departments and in technological startups. COLING conferences provide opportunities for all these communities to showcase their exciting developments.

COLING 2020

COLING, the International Conference on Computational Linguistics, is one of the premier conferences for natural language processing and computational linguistics. Often grouped within the field of artificial intelligence, but actually pre-dating the development of artificial intelligence, advances in computational linguistics and natural language processing are now some of the major drivers behind the use of artificial intelligence for commercial and social applications – for example, on-line search, machine translation and with voice-assisted conversational devices.

Boris Sheludko

1

SHORT BIO

OTHER AFFILIATIONS

Presentations

Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

Stay up to date with the latest Underline news!

PRESENTATIONS

CONFERENCES

COMPANY

RESOURCES