We propose neural models that normalize text by considering the similarity of word strings and of sounds. We experimentally compared a model that considers both similarities, models that consider only string similarity or only sound similarity, and a baseline model that uses neither. The results showed that leveraging string similarity helped with misspellings and abbreviations, while taking sound similarity into account helped with phonetic substitutions and emphasized characters. As a result, the proposed models achieved higher F1 scores than the baseline.
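To make the two kinds of similarity concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not say how the similarities are computed or fed to the neural models. The sketch assumes a character-overlap ratio from the standard library's `difflib` for string similarity, and a hypothetical, deliberately tiny `SOUND_MAP` plus repeat-collapsing for sound similarity. All function names are illustrative.

```python
from difflib import SequenceMatcher

# Hypothetical sound substitutions, for illustration only.
# A real system would use a proper grapheme-to-phoneme resource.
SOUND_MAP = {"8": "eit", "2": "tu", "4": "for", "z": "s"}


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sound_key(word: str) -> str:
    """Rough sound key: collapse repeated characters, then apply SOUND_MAP."""
    collapsed = []
    for ch in word.lower():
        # Emphasized characters ("soooo") shrink to a single occurrence.
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(SOUND_MAP.get(ch, ch) for ch in collapsed)


def sound_similarity(a: str, b: str) -> float:
    """Similarity of the two words' sound keys, in [0, 1]."""
    return string_similarity(sound_key(a), sound_key(b))


if __name__ == "__main__":
    pairs = [("tomorow", "tomorrow"), ("gr8", "great"), ("soooo", "so")]
    for noisy, standard in pairs:
        print(noisy, standard,
              round(string_similarity(noisy, standard), 2),
              round(sound_similarity(noisy, standard), 2))
```

Under these assumptions, a misspelling such as "tomorow" already scores high on string similarity alone, while a phonetic substitution such as "gr8" or an emphasized form such as "soooo" only comes close to its standard form once the sound key is applied. This mirrors the division of labor the abstract reports.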

Neural text normalization leveraging similarities of strings and sounds

POSTER8: Applications: grammar correction, support for language and script writing

poster

COLING, the International Conference on Computational Linguistics, is one of the premier conferences for natural language processing and computational linguistics. Computational linguistics is often grouped within the field of artificial intelligence but actually predates it. Advances in computational linguistics and natural language processing are now among the major drivers behind the use of artificial intelligence for commercial and social applications, for example online search, machine translation, and voice-assisted conversational devices.

First established in 1965, the biennial COLING conference is held in diverse parts of the globe and attracts participants from both top-ranked research centers and emerging countries. Today, the most important developments in our field are taking place not only in universities and academic research institutes, but also in industrial research departments and in technological startups. COLING conferences provide opportunities for all these communities to showcase their exciting developments.

COLING 2020

Riku Kawamura

