Abstract: Idiomatic phrases are natural components of all languages whose meaning cannot be understood directly from the words that compose them. Vector representations are a key method for bridging human understanding of language and that of machines, and they underpin solutions to many NLP problems. Machine learning and deep learning techniques do not operate on raw text directly, so natural language processing applications built on them require vector (numeric) representations of idiomatic expressions. This research therefore proposes a vector representation of Amharic idioms for NLP applications using vector representation models. NLP researchers use this format as the input to machine learning and deep learning techniques for classification or regression, so NLP research on Amharic idioms first requires a vector or numeric representation produced with suitable methods. As a dataset, we used five hundred idiomatic expressions, each composed of two words, from the Amharic Idioms book. To evaluate performance, we employed the accuracy, precision, recall, and F-score metrics. On this dataset, the approach achieved 95.5% accuracy.
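For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they are commonly computed. The abstract does not state the classification task, the label scheme, or how the scores were averaged, so the binary 0/1 labels, the function name `binary_metrics`, and the sample data here are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F-score for 0/1 labels.

    Hypothetical helper for illustration; label 1 is treated as the
    positive class.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Made-up labels, purely to show the call; not data from the study.
    gold = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(gold, predicted))
```

Accuracy alone can be misleading when classes are imbalanced, which is one reason precision, recall, and F-score are usually reported alongside it.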