Building Language Models with Fuzzy Weights
Last modified: 2017-05-13
Abstract
Word2Vec is a recently developed tool for building neural network language models. This work proposes an improvement to Word2Vec that adds fuzzy weights based on word distances within the context, exploiting more information than the original linear bag-of-words structure. In Word2Vec, all context words receive the same weight regardless of their distance from the center word. We argue that word distances in the context carry semantic information that can be exploited to reinforce the network's connections more effectively. To formalize the influence of different distances in the context, we adopt Gaussian functions to represent fuzzy weights that take part in training the connections of the network. Various experiments show that the proposed improvement produces better language models than Word2Vec.
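To make the idea concrete, the sketch below (not taken from the paper) shows one way Gaussian fuzzy weights could be attached to skip-gram training pairs: each (center, context) pair receives a weight that decays with word distance and could then scale that pair's contribution during training. The window size, the spread parameter sigma, and the function names are illustrative assumptions; the paper's exact parameterization may differ.

    import math

    def gaussian_fuzzy_weight(distance, sigma=2.0):
        """Fuzzy weight for a context word at the given distance from the
        center word. Closer words get weights near 1; distant words get
        smaller weights. sigma is an assumed spread parameter."""
        return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))

    def weighted_skipgram_pairs(tokens, window=5, sigma=2.0):
        """Yield (center, context, weight) triples for a token sequence.
        Plain skip-gram treats every pair in the window equally; here each
        pair carries a Gaussian weight based on word distance, which could
        scale the pair's gradient contribution during training."""
        for i, center in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                yield center, tokens[j], gaussian_fuzzy_weight(abs(j - i), sigma)

    if __name__ == "__main__":
        sentence = "the quick brown fox jumps over the lazy dog".split()
        for center, context, w in weighted_skipgram_pairs(sentence, window=2):
            print(f"{center:>6} -> {context:<6} weight={w:.3f}")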
Keywords
word2vec; skip-gram; fuzzy weights; neural network language model; word embedding; natural language processing (NLP)