Publications
Conference (International) Summarization Based on Embedding Distributions
Hayato Kobayashi, Masaki Noguchi, Taichi Yatsuka
the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015)
2015.9.21
In this study, we consider a summarization method using the document level similarity based on embeddings, or distributed representations of words, where we assume that an embedding of each word can represent its “meaning.” We formalize our task as the problem of maximizing a submodular function defined by the negative summation of the nearest neighbors’ distances on embedding distributions, each of which represents a set of word embeddings in a document. We proved the submodularity of our objective function and that our problem is asymptotically related to the KL-divergence between the probability density functions that correspond to a document and its summary in a continuous space. An experiment using a real dataset demonstrated that our method performed better than the existing method based on sentence-level similarity.
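
The objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain Euclidean distance, the fixed penalty for an empty summary, and the sentence-count budget are all assumptions made for illustration, and the abstract does not specify any of them. The sketch scores a candidate summary by the negative sum of distances from each document word embedding to its nearest neighbor among the summary's word embeddings, then picks sentences greedily, a common heuristic for maximizing submodular functions.

```python
import math


def nearest_distance(vec, candidates):
    """Euclidean distance from vec to its nearest neighbor in candidates."""
    return min(math.dist(vec, c) for c in candidates)


def objective(doc_vectors, summary_vectors, empty_penalty=1e6):
    """Negative sum of nearest-neighbor distances (higher is better).

    empty_penalty is a hypothetical value assigned per document word
    when the summary is empty; it is an assumption of this sketch.
    """
    if not summary_vectors:
        return -empty_penalty * len(doc_vectors)
    return -sum(nearest_distance(v, summary_vectors) for v in doc_vectors)


def greedy_summary(sentences, max_sentences):
    """Greedily select sentence indices that maximize the objective.

    sentences: list of sentences, each a list of word vectors (tuples).
    max_sentences: assumed budget, expressed as a sentence count.
    """
    doc_vectors = [v for sentence in sentences for v in sentence]
    chosen = []
    summary_vectors = []
    while len(chosen) < max_sentences:
        best_index, best_score = None, -math.inf
        for i, sentence in enumerate(sentences):
            if i in chosen or not sentence:
                continue
            score = objective(doc_vectors, summary_vectors + list(sentence))
            if score > best_score:
                best_index, best_score = i, score
        if best_index is None:  # nothing left to add
            break
        chosen.append(best_index)
        summary_vectors.extend(sentences[best_index])
    return chosen


# Toy usage with made-up 2-D "embeddings": two similar sentences and one distant one.
toy_sentences = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(0.0, 0.1), (0.1, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
]
print(greedy_summary(toy_sentences, max_sentences=2))
```

Because the score depends on how well the summary's words cover the whole document's words, a second sentence that repeats already-covered content adds little, which is the intuition behind using document-level rather than sentence-level similarity.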