Publications

JOURNAL (INTERNATIONAL) Doc2vec-Based Approach for Extracting Diverse Evaluation Expressions from Online Review Data

Kosuke Kurihara, Yoshiyuki Shoji (Aoyama Gakuin univ.), Sumio Fujita, Martin J. Dürst (Aoyama Gakuin univ.)

Journal of Data Intelligence (JDI)

June 01, 2022

This paper proposes a method for extracting diverse expressions from online movie review texts for a given keyword query. When people watch a movie that makes them cry, they generally do not say "I cried." Instead, they use such euphemistic language as "I needed a handkerchief" or "My makeup was running." To enable information retrieval based on audience reactions such as "movies that make me cry" using review texts, various paraphrased expressions must be collected for arbitrary queries. Our proposed method extracts such expressions from review datasets by applying two extensions to Doc2Vec: 1) it changes the granularity of the training sentences to mitigate a lack of context, and 2) it applies query expansion for similarity calculation in advance. We conducted a large-scale crowdsourcing experiment with 1.29 million actual sentences taken from Yahoo! Movies, Japan. The experimental results revealed that changing the training data granularity and adding the query expansion effectively collect more diverse expressions that have a meaning similar to the given query.
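The two extensions named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how granularity is changed or how queries are expanded, so the code assumes that granularity means merging each sentence with its neighbours and that expansion means appending related terms to the query. A bag-of-words count vector stands in for a Doc2Vec vector, and all function names and the `expansions` table are hypothetical.

```python
import math
import re
from collections import Counter


def regroup(sentences, window=1):
    """Coarsen granularity (assumed): join each sentence with up to
    `window` neighbouring sentences on each side to add context."""
    units = []
    for i in range(len(sentences)):
        lo = max(0, i - window)
        hi = min(len(sentences), i + window + 1)
        units.append(" ".join(sentences[lo:hi]))
    return units


def embed(text):
    """Stand-in for a Doc2Vec vector: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(query, expansions):
    """Query expansion (assumed): append related terms from a
    hypothetical lookup table before computing similarity."""
    return " ".join([query] + expansions.get(query, []))


def rank(query, sentences, expansions, window=0):
    """Score each sentence by the similarity between the expanded query
    and the sentence's context unit, highest score first."""
    units = regroup(sentences, window)
    q = embed(expand_query(query, expansions))
    scored = [(cosine(q, embed(u)), s) for u, s in zip(units, sentences)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    reviews = [
        "I needed a handkerchief",
        "The plot was slow",
        "My makeup was running",
    ]
    table = {"cried": ["handkerchief", "makeup"]}
    for score, sentence in rank("cried", reviews, table):
        print(f"{score:.3f}  {sentence}")
```

In this toy setting the bare query "cried" shares no words with any review sentence, so only the expanded query gives the euphemistic sentences a nonzero score. A trained Doc2Vec model would replace `embed` with learned vectors.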

Paper : Doc2vec-Based Approach for Extracting Diverse Evaluation Expressions from Online Review Data