Conference (International) Early Discovery of Disappearing Entities in Microblogs

Satoshi Akasaki, Naoki Yoshinaga (The University of Tokyo), Masashi Toyoda (The University of Tokyo)

The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)


We make decisions by reacting to changes in the real world, particularly the emergence and disappearance of impermanent entities such as restaurants, services, and events. Because we want to avoid missing opportunities or taking fruitless actions after those entities have disappeared, it is important to learn of their disappearance as early as possible. We thus tackle the task of detecting disappearing entities from microblogs, where a wide variety of information is shared in a timely manner. The major challenge is detecting uncertain contexts of disappearing entities in noisy microblog posts. To collect such disappearing contexts, we design time-sensitive distant supervision, which utilizes entities from a knowledge base and time-series posts. Using this method, we build large-scale Twitter datasets of disappearing entities for English and Japanese. To ensure robust detection in noisy environments, we refine pretrained word embeddings for the detection model on microblog streams in a timely manner. Experimental results on the Twitter datasets confirmed the effectiveness of the collected labeled data and the refined word embeddings; the proposed method outperformed a baseline in terms of accuracy, and more than 70% of the detected disappearing entities in Wikipedia were discovered earlier than the corresponding update on Wikipedia, with an average lead time of over one month.
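
The lead-time figures reported in the abstract can be illustrated with a minimal sketch. It assumes that lead time is the number of days between the date an entity is detected as disappearing from microblog posts and the date its Wikipedia article is updated, and that the average is taken over entities detected before the update; the abstract does not state either detail. The function name `lead_time_summary` and the toy dates are hypothetical and are not taken from the paper or its datasets.

```python
from datetime import date
from statistics import mean


def lead_time_summary(records):
    """Summarize how early detections are relative to Wikipedia updates.

    records: list of (detected_on, wikipedia_updated_on) date pairs.
    Assumption: lead time = Wikipedia update date minus detection date, in days.
    """
    lead_days = [(wiki - detected).days for detected, wiki in records]
    if not lead_days:
        raise ValueError("records must not be empty")
    # Keep only entities detected strictly before the Wikipedia update.
    earlier = [d for d in lead_days if d > 0]
    return {
        "share_earlier": len(earlier) / len(lead_days),
        # Assumption: the average is taken over the earlier detections only.
        "mean_lead_days": mean(earlier) if earlier else 0.0,
    }


# Hypothetical toy data, for illustration only (not results from the paper).
toy_records = [
    (date(2020, 1, 10), date(2020, 3, 1)),   # detected 51 days before the update
    (date(2020, 5, 2), date(2020, 5, 20)),   # detected 18 days before the update
    (date(2020, 7, 15), date(2020, 7, 1)),   # detected 14 days after the update
    (date(2020, 9, 1), date(2020, 11, 30)),  # detected 90 days before the update
]
summary = lead_time_summary(toy_records)
print(summary)
```

On this toy input, three of the four entities are detected before the Wikipedia update, and the mean lead time over those three is 53 days. Averaging over all detected entities instead would be an equally plausible reading of the abstract and would give a smaller value.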