Other

Basic information

Name HASHIMOTO, Takako
Affiliation
Job title
researchmap researcher code
researchmap agency

Title

Topic Extraction from Millions of Tweets using Singular Value Decomposition and Feature Selection

Sole or Joint Author

Joint Author

Date of Issue

2015/12

Conference Presentation (name)

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2015

Summary

To overcome the scalability problem in big data analysis, this paper uses the high-performance Singular Value Decomposition (SVD) library redsvd to identify topics over time in a data set of over two hundred million tweets sent in the 21 days following the Great East Japan Earthquake. Starting from word count vectors of authors and words for each time slot (one hour in our case), we extract author clusters from each slot using SVD and k-means. We then apply CWC, our original fast feature selection algorithm, to extract discriminative words from each cluster. As a result, the author clusters, which are interpreted as topics, can be visualized while overcoming the scalability problem that conventional social media analysis methods face with big data.
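
The pipeline described in the summary (per-slot word counts, SVD, k-means, then discriminative words per cluster) can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `np.linalg.svd` stands in for redsvd, the k-means routine is a plain Lloyd's iteration with farthest-point initialization, and the word-scoring step is a simple mean-difference heuristic that stands in for CWC (it is not the CWC algorithm). The vocabulary, counts, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical toy data for ONE time slot: rows are authors, columns are words.
vocab = ["quake", "tsunami", "train", "delay", "power", "outage"]
counts = np.array([
    [5, 4, 0, 0, 0, 0],
    [4, 6, 0, 0, 1, 0],
    [6, 3, 0, 0, 0, 0],
    [0, 0, 5, 4, 0, 0],
    [0, 0, 4, 5, 0, 0],
    [0, 1, 6, 3, 0, 0],
], dtype=float)


def svd_embedding(matrix, rank):
    """Project each author onto the top `rank` singular directions.

    np.linalg.svd is an assumption standing in for redsvd.
    """
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def discriminative_words(matrix, labels, words, cluster, top_n=2):
    """Rank words by (mean count inside cluster) - (mean count outside).

    A simple heuristic used here in place of CWC; it is NOT the CWC algorithm.
    """
    inside = matrix[labels == cluster].mean(axis=0)
    outside = matrix[labels != cluster].mean(axis=0)
    order = np.argsort(outside - inside)  # most discriminative first
    return [words[i] for i in order[:top_n]]


embedding = svd_embedding(counts, rank=2)
labels = kmeans(embedding, k=2)
topics = {
    int(c): discriminative_words(counts, labels, vocab, c)
    for c in np.unique(labels)
}
print(labels, topics)
```

In the setting the summary describes, this procedure would be repeated for every hourly slot, and the per-cluster word lists would serve as topic labels for visualization.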

Subject1

Subject2

Subject3