Citation Hashemi, Mahdi, Hall, Margeret. 2020. Multi-label classification and knowledge extraction from oncology-related content on online social networks. Artificial Intelligence Review. 53, 5957-5994.




This study aims at automatic processing and knowledge extraction from large amounts of oncology-related content from online social networks (OSN). In this context, a large number of OSN textual posts concerning major cancer types are automatically scraped and structured using natural language processing techniques. Machines are trained to assign multiple labels to these posts based on the type of knowledge enclosed, if any. Trained machines are used to automatically classify large-scale textual posts. Statistical inferences are made based on these predictions to extract general concepts and abstract knowledge. Different approaches for constructing document feature vectors showed no tangible effect on the classification accuracy. Among different classifiers, logistic regression achieved the highest overall accuracy (96.4%) and mean F1 (73.4) in a 13-way multi-label classification of textual posts. The most common topic was seeking or providing moral support for cancer patients, followed by providing technical information about cancer causes and treatments. The most common causes and treatments of different types of cancer on OSN are also automatically detected in this study. Seeking or providing moral support for cancer patients shared the largest overlap with other topics, i.e. moral support tends to be present even in OSN posts which focus on other topics. On the other hand, providing technical information about cancer diagnosis or prevention were the most isolated topics, where OSN posts tend not to allude to other topics. OSN posts which seek financial support only overlap with the moral support topic, if any. Our methodology and results provide public health professionals with an opportunity to monitor what topics and to which extent are being discussed on OSN, what specific information and knowledge are being disseminated over OSN, and to assess their veracity in close to real time.



Publication's profile

Status of publication Published
Affiliation External
Type of publication Journal article
Journal Artificial Intelligence Review
Citation Index SCI
Language English
Title Multi-label classification and knowledge extraction from oncology-related content on online social networks
Volume 53
Year 2020
Page from 5957
Page to 5994
URL https://link.springer.com/content/pdf/10.1007/s10462-020-09839-0.pdf
DOI http://dx.doi.org/10.1007/s10462-020-09839-0
Open Access N


Hall, Margeret
Hashemi, Mahdi (George Mason University, United States/USA)
Institute for Cognition & Behavior IN
Research areas (ÖSTAT Classification 'Statistik Austria')
1127 Information science