Abstract
Text mining of social media data for enhancing food safety of farmer's market
Dandan Tao, Hao Feng
This study analyzed consumers’ reviews and comments posted on social media to identify potential food safety issues associated with farmer’s markets. Text mining models were built using data from Yelp and Twitter to automatically identify consumers’ responses on food safety after visiting a farmer’s market. Besides food safety, other aspects such as quality, availability, and environment were also considered in the data analysis models. Machine learning tools, including Naïve Bayes (NB), Support Vector Machines (SVM), Logistic Regression (LR), k-Nearest Neighbor (k-NN), and Random Forests (RF), were used to build the models. SVM was identified as the optimal model, with the highest F1 score (the harmonic mean of precision and recall) on both the Twitter data (0.68) and the Yelp data (0.75). Based on the SVM models, the most important words used in the classification were identified. For the topic “safety”, the important words identified from the two datasets differed. On Twitter, words like ‘safety’, ‘coli’, ‘health’, ‘recall’, ‘illness’, ‘train’, ‘tip’, and ‘foodborne’ were commonly mentioned, indicating that people tend to talk about foodborne outbreaks. On Yelp, people tend to comment on the hygiene conditions of a farmer’s market with words like ‘clean’, ‘messy’, ‘safety’, ‘gross’, and ‘rotten’. The findings could help local public health departments assess the hygiene status or potential food safety issues of a farmer’s market based on consumer reviews.
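To make the evaluation metric concrete, the following minimal sketch computes an F1 score for a single topic label in plain Python. The labels and predictions are hypothetical toy values, not data from the study, and the abstract does not state whether the reported scores are per-topic or averaged across topics, so the per-label form shown here is an assumption.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one label: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is conventionally 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (illustrative only, not from the Yelp or Twitter data).
y_true = ["safety", "quality", "safety", "environment", "safety", "quality"]
y_pred = ["safety", "safety", "safety", "environment", "quality", "quality"]

score = f1_score(y_true, y_pred, positive="safety")
print(f"F1 for 'safety': {score:.2f}")
```

Because F1 is a harmonic mean, it is high only when both precision and recall are high, which makes it a reasonable single number for comparing classifiers when a topic such as "safety" is relatively rare.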