Auto-labeling of sensor data using social media messages: a case study for a smart city

2021 
Recently, the deployment of various Internet of Things (IoT) sensors has led smart cities to accumulate large volumes of data. When machine learning models use such accumulated raw data to predict events and situations, smart city systems such as traffic accident management systems can be further developed on top of those predictions. However, although a large volume of raw IoT data is available for smart cities, the data do not carry labels related to events or situations. Models require data with meaningful labels for training, so unlabeled sensor data cannot be used to train them. Several labeling methods exist, but each has its own drawbacks. In this study, we investigate the feasibility of using social media messages to extract meaningful labels for machine learning models that predict events and situations in smart city environments. As a case study, we compared the labels extracted from social media messages with the events and situations reported in traffic news announcements and other articles. The results show that it is feasible to use social media messages as a source of meaningful labels for events and situations. We also propose an improved clustering algorithm that uses an outlier detection technique to extract meaningful labels more robustly. Furthermore, for other researchers who want to use the raw IoT data, we analyze the sensor data, which contained unknown noise, and release a refined version.