Benefits of machine learning and sampling frequency on phytoplankton bloom forecasts in coastal areas

2020 
Abstract In aquatic ecosystems, anthropogenic activities disrupt nutrient fluxes, thereby promoting harmful algal blooms that can directly impact economies and human health. Within this framework, forecasting a proxy of chlorophyll a in coastal areas is the first step toward managing these algal blooms. The primary goal of this study was to analyze how different sampling frequencies affect phytoplankton bloom forecasts produced by a machine learning model. The database used in this study was sourced from an automated system located in the English Channel, which has a sampling frequency of 20 min. We considered 12 physicochemical parameters over a six-year period. Our forecast methodology is based on the random forest (RF) model and a sliding window strategy. The lag times for these sliding windows ranged from 12 h to 3 months, and four sampling time steps were tested, from 20 min up to 1 day. The results indicate that the optimal forecast was obtained for the 20 min time step, with an average R² of 0.62. Moreover, the highest values of fluorescence were predicted when the water temperature was approximately 11.8 °C. Consequently, we demonstrated that the sampling frequency directly impacts the forecast performance of an RF model. Furthermore, this kind of model can recreate interactions that closely resemble biological processes. Our study suggests that the RF model can exploit the additional information contained in high-frequency datasets. The methodology presented here lays the foundation for a numerical decision-making tool that could help mitigate the impact of these algal blooms.
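The sliding window strategy and the change of sampling time step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `subsample` and `sliding_windows`, the `horizon` parameter, and the synthetic series are all assumptions made for illustration, and the abstract does not specify how the windows were actually built.

```python
def subsample(series, step):
    """Keep every `step`-th reading (step=3 turns 20 min data into hourly data)."""
    return series[::step]


def sliding_windows(series, window, horizon=1):
    """Build lagged predictors and targets from a single time series.

    Each row of X holds `window` consecutive readings; the matching entry
    of y is the reading `horizon` steps after the end of that window.
    """
    X, y = [], []
    for start in range(len(series) - window - horizon + 1):
        X.append(series[start:start + window])
        y.append(series[start + window + horizon - 1])
    return X, y


# Synthetic stand-in for two days of 20 min readings (not real data).
fluorescence = [float(i % 10) for i in range(72 * 2)]

# Hypothetical coarser time step: one reading per hour.
hourly = subsample(fluorescence, 3)

# 12 h lag window at an hourly time step, forecasting one step ahead.
X, y = sliding_windows(hourly, window=12, horizon=1)
```

The resulting `X` and `y` could then be passed to a random forest regressor (for example scikit-learn's `RandomForestRegressor`), and the procedure repeated for each time step and lag to compare forecast skill.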