Forecasting of surface current velocities using ensemble machine learning algorithms for the Guangdong-Hong Kong-Macao Greater Bay Area based on the High Frequency radar data
Abstract: Forecasting of ocean currents is critical for marine meteorological research and for ocean engineering and construction. Timely and accurate forecasts of coastal current velocities provide a scientific foundation and decision support for practices such as search and rescue, disaster avoidance and remediation, and offshore construction. This research established a framework that generates short-term surface current forecasts using ensemble machine learning trained on High Frequency radar observations. The results indicate that an ensemble algorithm that uses random forests to weight and filter the forecasting features, and then uses the AdaBoost method to forecast, can significantly reduce model training time while maintaining forecasting effectiveness, which offers considerable economic benefits. Model accuracy is a function of surface current variability and the forecasting horizon. To improve the forecasting capability and accuracy of the model, the structure of the ensemble algorithm was optimized, and the random forest algorithm was used to select model features dynamically. The results show that the error of the optimized surface current forecasting model varies more regularly, and that the importance of the features varies with the forecasting time-step. At the ten-step-ahead forecasting horizon, the model reported an RMSE, MAE, and correlation coefficient of 2.84 cm/s, 2.02 cm/s, and 0.96, respectively. The model error is affected by factors such as topography, boundaries, and the geometric accuracy of the observation system. This paper demonstrates the potential of ensemble-based machine learning algorithms to improve the forecasting of ocean currents.
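The feature-filtering step described in the abstract can be illustrated with a minimal sketch. It assumes that per-feature importance scores are already available (for example, from a fitted random forest) and applies the 95% cumulative-importance threshold listed in Table 3. The function name `select_by_cumulative_importance` and the example scores are hypothetical and are not taken from the paper.

```python
def select_by_cumulative_importance(importances, threshold=0.95):
    """Return indices of the top-ranked features whose normalized
    importances together reach at least `threshold`.

    `importances` is a sequence of non-negative scores, one per feature.
    This is an illustrative helper, not the authors' implementation.
    """
    total = sum(importances)
    if total <= 0:
        raise ValueError("importances must sum to a positive value")

    # Rank feature indices from most to least important.
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)

    selected = []
    cumulative = 0.0
    for i in ranked:
        selected.append(i)
        cumulative += importances[i] / total
        if cumulative >= threshold:
            break
    return selected


# Made-up importance scores for five candidate features.
importances = [0.50, 0.05, 0.30, 0.02, 0.13]
selected = select_by_cumulative_importance(importances, threshold=0.95)
print(selected)
```

Only the selected features would then be passed to the forecasting model, which is how a reduced feature set can shorten training time. Repeating the selection at each forecasting time-step would correspond to the dynamic selection the abstract mentions, under the same assumptions.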
Key words:
- forecasting
- surface currents
- ensemble machine learning
- high frequency radar
- random forest
- AdaBoost
Table 1. Statistics of tidal ellipticity, tidal type coefficients and shallow water coefficients

| Point | Tidal ellipticity (O1) | Tidal ellipticity (K1) | Tidal ellipticity (M2) | Tidal ellipticity (S2) | Tidal type coefficient | Shallow water coefficient |
|---|---|---|---|---|---|---|
| P1 | -0.4 | -0.06 | -0.22 | 0.12 | 1.51 | 0.32 |
| P2 | -0.1 | -0.11 | -0.02 | 0.21 | 1.11 | 0.2 |
| P3 | -0.07 | -0.2 | -0.02 | 0.29 | 1.09 | 0.2 |
| P4 | 0.22 | -0.34 | -0.05 | -0.12 | 1.27 | 0.19 |
| P5 | -0.37 | -0.27 | -0.31 | 0.44 | 1.68 | 0.24 |

Table 2. Statistics of the optimal correlation coefficients of surface current components
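
| Point | Correlation coefficient (zonal component) | Correlation coefficient (meridional component) | Lag time/h (zonal component) | Lag time/h (meridional component) |
|---|---|---|---|---|
| P1 | 0.75 | 0.31 | 12 | 11 |
| P2 | 0.78 | 0.31 | 12 | 8 |
| P3 | 0.79 | 0.22 | 11 | 9 |
| P4 | 0.68 | 0.24 | 9 | 9 |
| P5 | 0.55 | -0.09 | 11 | 9 |

Table 3. Forecasting model set-up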

| Model | Algorithm: feature selection | Algorithm: forecasting model | Feature |
|---|---|---|---|
| Baseline model | - | AdaBoost | total observation data |
| Application model | Random Forest | AdaBoost | observation data whose total variable importance accumulates over 95% |
| Rolling model | Random Forest | AdaBoost | observation data whose importance accumulates over 95% at the first time-step |
| Reconstruction model | Random Forest | AdaBoost | observation data whose importance accumulates over 95% at each time-step |

Table 4. Assessment statistics of single time-step forecasting (training dataset)

| Point | Zonal: feature number | Zonal: RMSE (cm/s) | Zonal: r | Zonal: MAE (cm/s) | Meridional: feature number | Meridional: RMSE (cm/s) | Meridional: r | Meridional: MAE (cm/s) |
|---|---|---|---|---|---|---|---|---|
| P1 | 176 | 0.96 | 1.00 | 0.55 | 176 | 0.87 | 1.00 | 0.4 |
| P1 | 8 | 1.04 | 1.00 | 0.58 | 8 | 0.94 | 1.00 | 0.58 |
| P2 | 176 | 1.46 | 1.00 | 0.82 | 176 | 1.31 | 0.99 | 0.73 |
| P2 | 9 | 1.56 | 1.00 | 0.85 | 8 | 1.42 | 0.99 | 0.76 |
| P3 | 176 | 0.96 | 1.00 | 0.52 | 176 | 1.06 | 1.00 | 0.45 |
| P3 | 8 | 1.04 | 1.00 | 0.55 | 8 | 1.14 | 0.99 | 0.46 |
| P4 | 176 | 2.41 | 0.99 | 1.29 | 176 | 2.52 | 0.98 | 1.35 |
| P4 | 9 | 2.6 | 0.99 | 1.36 | 36 | 2.68 | 0.98 | 1.36 |
| P5 | 176 | 3.55 | 0.98 | 2.11 | 176 | 1.69 | 0.99 | 0.81 |
| P5 | 13 | 4.06 | 0.98 | 2.16 | 13 | 1.8 | 0.99 | 0.85 |

Table 5. Assessment statistics of single time-step forecasting (test dataset)

| Point | Zonal (u): feature number | Zonal (u): RMSE (cm/s) | Zonal (u): r | Zonal (u): MAE (cm/s) | Meridional (v): feature number | Meridional (v): RMSE (cm/s) | Meridional (v): r | Meridional (v): MAE (cm/s) |
|---|---|---|---|---|---|---|---|---|
| P1 | 8 | 0.95 | 1.00 | 0.62 | 8 | 1.66 | 0.99 | 0.67 |
| P2 | 9 | 2.41 | 0.99 | 1.00 | 8 | 2.05 | 0.98 | 0.85 |
| P3 | 8 | 1.55 | 1.00 | 0.71 | 8 | 3.68 | 0.97 | 1.04 |
| P4 | 9 | 5.02 | 0.98 | 1.62 | 36 | 9.36 | 0.94 | 3.29 |
| P5 | 13 | 2.97 | 0.98 | 1.75 | 10 | 1.90 | 0.99 | 0.84 |

Table 6. Evaluation of multi time-step forecasting (zonal component of surface velocity)
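
| Point | Forecasting step | Feature number | RMSE (cm/s) | r | MAE (cm/s) |
|---|---|---|---|---|---|
| P1 | 2 | 8 | 1.82 | 1.00 | 1.09 |
| P1 | 3 | 8 | 2.59 | 1.00 | 1.61 |
| P1 | 4 | 9 | 3.33 | 0.99 | 2.16 |
| P1 | 5 | 10 | 3.93 | 0.99 | 2.6 |
| P1 | 6 | 12 | 4.47 | 0.99 | 3.02 |
| P1 | 7 | 16 | 4.56 | 0.99 | 3.12 |
| P1 | 8 | 23 | 4.6 | 0.99 | 3.13 |
| P1 | 9 | 32 | 4.72 | 0.99 | 3.27 |
| P1 | 10 | 37 | 4.97 | 0.99 | 3.45 |
| P2 | 2 | 8 | 2.51 | 1.00 | 1.49 |
| P2 | 3 | 9 | 3.57 | 0.99 | 2.16 |
| P2 | 4 | 10 | 4.19 | 0.99 | 2.62 |
| P2 | 5 | 11 | 4.69 | 0.98 | 3.02 |
| P2 | 6 | 16 | 4.91 | 0.98 | 3.17 |
| P2 | 7 | 25 | 4.92 | 0.98 | 3.26 |
| P2 | 8 | 33 | 4.81 | 0.98 | 3.27 |
| P2 | 9 | 39 | 4.97 | 0.98 | 3.4 |
| P2 | 10 | 43 | 5.01 | 0.98 | 3.5 |
| P3 | 2 | 8 | 1.7 | 1.00 | 0.99 |
| P3 | 3 | 8 | 2.27 | 1.00 | 1.4 |
| P3 | 4 | 9 | 2.83 | 0.99 | 1.8 |
| P3 | 5 | 10 | 3.29 | 0.99 | 2.19 |
| P3 | 6 | 13 | 3.61 | 0.99 | 2.45 |
| P3 | 7 | 18 | 3.52 | 0.99 | 2.39 |
| P3 | 8 | 26 | 3.67 | 0.99 | 2.5 |
| P3 | 9 | 34 | 3.74 | 0.99 | 2.62 |
| P3 | 10 | 43 | 3.93 | 0.99 | 2.81 |
| P4 | 2 | 12 | 3.84 | 0.99 | 2.18 |
| P4 | 3 | 22 | 4.74 | 0.98 | 2.78 |
| P4 | 4 | 35 | 5.3 | 0.97 | 3.21 |
| P4 | 5 | 48 | 5.6 | 0.97 | 3.48 |
| P4 | 6 | 58 | 5.71 | 0.97 | 3.65 |
| P4 | 7 | 66 | 5.41 | 0.97 | 3.54 |
| P4 | 8 | 74 | 5.58 | 0.97 | 3.68 |
| P4 | 9 | 80 | 5.67 | 0.97 | 3.79 |
| P4 | 10 | 85 | 5.84 | 0.97 | 3.92 |
| P5 | 2 | 58 | 4.48 | 0.97 | 2.72 |
| P5 | 3 | 85 | 5.51 | 0.95 | 3.46 |
| P5 | 4 | 100 | 5.8 | 0.95 | 3.75 |
| P5 | 5 | 108 | 6.08 | 0.94 | 4.02 |
| P5 | 6 | 114 | 6 | 0.94 | 4.04 |
| P5 | 7 | 118 | 6.29 | 0.94 | 4.24 |
| P5 | 8 | 120 | 6.39 | 0.94 | 4.36 |
| P5 | 9 | 122 | 6.43 | 0.94 | 4.4 |
| P5 | 10 | 123 | 6.68 | 0.93 | 4.61 |

Table 7. Evaluation of multi time-step forecasting (meridional component of surface velocity)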
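
| Point | Forecasting step | Feature number | RMSE (cm/s) | r | MAE (cm/s) |
|---|---|---|---|---|---|
| P1 | 2 | 9 | 1.58 | 0.99 | 0.78 |
| P1 | 3 | 25 | 2.03 | 0.98 | 1.08 |
| P1 | 4 | 40 | 2.35 | 0.97 | 1.35 |
| P1 | 5 | 56 | 2.53 | 0.97 | 1.54 |
| P1 | 6 | 74 | 2.59 | 0.97 | 1.67 |
| P1 | 7 | 89 | 2.46 | 0.97 | 1.68 |
| P1 | 8 | 102 | 2.67 | 0.97 | 1.85 |
| P1 | 9 | 111 | 2.63 | 0.97 | 1.87 |
| P1 | 10 | 117 | 2.84 | 0.96 | 2.02 |
| P2 | 2 | 18 | 2.12 | 0.98 | 1.24 |
| P2 | 3 | 46 | 2.63 | 0.97 | 1.66 |
| P2 | 4 | 72 | 2.96 | 0.96 | 1.99 |
| P2 | 5 | 90 | 3.44 | 0.95 | 2.30 |
| P2 | 6 | 102 | 3.56 | 0.95 | 2.54 |
| P2 | 7 | 112 | 3.82 | 0.94 | 2.63 |
| P2 | 8 | 176 | 4.12 | 0.93 | 2.88 |
| P2 | 9 | 126 | 4.15 | 0.93 | 2.90 |
| P2 | 10 | 131 | 4.33 | 0.92 | 3.02 |
| P3 | 2 | 10 | 1.78 | 0.99 | 0.82 |
| P3 | 3 | 19 | 2.12 | 0.98 | 1.08 |
| P3 | 4 | 36 | 2.28 | 0.98 | 1.24 |
| P3 | 5 | 53 | 2.44 | 0.97 | 1.40 |
| P3 | 6 | 65 | 2.55 | 0.97 | 1.54 |
| P3 | 7 | 76 | 2.57 | 0.97 | 1.62 |
| P3 | 8 | 86 | 2.64 | 0.97 | 1.72 |
| P3 | 9 | 93 | 2.71 | 0.97 | 1.79 |
| P3 | 10 | 100 | 2.89 | 0.96 | 1.95 |
| P4 | 2 | 84 | 4.99 | 0.92 | 3.08 |
| P4 | 3 | 109 | 4.53 | 0.93 | 2.82 |
| P4 | 4 | 121 | 4.71 | 0.93 | 2.96 |
| P4 | 5 | 129 | 5.17 | 0.91 | 3.27 |
| P4 | 6 | 134 | 5.48 | 0.90 | 3.55 |
| P4 | 7 | 137 | 5.49 | 0.90 | 3.62 |
| P4 | 8 | 141 | 5.71 | 0.89 | 3.75 |
| P4 | 9 | 143 | 5.79 | 0.89 | 3.85 |
| P4 | 10 | 145 | 6.29 | 0.87 | 4.15 |
| P5 | 2 | 40 | 2.46 | 0.97 | 1.29 |
| P5 | 3 | 67 | 3.01 | 0.96 | 1.67 |
| P5 | 4 | 83 | 3.31 | 0.95 | 1.94 |
| P5 | 5 | 95 | 3.97 | 0.93 | 2.36 |
| P5 | 6 | 104 | 3.42 | 0.95 | 2.18 |
| P5 | 7 | 111 | 3.43 | 0.95 | 2.28 |
| P5 | 8 | 116 | 3.50 | 0.95 | 2.34 |
| P5 | 9 | 121 | 3.82 | 0.94 | 2.60 |
| P5 | 10 | 125 | 4.42 | 0.91 | 3.01 |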
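
The assessment statistics reported in the tables above (RMSE, MAE, and the correlation coefficient r) can be illustrated with a minimal sketch. It assumes the standard definitions of root-mean-square error, mean absolute error, and the Pearson correlation coefficient; the source text here does not state the formulas, so this is an assumption. The function names and the sample velocities are hypothetical.

```python
import math


def rmse(observed, forecast):
    """Root-mean-square error between paired observations and forecasts."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for o, f in zip(observed, forecast)) / n)


def mae(observed, forecast):
    """Mean absolute error between paired observations and forecasts."""
    n = len(observed)
    return sum(abs(f - o) for o, f in zip(observed, forecast)) / n


def pearson_r(observed, forecast):
    """Pearson correlation coefficient of the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    if var_o == 0 or var_f == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_f)


# Made-up surface velocities in cm/s, for illustration only.
observed = [10.0, 12.0, 9.0, 15.0]
forecast = [11.0, 11.0, 10.0, 14.0]

print(rmse(observed, forecast), mae(observed, forecast),
      pearson_r(observed, forecast))
```

RMSE penalizes large errors more heavily than MAE, so a widening gap between the two at longer forecasting steps would indicate occasional large misses rather than a uniform loss of accuracy.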