
Citation: Zhigao Chen, Yan Zong, Zihao Wu, Zhiyu Kuang, Shengping Wang. Prediction of discharge in a tidal river using the LSTM-based sequence-to-sequence models. Acta Oceanologica Sinica, 2024, 43(7): 40–51. doi: 10.1007/s13131-024-2343-6
Many metropolises in the world (e.g., Cairo, Shanghai and New Orleans) lie in river estuaries and need reliable discharge observations and predictions to support hydraulic engineering construction, water resources management and ecological environment protection. Other water-related challenges that call for similar data include disaster mitigation (such as drought and flood mitigation), irrigation and drinking water allocation (Hidayat et al., 2014). Like many other hydrological processes, flow in tidal reaches exhibits obvious nonlinearity, non-synchronization, and uncertainty in parameter estimates (e.g., flow velocity and water level), so discharge estimation and prediction there remain a great challenge (Zhao et al., 2016). One class of computational tools well suited to this problem is the artificial neural network (ANN), including recurrent neural networks (RNNs). Although a physically based model is superior at simulating catchment processes, it frequently performs poorly, especially in areas with little access to data. An ANN or an RNN can generate reasonably accurate predictions from historical data alone, without explicit representation of catchment processes, and can therefore serve as a tool for operational watershed management.
Long short-term memory (LSTM) networks have become effective modeling tools for the hydrological cycle in recent years (Bai and Xu, 2021; Anshuka et al., 2022; Lees et al., 2022). Applications include monthly runoff forecasting (Yuan et al., 2018), low-flow hydrological time series forecasting (Sahoo et al., 2019), hydrological drought forecasting (Amanambu et al., 2022), rainfall-runoff modeling (Kratzert et al., 2018), estimation of tidal level (Bai and Xu, 2021), and analysis of groundwater level variations caused by the changes in groundwater withdrawals (Shin et al., 2020).
A sequence-to-sequence (Seq2Seq) model is a type of neural network architecture designed for sequence transduction tasks, where the goal is to convert an input sequence into an output sequence. It was initially developed for natural language processing tasks (Sutskever et al., 2014), but it has since been applied to various other domains, such as machine translation (Cho et al., 2014), video summarization (Ji et al., 2020), rainfall-runoff modeling (Yin et al., 2021a, b), and flood forecasting (Kao et al., 2020). The Seq2Seq model consists of two main components: an encoder and a decoder. The training process involves feeding the input sequence into the encoder, using the generated context vector to initialize the decoder, and then training the model to produce the correct output sequence. In tidal river discharge forecasting, historical runoff data, water levels, and other relevant information can form the input sequence, while predicting the discharge in the upcoming period constitutes the output sequence. Specifically, the encoder of the Seq2Seq model can map past time series data into a fixed-dimensional context vector, which encapsulates crucial information from the recent past. Subsequently, the decoder utilizes this context vector to generate the sequence of future discharge predictions.
LSTM units are widely used within Seq2Seq models because they can capture long-term dependencies in time series. In discharge forecasting, past discharges play an important role in predicting future discharges, and with LSTM units the Seq2Seq model can effectively capture and exploit this long-term dependency. This enables the Seq2Seq-LSTM model to predict discharge trends more accurately, especially for long time series. By working in tandem, the encoder and decoder capture the key information in the sequence and generate outputs corresponding to the inputs, providing reliable and accurate discharge predictions.
The purpose of this paper is to examine whether LSTM-based Seq2Seq models can produce satisfactory river discharge predictions at the mouths of large tidal estuaries. Discharge data from different phases of the monthly tidal cycle (e.g., neap tide, middle tide and spring tide) are selected for prediction, with the aim of evaluating each model's ability to capture tidal peak values. In addition, to evaluate the sensitivity of the Seq2Seq models to forecast duration, different lead times (e.g., 3 h, 6 h, 24 h, and 48 h) are used for tidal discharge estimation. Last but not least, discharge data during various phases of the daily tidal cycle (such as flood tide, ebb tide, and slack water) are also used to assess how well the proposed models capture tidal flow variation characteristics. Conventional models, namely harmonic analysis and a particle swarm optimization back propagation (PSO-BP) neural network, are also applied and rigorously compared against the observed data.
In the classic harmonic analysis model, a group of tidal constituents is used to estimate the tide level. Once the amplitudes and phase lags of these constituents are determined by the least squares method, the tide level at a given gauge can be estimated accordingly (Harris et al., 2015). At present, the least squares method is the most commonly used approach for harmonic analysis, owing to the widespread availability of computers.
In harmonic analysis, a formula summing up the contributions of the tidal constituents is usually used to represent the tidal height. Like the tide level, the mathematical expansion of the discharge Q (Pan et al., 2023) can be written as
$$ Q(t) = Q_0 + \sum\limits_{i=1}^{m} f_i H_i \cos\left[\sigma_i t + u_i + \left(v_{0i} - g_i\right)\right] + \varepsilon(t), \tag{1} $$
where σi, Hi, and gi represent the tidal frequency, amplitude and phase lag of the i-th tidal constituent, while ui, fi, and v0i denote the known nodal angle, nodal factor, and astronomical argument of the i-th tidal constituent (Foreman and Henry, 1989). The observed discharge, mean discharge and observation error are denoted by Q(t), Q0, and ε(t), respectively. m is the number of resolved tidal constituents, which depends on the record length.
Equation (1) can be simplified further into an equivalent equation involving a sum of sine and cosine functions, after which the least squares method can be used to determine the harmonic constants. In this way, the estimated discharge in Eq. (1) can be determined. A more detailed procedure can be found in Dennis and Long (1971), Foreman (1977), and Pan et al. (2022). For classic harmonic analysis, tidal constituents can be easily analyzed with the "T_TIDE" (Pawlowicz et al., 2002) or "S_TIDE" (Pan et al., 2018) MATLAB toolboxes.
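For illustration, a minimal least-squares sketch of Eq. (1) in Python (NumPy) is given below. It assumes the constituent frequencies are known and ignores nodal corrections, so it is a simplified stand-in rather than the T_TIDE/S_TIDE implementation; the function names are our own.

```python
import numpy as np

def harmonic_fit(t, q, freqs):
    """Least-squares fit of Eq. (1) rewritten as a sum of sines and cosines.

    t     : time in hours, shape (n,)
    q     : observed discharge in m3/s, shape (n,)
    freqs : angular frequencies (rad/h) of the chosen constituents
    Returns the mean discharge and the amplitude/phase of each constituent.
    """
    # Design matrix: [1, cos(w1 t), sin(w1 t), cos(w2 t), sin(w2 t), ...]
    cols = [np.ones_like(t)]
    for w in freqs:
        cols += [np.cos(w * t), np.sin(w * t)]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, q, rcond=None)

    q0 = coef[0]
    a, b = coef[1::2], coef[2::2]            # cosine and sine coefficients
    amp = np.hypot(a, b)                     # H_i
    phase = np.degrees(np.arctan2(b, a))     # phase lag in degrees
    return q0, amp, phase

def harmonic_predict(t, freqs, q0, amp, phase):
    """Hindcast/forecast discharge at times t from the fitted constants."""
    q = np.full_like(t, q0, dtype=float)
    for w, h, g in zip(freqs, amp, np.radians(phase)):
        q += h * np.cos(w * t - g)
    return q
```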
The back propagation (BP) training algorithm is often used to train feed-forward neural networks (FNNs) (Rumelhart et al., 1986). During training, the connection weights are repeatedly adjusted to reduce the error between the network's actual output and the desired output (Fig. 1). The hidden units learn regularities of the task by capturing the crucial properties of the inputs through continual modification of their weights.
To avoid duplicating calculations of intermediate terms in the chain rule, the BP algorithm typically computes the gradient of the loss function with respect to each weight by the chain rule, one layer at a time, iterating backward from the last layer. This distributes the error to all units of each layer, and the bias terms and weights are then updated accordingly. Owing to this learning power, FNNs trained by BP have been employed for tidal level and discharge estimation (Lee, 2004; Hidayat et al., 2014).
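As a rough illustration of this paragraph, the following NumPy sketch performs one BP step for a single-hidden-layer FNN. The layer sizes, activation and learning rate are illustrative placeholders, not the configuration used in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative single-hidden-layer FNN: 3 inputs -> 8 hidden units (tanh) -> 1 output
params = {
    "W1": rng.normal(scale=0.1, size=(3, 8)), "b1": np.zeros(8),
    "W2": rng.normal(scale=0.1, size=(8, 1)), "b2": np.zeros(1),
}

def bp_step(params, x, y, lr=0.01):
    """One forward/backward pass with gradient-descent updates of weights and biases."""
    # Forward pass
    h = np.tanh(x @ params["W1"] + params["b1"])     # hidden activations
    y_hat = h @ params["W2"] + params["b2"]          # network output
    err = y_hat - y                                  # output error

    # Backward pass: chain rule applied one layer at a time, from the last layer
    dW2 = h.T @ err / len(x)
    db2 = err.mean(axis=0)
    dh = (err @ params["W2"].T) * (1.0 - h ** 2)     # gradient through tanh
    dW1 = x.T @ dh / len(x)
    db1 = dh.mean(axis=0)

    # Update weights and bias terms
    params["W1"] -= lr * dW1; params["b1"] -= lr * db1
    params["W2"] -= lr * dW2; params["b2"] -= lr * db2
    return float((err ** 2).mean())                  # training MSE for monitoring
```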
Particle swarm optimization (PSO) is a global optimization algorithm (Kennedy and Eberhart, 1995). It is concise, easy to implement, robust and adaptable, and has been applied successfully in many research fields (Jain et al., 2022). Combining PSO with the BP neural network merges the updating mechanism of PSO with the weight adjustment mechanism of back propagation. This integration uses PSO to search for more suitable initial weights and biases, after which the BP algorithm fine-tunes these parameters, thereby enhancing the training efficiency and convergence speed of the neural network. The flow chart of the PSO-BP algorithm is shown in Fig. 2.
PSO updates the velocity and position of each particle according to the following velocity update formula:
$$ v_{id}^{k+1} = \omega v_{id}^{k} + c_1 r_1 \left(p_{id,\mathrm{pbest}}^{k} - x_{id}^{k}\right) + c_2 r_2 \left(g_{id,\mathrm{gbest}}^{k} - x_{id}^{k}\right), \tag{2} $$
where $v_{id}^{k}$ and $x_{id}^{k}$ are the velocity and position of particle i in dimension d at iteration k; ω is the inertia weight; $c_1$ and $c_2$ are the personal and social learning factors; $r_1$ and $r_2$ are random numbers uniformly distributed in [0, 1]; and $p_{id,\mathrm{pbest}}^{k}$ and $g_{id,\mathrm{gbest}}^{k}$ denote the personal best position of particle i and the global best position of the swarm, respectively. The position is then updated as $x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}$.
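The sketch below shows how Eq. (2) and the position update could be applied to minimize a generic fitness function, for instance the RMSE of a BP network as a function of its initial weights; the default parameter values follow the settings reported later in Table 3, while the function and variable names are illustrative.

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=30, n_iter=200,
                 w=0.5, c1=4.494, c2=4.494, v_max=1.0):
    """Minimize `fitness` using the velocity update of Eq. (2).

    `fitness` maps a parameter vector (e.g., flattened BP weights and
    biases) to a scalar such as the RMSE on the training set.
    """
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))     # positions
    v = np.zeros((n_particles, dim))                         # velocities

    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()                 # global best

    for _ in range(n_iter):
        r1 = rng.random((n_particles, 1))
        r2 = rng.random((n_particles, 1))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Eq. (2)
        v = np.clip(v, -v_max, v_max)                        # limit velocity
        x = x + v                                            # position update

        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest
```

In the PSO-BP workflow described above, the returned global best would initialize the BP network's weights and biases before the BP fine-tuning stage.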
Long short-term memory networks are a special kind of RNN that can learn long-term dependencies. They were first proposed by Hochreiter and Schmidhuber (1997) and were improved and popularized in subsequent works. Nowadays the LSTM is widely used and performs very well on nonlinear, long-horizon estimation problems.
The LSTM models in this investigation are trained with the same hyper-parameters as in Lees et al. (2022). Here, a succinct introduction to the LSTM's state-space formulation in Kratzert et al. (2019) is provided, because it clearly explains the cell state (ct), which represents the state vector of the LSTM.
Hydrological models are often formulated in a state-space-based approach where the states s at a specific time t depend on the input xt, the system states in the previous time step st−1, and the model parameters θi as follows:
$$ \boldsymbol{s}_t = f\left(\boldsymbol{x}_t, \boldsymbol{s}_{t-1}; \theta_i\right). \tag{3} $$
Any output yt of a hydrological system (e.g., discharge) can be described as
$$ \boldsymbol{y}_t = g\left(\boldsymbol{x}_t, \boldsymbol{s}_{t-1}; \theta_j\right), \tag{4} $$
where g(·) is a mapping function that connects the states of the system and the inputs to the system output, and θj is the corresponding subset of model parameters.
Likewise, the LSTM can be formulated as
$$ \left\{\boldsymbol{c}_t, \boldsymbol{h}_t\right\} = f_{\mathrm{LSTM}}\left(\boldsymbol{x}_t, \boldsymbol{c}_{t-1}, \boldsymbol{h}_{t-1}; \theta_k\right), \tag{5} $$
where fLSTM(·) symbolizes the LSTM cell at time t that is a function of the current inputs (xt, e.g., water level and flow velocity), previous state and output (ct−1 and hidden state ht−1), parametrized by the network weights θk. The output of the system, formally described in Eq. (4), would in this specific case be given by
$$ \boldsymbol{y}_t = f_{\mathrm{Dense}}\left(\boldsymbol{h}_t; \theta_t\right), \tag{6} $$
where yt is the output of a dense layer fDense(·) parametrized by the weights θt, which predicts the river discharge from the hidden state at the end of the input sequence ht.
Unlike classical hydrological models (such as conceptual and physical hydrology models), the LSTM can infer the needed structure or parametrization from data without preconceived assumptions about the nature of the processes. More detailed descriptions of the LSTM can be found in Kratzert et al. (2019), specifically Eqs (1)–(4), located in Section 2. A schematic diagram for the LSTM can be found in Fig. 3.
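The state-space formulation in Eqs (5) and (6) maps naturally onto a standard PyTorch LSTM followed by a dense layer. The minimal sketch below is an illustration under assumed input and hidden sizes, not necessarily the exact network used in this study.

```python
import torch
import torch.nn as nn

class LSTMRegressor(nn.Module):
    """f_LSTM of Eq. (5) followed by the dense readout of Eq. (6)."""

    def __init__(self, n_features=3, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers=1, batch_first=True)
        self.dense = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, time, features); h_n and c_n are the final hidden/cell states
        out, (h_n, c_n) = self.lstm(x)
        # Predict discharge from the hidden state at the end of the input sequence
        return self.dense(h_n[-1])

# Example: a batch of 4 sequences, each 24 time steps with 3 features
y_hat = LSTMRegressor()(torch.randn(4, 24, 3))   # shape (4, 1)
```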
As shown in Fig. 4, the Seq2Seq model consists of two main components: an encoder and a decoder (Cho et al., 2014), which usually use RNN or LSTM to model the time sequences. In our study, LSTMs are adopted as the encoder and the decoder. The basic principle is that the encoder encodes the input sequence into a state vector a, while the decoder decodes the vector a for predictive output.
The encoder consists of a single layer of LSTM cells, which reads the input sequence step by step in chronological order. The hidden state ht at the current time t can be represented as
$$ \boldsymbol{h}_t = f\left(\boldsymbol{x}_t, \boldsymbol{h}_{t-1}, \boldsymbol{c}_{t-1}\right), \tag{7} $$
where ht and ht–1 are the hidden states of the encoding layer at time t and t – 1, respectively; xt is the input at time t; ct–1 is the LSTM cell state of the encoding layer at time t – 1. The update mode of the above three vectors is consistent with LSTM. After multiple updates in the encoding layer, the input sequence is ultimately encoded as vector a. Since LSTM has the function of long-term memory, vector a contains the complete information of the input sequence.
The decoder is also composed of single-layer LSTM cells. Its initial hidden state $\boldsymbol{h}'_1$ is obtained from the context vector a:
$$ \boldsymbol{h}'_1 = f\left(\boldsymbol{a}\right). \tag{8} $$
The hidden state $\boldsymbol{h}'_t$ at each subsequent decoding step t is then updated as
$$ \boldsymbol{h}'_t = f\left(\boldsymbol{h}'_{t-1}, \boldsymbol{c}'_{t-1}\right). \tag{9} $$
Then, the hidden state $\boldsymbol{h}'_t$ is mapped to the output $\boldsymbol{y}_t$ through an output function G(·):
$$ \boldsymbol{y}_t = G\left(\boldsymbol{h}'_t\right). \tag{10} $$
Finally, after multiple decoding steps, the required discharge sequence is obtained.
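A minimal PyTorch sketch of this encoder-decoder structure, following Eqs (7)-(10), is given below. The class and variable names are our own, teacher forcing and other training details are omitted, and the default sizes are placeholders.

```python
import torch
import torch.nn as nn

class Seq2SeqLSTM(nn.Module):
    """LSTM encoder-decoder: Eq. (7) encodes, Eqs (8)-(10) decode."""

    def __init__(self, n_features=3, hidden_size=128, out_len=6):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(1, hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, 1)   # G(.) in Eq. (10)
        self.out_len = out_len

    def forward(self, x):
        # Encoder: the final (h, c) pair acts as the context vector a
        _, (h, c) = self.encoder(x)

        # Decoder: start from a and generate the output sequence step by step
        step_in = torch.zeros(x.size(0), 1, 1, device=x.device)
        outputs = []
        for _ in range(self.out_len):
            out, (h, c) = self.decoder(step_in, (h, c))   # Eq. (9)
            y_t = self.readout(out)                        # Eq. (10)
            outputs.append(y_t)
            step_in = y_t       # feed the prediction back as the next input
        return torch.cat(outputs, dim=1).squeeze(-1)       # (batch, out_len)

# Example: 24 half-hour input steps with 3 features -> 6 output steps
y_hat = Seq2SeqLSTM()(torch.randn(8, 24, 3))   # shape (8, 6)
```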
The Changjiang River (Yangtze River) Estuary is the largest and most significant river estuary in China. It is renowned worldwide for its rich soil and water resources, water transportation, and fisheries. In particular, the river mouth lies near the urban area of Shanghai (Fig. 5), the financial, trade and shipping center of China. Predictions and forecasts of floods and droughts are crucial for urban inundation control, shipping related to the inland coal and petroleum sector, and drinking water supplies in the entire Changjiang River Estuary region (Cai et al., 2023). The Changjiang River discharge varies greatly both seasonally and annually (Fig. 6a). According to data collected from 1956 to 2009, the mean annual discharge is 28 310 m3/s (Zhang et al., 2012); the monthly mean discharge reached a maximum of 66 200 m3/s in July 1983, and a minimum of
Xuliujing Section is located near Shanghai, about 110 km upstream of the river mouth, with a cross-sectional width of nearly 6 km (Fig. 5). Tides at Xuliujing Section are semi-diurnal and asymmetrical, with mean and maximum tidal ranges of 2.05 m and 4.62 m, respectively. The average flood and ebb durations of the tide are 4.3 h and 8.1 h, while the average flood and ebb durations of the tidal current are 3.9 h and 9.1 h, respectively (Zhang et al., 2012). Xuliujing Section is the demarcation point of the South and North Branches and the starting point of multilevel branching in the Changjiang River Estuary. As an important boundary where the Changjiang River enters the ocean, discharge estimation and prediction at Xuliujing Section is important not only for estuarine research itself, but also for water conservancy construction, ecological environment protection, and flood prevention and rescue.
To obtain uniformly sampled and continuous discharge data at Xuliujing Section, a real-time discharge monitoring system was developed and deployed (Figs 5 and 6b). Three upward-facing 300 kHz Workhorse Rio Grande acoustic Doppler current profilers (ADCPs) manufactured by RD Instruments were mounted on the river bed at sites C1, C2, and C3 (hereafter fixed-profile ADCPs). The tidally averaged depths of the three fixed profiles are 14.0 m, 43.5 m, and 9.5 m, respectively. For the fixed-profile ADCP measurements, the depth cell length and sampling interval were set to 1.0 m and 0.5 h, respectively. Because of the extensive cross-section and significant flow variations in the Changjiang River Estuary, both the thalweg and the shallow water areas experience bidirectional flow conditions (Gan et al., 2024). Therefore, a multi-profile method that accounts for the non-synchronized currents and water surface slope is adopted at Xuliujing Section. The methods used to estimate discharge from continuous fixed-profile ADCP data at Xuliujing Section are reported in detail in Zhao et al. (2016). The three forecast models are calibrated and validated using the discharge time series obtained from the multi-profile ADCP data.
Three representative tidal periods have been selected to investigate periodic variations of the discharge prediction and accuracy variations of the models with different forecast durations (Table 1). Although these periods do not correspond exactly to specific tidal cycles, they can be referred to as neap tide, spring tide and middle tide, respectively. Meanwhile, short-term, middle-term and long-term forecasts are made for each tidal cycle. The starting and ending times of the datasets are shown in Table 1. For the harmonic analysis model, the training data end on November 1, 2020. For the BP neural network and LSTM models, the first 80% of each dataset is used for training and the remaining 20% for validation and prediction.
| Forecast duration (lead time) | Neap tide | Spring tide | Middle tide |
| Short term (3 h) | 2020/10/01–2020/11/10 | 2020/10/01–2020/11/18 | 2020/10/01–2020/11/29 |
| Middle term (6 h) | 2020/08/01–2020/11/10 | 2020/08/01–2020/11/18 | 2020/08/01–2020/11/29 |
| Long term (24 h) | 2020/01/01–2020/11/10 | 2020/01/01–2020/11/18 | 2020/01/01–2020/11/29 |
Note: Entries are the starting and ending times (YYYY/MM/DD) of each dataset. All datasets are sampled at half-hour intervals.
The neural network parameters in this study are adjusted empirically according to the evaluation criteria during training. The discharge in the tidal reach spans a wide numerical range, with a large difference between the maximum and minimum values. Therefore, to reduce the impact of the discharge distribution on network training and to improve computational efficiency, the dataset is normalized before training as follows:
$$ y = \frac{(y_{\max} - y_{\min})(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min}, \tag{11} $$
where x is the input data, y is the normalized output data, and the subscripts "min" and "max" represent the minimum and maximum values. In this way, all discharge data can be normalized to [−1, 1] through Eq. (11).
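A short Python sketch of Eq. (11) with y_min = −1 and y_max = 1, as stated above, follows; the function names are illustrative.

```python
import numpy as np

def normalize(x, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Linear rescaling of Eq. (11): maps [x_min, x_max] onto [y_min, y_max]."""
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min

def denormalize(y, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Inverse mapping used to convert model outputs back to m3/s."""
    return (y - y_min) * (x_max - x_min) / (y_max - y_min) + x_min

# The scaling bounds must come from the training split only,
# then be reused for the validation/prediction data.
q_train = np.array([-60000.0, -15000.0, 20000.0, 80000.0])   # illustrative discharges
q_scaled = normalize(q_train, q_train.min(), q_train.max())   # values in [-1, 1]
```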
The classic harmonic analysis is implemented using the T_TIDE package (Pawlowicz et al., 2002). For the 1-month data, it extracts 29 standard constituents, and the hindcast explains 96.2% of the signal variance. The five largest constituents, M2, S2, M4, MS4, and K1, are listed in Table 2.
| Tide constituent | Frequency/(cycle·h–1) | Amplitude/(m3·s–1) | Phase/(°) | Signal-to-noise ratio |
| M2 | 216.67 | 220 | | |
| S2 | 267.85 | 46 | | |
| M4 | 233.32 | 36 | | |
| MS4 | 298.97 | 25 | | |
| K1 | 24.510 | 330 | | |
The input parameters of the PSO-BP and LSTM models include flow velocity, water level, and discharge, while the only output parameter is the predicted discharge. The PSO-BP parameters comprise two parts: the structural parameters of the BP neural network itself and the optimization parameters of the PSO algorithm. The number of BP hidden layers is set to 1, the number of neurons in the hidden layer is set to 128, the learning rate is set to 0.01, and the number of iterations is set to 200. The parameters of the PSO algorithm are presented in Table 3. The input and output lengths of the BP model in this study are the same as those of the LSTM, i.e., the input length is 24/48/144 and the output length is 6/12/48. The different prediction horizons use different input and output sequence lengths, primarily to adapt to the changing characteristics of the discharge and the prediction needs across different time ranges. By adjusting the sequence lengths appropriately, key information can be better captured, leading to improved prediction accuracy.
| Parameter | Value |
| Swarm size | 30 |
| Inertia weight | 0.5 |
| Personal learning factor | 4.494 |
| Social learning factor | 4.494 |
| Maximum velocity | 1.0 |
| Number of iterations | 200 |
| Fitness function | root mean square error |
The LSTM model in this paper is implemented using the PyTorch framework. The torch.nn module is used to construct and train the neural network, torch.optim is employed for parameter optimization, and the argparse library is used to pass parameters to the script. The NumPy library is used for array processing and manipulation, while the Pandas library is employed to load the dataset and perform the normalization. The LSTM model has a single hidden layer with 128 nodes. Specific parameter settings are detailed in Table 4, and a minimal training-loop sketch following these settings is given after the table.
| Parameter | Short term | Middle term | Long term |
| Hidden size | 128 | 128 | 128 |
| Num layers | 1 | 1 | 1 |
| Input dimension | 3 | 3 | / |
| Input length | 24 | 48 | 144 |
| Output dimension | 1 | 1 | 1 |
| Output length | 6 | 12 | 48 |
| Learning rate | 0.01 | 0.01 | 0.01 |
| Target error | 0.001 | 0.001 | 0.001 |
| Batch size | 256 | 256 | 256 |
| Epochs | 200 | 200 | 200 |
| Regularization | 0.001 | 0.001 | 0.001 |
| Activation | ReLU | ReLU | ReLU |
| Optimizer | Adam | Adam | Adam |
Note: For different forecast durations, only the input and output lengths differ. / represents no data.
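As a hedged illustration of how the settings in Table 4 map onto PyTorch, the following sketch trains the hypothetical Seq2SeqLSTM class from the earlier sketch (the plain LSTM model would be trained the same way) with the short-term configuration: Adam, learning rate 0.01, weight-decay regularization of 0.001, batch size 256 and 200 epochs. The random tensors stand in for the real normalized dataset, and the early-stopping criterion tied to the target error of 0.001 is omitted.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Illustrative tensors: 5000 samples, 24 input steps x 3 features -> 6 output steps
x = torch.randn(5000, 24, 3)
y = torch.randn(5000, 6)
loader = DataLoader(TensorDataset(x, y), batch_size=256, shuffle=True)

model = Seq2SeqLSTM(n_features=3, hidden_size=128, out_len=6)  # class from earlier sketch
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0.001)

for epoch in range(200):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)   # training loss on one mini-batch
        loss.backward()
        optimizer.step()
```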
For the Seq2Seq model, the input and output parameters are identical to those of the LSTM model, and it employs the same LSTM network structure and weight parameters to process the input and output sequences. The parameters shared between the Seq2Seq and LSTM models facilitate information exchange between the encoder and decoder, achieving efficient sequence transformation. The input size of the Seq2Seq model's encoder is set to 3, and the output size of the encoder equals the input size of the decoder, which is 1; this ensures coherence between the encoder and decoder. All other parameter settings are consistent with Table 4.
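Assuming the same hypothetical Seq2SeqLSTM class, the long-term configuration described above (encoder input size 3, hidden size 128, decoder input/output size 1, output length 48) could be instantiated as follows.

```python
import torch

# Long-term setting from Table 4: 144 half-hour input steps with 3 features
# are mapped to 48 predicted discharge values.
model = Seq2SeqLSTM(n_features=3, hidden_size=128, out_len=48)
y_hat = model(torch.randn(256, 144, 3))   # y_hat has shape (256, 48)
```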
Several assessment indexes are needed to compare the performance of the forecast models for tidal discharge estimation. In the literature, the root mean square error (RMSE) and relative standard deviation (RSD) are frequently used to evaluate the accuracy of estimated results. They are defined by
$$ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum\limits_{i=1}^{n}\left(x_i - y_i\right)^2}, \tag{12} $$
$$ \mathrm{RSD} = \frac{\mathrm{RMSE}}{\overline{\left|x\right|}} \times 100\%, \tag{13} $$
where n is the total number of samples, xi is the observed value and yi is the estimated value. To distinguish between flood tide and ebb tide, we define the discharge in flood tide as negative and that in ebb tide as positive.
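For reference, the two indexes can be computed directly from the observed and estimated series; the snippet below is a straightforward NumPy transcription of Eqs (12) and (13) with illustrative values.

```python
import numpy as np

def rmse(obs, est):
    """Root mean square error, Eq. (12)."""
    obs, est = np.asarray(obs, float), np.asarray(est, float)
    return np.sqrt(np.mean((obs - est) ** 2))

def rsd(obs, est):
    """Relative standard deviation, Eq. (13): RMSE scaled by the mean |discharge|."""
    return rmse(obs, est) / np.mean(np.abs(obs)) * 100.0

# Sign convention from the text: flood-tide discharge negative, ebb-tide positive
obs = np.array([-42000.0, -8000.0, 15000.0, 51000.0])   # illustrative observations
est = np.array([-40500.0, -9500.0, 14200.0, 53000.0])   # illustrative estimates
print(f"RMSE = {rmse(obs, est):.0f} m3/s, RSD = {rsd(obs, est):.1f}%")
```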
For short-term forecasting, the harmonic analysis (HA) model is trained on the data in Table 1 from the starting time to 3 h before the ending time, and the final 3 h of each dataset serve as the forecasting window. For the PSO-BP, LSTM and Seq2Seq models, the input is the preceding 12 h of data and the output is the following 3 h. The short-term forecast results of the four models, together with the RMSE and RSD, are shown in Fig. 7, and the differences between the predicted and measured values are shown in Fig. 8. The following results can be found.
(1) Since the short-term harmonic analysis model does not account for the influence of long-period tidal components, its estimated values for the neap, middle, and spring tide periods are all higher than the measured values, which leads to overestimation of the tidal discharge.
(2) The prediction curves of the LSTM and Seq2Seq models are closest to the observation curve and are most consistent with the fluctuation of the tidal discharge. In addition, the error distribution of the Seq2Seq model in Fig. 8 is relatively concentrated. The RMSE and RSD of the Seq2Seq model are the smallest among the compared models: its RSD is about 23%–32% smaller than that of the harmonic analysis model and about 5%–10% lower than that of the PSO-BP model. In particular, the accuracy of the Seq2Seq model is about 2% higher than that of the LSTM model. These results indicate that the Seq2Seq model has the best prediction accuracy.
(3) Among the three tidal periods, the RMSE and RSD of the four models are smallest for the neap tide, and the estimated errors in Fig. 8 are minimal. Because the discharge is low and fluctuates little during neap tide, the models forecast discharge most accurately in this period.
For middle-term forecasting, the data used for harmonic analysis modeling were gathered from the starting time to 6 h before the ending time, and the final 6 h of each dataset serve as the forecasting window. For the other three models, the input is the preceding 24 h of data and the output is the following 6 h. The results of the middle-term forecast are shown in Figs 9 and 10. The following results can be seen.
(1) With the increase of prediction duration, the prediction accuracy of the models decreases. The forecasting accuracy of the harmonic analysis model is significantly reduced, with the RMSE exceeding 36 000 m3/s and the RSD reaching 70.5% in the middle tide, and the estimated discharges are significantly greater than the measured ones.
(2) The PSO-BP neural network model does not need to consider a linear relation between time and discharge; it only needs to fit the input values to the output values non-linearly to form an internal functional relationship. Thus, its accuracy in the middle term is a little better than in the short term, and the prediction performance of the PSO-BP model is relatively stable. However, a problem similar to that in the short-term forecast remains: the prediction error of the PSO-BP model obviously increases in spring tide because of the large tidal range.
(3) Compared with the short-term prediction, the prediction error of the Seq2Seq model increases somewhat. Nevertheless, with its RSD 30%–60% lower than that of the harmonic analysis model and 12%–20% lower than that of the PSO-BP neural network model, the Seq2Seq model still has the highest accuracy in all three tidal periods.
For long-term forecasting, the data used for harmonic analysis modeling were collected from the starting time to 24 h before the ending time, and the final 24 h of each dataset serve as the forecasting window. For the other three models, the input is the preceding 72 h of data and the output is the following 24 h. The results of the long-term forecast are shown in Figs 11 and 12. The following results can be seen.
(1) Provided there are enough data for model training, the prediction curves of the four models all agree well with the observation curves, and the RMSE is generally less than 10 000 m3/s. This shows that the prediction results of the four models are reasonably reliable. In particular, the forecasting accuracy of the harmonic analysis model is greatly improved with sufficient data.
(2) The prediction results of the Seq2Seq model are reliable, and its accuracy remains the highest among the four models. The RSD of the Seq2Seq model is 6%–20% lower than that of the harmonic analysis model and about 9%–25% lower than that of the PSO-BP model.
(3) The RSD of the PSO-BP model is 3% higher than that of HA during spring tide; however, during neap and middle tides, the prediction accuracy of the PSO-BP model is significantly better than that of HA. The reason is that the tidal range is small during neap and middle tides and the discharge time series fluctuates less, allowing the PSO-BP method to better capture the variation pattern. Additionally, when processing long data series, the PSO-BP method may suffer from vanishing or exploding gradients, which can result in premature convergence and degrade model training.
The correlation coefficients between the estimated and observed values are also computed to illustrate more clearly the prediction performance of the different models at various tidal stages (flood tide, ebb tide, and slack water). In statistics, the Pearson correlation coefficient (PCC) is expressed as
$$ \mathrm{PCC} = \frac{\displaystyle\sum_{i=1}^{n}\left(X_i - \overline{X}\right)\left(Y_i - \overline{Y}\right)}{\sqrt{\displaystyle\sum_{i=1}^{n}\left(X_i - \overline{X}\right)^2}\sqrt{\displaystyle\sum_{i=1}^{n}\left(Y_i - \overline{Y}\right)^2}}, \tag{14} $$
where n is the number of data points, Xi is the estimated value, Yi is the measured value, and $\overline{X}$ and $\overline{Y}$ are the mean values of the estimated and measured series, respectively.
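Equation (14) can be computed directly, for example with the following short NumPy sketch (equivalent to numpy.corrcoef).

```python
import numpy as np

def pcc(x_est, y_obs):
    """Pearson correlation coefficient, Eq. (14)."""
    x, y = np.asarray(x_est, float), np.asarray(y_obs, float)
    dx, dy = x - x.mean(), y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# Equivalent check: np.corrcoef(x_est, y_obs)[0, 1]
```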
Figure 13 shows the correlation coefficients between the estimated tidal discharge and the measured values for the different models. Compared with the estimation accuracy (RMSE and RSD) shown in Figs 7, 9 and 11, the PCC values lead to the same conclusion. The Seq2Seq model's estimates have the largest PCC for the short-, middle-, and long-term data, and it is the model that best matches the observations. In particular, the Seq2Seq model fits the observed positive maximum, the near-zero minimum, and the negative maximum very well in all sub-graphs, corresponding to ebb tide, slack water, and flood tide, respectively. For hydrological work, precise estimation of the maximum and minimum discharge values in tidal rivers is crucial for flood management and disaster reduction, for building water conservation projects, and for developing and utilizing water resources. This further demonstrates that the Seq2Seq model can play a vital role in hydrological research and water conservancy engineering applications.
In addition, to test the stability of the Seq2Seq model in discharge prediction, another four Seq2Seq models are trained with the same datasets shown in Table 1. The training durations are set to 6 h, 24 h, 48 h, and 96 h, and the testing durations to 3 h, 12 h, 24 h, and 48 h, respectively. The estimated results and the corresponding RMSE and RSD are shown in Fig. 14. The predictions match the observed values for all time lengths. The RMSE is 5 056 m3/s for the 3 h estimation and increases to 5 209 m3/s and 5 643 m3/s for the 24 h and 48 h estimations, respectively. This is mostly caused by the accumulation of estimation errors, which could be alleviated as more sophisticated models and computational techniques are developed and large amounts of observed data are managed more effectively. Moreover, the estimation results become only slightly worse as the forecast period lengthens, with the RSD and PCC varying little across the four cases. Accordingly, all of the outcomes predicted by the Seq2Seq models are suitable for practical applications, such as flood forecasting and warning systems, arrangement of port cargo handling, and salt tide warning.
Long short-term memory networks excel at processing and predicting significant events in time series data with relatively long intervals and delays. Meanwhile, Seq2Seq models are adept at capturing long-term dependencies within sequences and can handle variable-length sequences effectively. Therefore, the combination of these two models is particularly effective at capturing critical tidal characteristics, such as extreme values (e.g., maximum ebb and flood flow), in tidal rivers. In this study, LSTM-based Seq2Seq models are used for the first time to forecast the discharge of the tidal reach in the Changjiang River Estuary. Good performance is obtained for predictions with forecast lead times of up to two days. Notably, the Seq2Seq models achieve substantial improvements, reducing the relative standard deviation by approximately 6%–60% compared with the harmonic analysis model and by 5%–20% compared with the improved BP neural network model. Moreover, the Seq2Seq models exhibit impressive robustness against variations in forecast lead time, ranging from as short as 3 h to as long as 48 h. Significantly, they excel at capturing critical characteristic values within the tidal cycle, including the maximum ebb tide discharge and maximum flood tide discharge. This ability underscores the significance and potential benefits of Seq2Seq models in both scientific research and practical applications.
Amanambu A C, Mossa J, Chen Yin-Hsuen. 2022. Hydrological drought forecasting using a deep transformer model. Water, 14(22): 3611
Anshuka A, Chandra R, Buzacott A J V, et al. 2022. Spatio temporal hydrological extreme forecasting framework using LSTM deep learning model. Stochastic Environmental Research and Risk Assessment, 36(10): 3467–3485
Bai Longhu, Xu Hang. 2021. Accurate estimation of tidal level using bidirectional long short-term memory recurrent neural network. Ocean Engineering, 235: 108765
Cai Huayang, Li Bo, Garel E, et al. 2023. A data-driven model to quantify the impact of river discharge on tide-river dynamics in the Yangtze River estuary. Journal of Hydrology, 620: 129411
Cho K, Van Merriënboer B, Gulcehre C, et al. 2014. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha, The State of Qatar: ACL, 1724–1734
Dennis R E, Long E E. 1971. A user's guide to a computer program for harmonic analysis of data at tidal frequencies. NOAA NOS, 41: 3–11
Foreman M G G. 1977. Manual for tidal heights analysis and prediction. Pacific Marine Science Report. Sidney, BC, Canada: Institute of Ocean Sciences, Patricia Bay, 77–10
Foreman M G G, Henry R F. 1989. The harmonic analysis of tidal model time series. Advances in Water Resources, 12(3): 109–120
Gan Min, Chen Yongping, Pan Haidong, et al. 2024. Study on the spatiotemporal variation of the Yangtze estuarine tidal species. Estuarine, Coastal and Shelf Science, 298: 108637
Harris D L, Pore N A, Cummings R A. 2015. Tide and tidal current prediction by high speed digital computer. The International Hydrographic Review, 42(1): 95–103
Hidayat H, Hoitink A J F, Sassi M G, et al. 2014. Prediction of discharge in a tidal river using artificial neural networks. Journal of Hydrologic Engineering, 19(8): 04014006
Hochreiter S, Schmidhuber J. 1997. Long short-term memory. Neural Computation, 9(8): 1735–1780
Jain M, Saihjpal V, Singh N, et al. 2022. An overview of variants and advancements of PSO algorithm. Applied Sciences, 12(17): 8392
Ji Zhong, Xiong Kailin, Pang Yanwei, et al. 2020. Video summarization with attention-based encoder–decoder networks. IEEE Transactions on Circuits and Systems for Video Technology, 30(6): 1709–1717
Kao I-Feng, Zhou Yanlai, Chang Li-Chiu, et al. 2020. Exploring a long short-term memory based encoder-decoder framework for multi-step-ahead flood forecasting. Journal of Hydrology, 583: 124631
Kennedy J, Eberhart R. 1995. Particle swarm optimization. In: Proceedings of ICNN'95 - International Conference on Neural Networks. Perth: IEEE, 1942–1948
Kratzert F, Herrnegger M, Klotz D, et al. 2019. NeuralHydrology—Interpreting LSTMs in hydrology. In: Samek W, Montavon G, Vedaldi A, et al, eds. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Cham: Springer, 347–362
Kratzert F, Klotz D, Brenner C, et al. 2018. Rainfall–runoff modelling using long short-term memory (LSTM) networks. Hydrology and Earth System Sciences, 22(11): 6005–6022
Lee T L. 2004. Back-propagation neural network for long-term tidal predictions. Ocean Engineering, 31(2): 225–238
Lees T, Reece S, Kratzert F, et al. 2022. Hydrological concept formation inside long short-term memory (LSTM) networks. Hydrology and Earth System Sciences, 26(12): 3079–3101
Matte P, Jay D A, Zaron E D. 2013. Adaptation of classical tidal harmonic analysis to nonstationary tides, with application to river tides. Journal of Atmospheric and Oceanic Technology, 30(3): 569–589
Olah C. 2015. Understanding LSTM networks. https://colah.github.io/posts/2015-08-Understanding-LSTMs/[2015-08]
Pan Haidong, Jiao Shengyi, Xu Tengfei, et al. 2022. Investigation of tidal evolution in the Bohai Sea using the combination of satellite altimeter records and numerical models. Estuarine, Coastal and Shelf Science, 279: 108140
Pan Haidong, Lv Xianqing, Wang Yingying, et al. 2018. Exploration of tidal-fluvial interaction in the Columbia River estuary using S_TIDE. Journal of Geophysical Research: Oceans, 123(9): 6598–6619
Pan Haidong, Xu Tengfei, Wei Zexun. 2023. A modified tidal harmonic analysis model for short-term water level observations. Ocean Modelling, 186: 102251
Pawlowicz R, Beardsley B, Lentz S. 2002. Classical tidal harmonic analysis including error estimates in MATLAB using T_TIDE. Computers & Geosciences, 28(8): 929–937
Rumelhart D E, Hinton G E, Williams R J. 1986. Learning representations by back-propagating errors. Nature, 323(6088): 533–536
Sahoo B B, Jha R, Singh A, et al. 2019. Long short-term memory (LSTM) recurrent neural network for low-flow hydrological time series forecasting. Acta Geophysica, 67(5): 1471–1481
Shin M J, Moon S H, Kang K G, et al. 2020. Analysis of groundwater level variations caused by the changes in groundwater withdrawals using long short-term memory network. Hydrology, 7(3): 64
Sutskever I, Vinyals O, Le Q V. 2014. Sequence to sequence learning with neural networks. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal: MIT Press, 3104–3112
Yin Hanlin, Guo Zilong, Zhang Xiuwei, et al. 2021a. Runoff predictions in ungauged basins using sequence-to-sequence models. Journal of Hydrology, 603: 126975
Yin Hanlin, Zhang Xiuwei, Wang Fandu, et al. 2021b. Rainfall-runoff modeling using LSTM-based multi-state-vector sequence-to-sequence model. Journal of Hydrology, 598: 126378
Yuan Xiaohui, Chen Chen, Lei Xiaohui, et al. 2018. Monthly runoff forecasting based on LSTM–ALO model. Stochastic Environmental Research and Risk Assessment, 32(8): 2199–2212
Zhang E F, Savenije H H G, Chen S L, et al. 2012. An analytical solution for tidal propagation in the Yangtze Estuary, China. Hydrology and Earth System Sciences, 16(9): 3327–3339
Zhang Min, Townend I, Zhou Yunxuan, et al. 2016. Seasonal variation of river and tide energy in the Yangtze Estuary, China. Earth Surface Processes and Landforms, 41(1): 98–116
Zhao Jianhu, Chen Zhigao, Zhang Hongmei, et al. 2016. Multiprofile discharge estimation in the tidal reach of Yangtze Estuary. Journal of Hydraulic Engineering, 142(12): 04016056