Floods are among the most devastating natural disasters, causing substantial economic losses and fatalities worldwide. Enhancing the accuracy of flood forecasting models is crucial for mitigating these impacts and supporting early warning systems. However, model performance depends heavily on the length and quality of the training data, and a lack of sufficient historical flood event data can undermine the ability of these models to provide accurate forecasts. To address this, we employ a Time-Series Generative Adversarial Network (TimeGAN) to synthetically generate flood events, enriching the training dataset in both quantity and quality. TimeGAN is trained on nine features, comprising United States Geological Survey (USGS) discharge and meteorological data from the North American Land Data Assimilation System (NLDAS-2) dataset, to produce synthetic events that include both NLDAS-2 meteorological forcings and the corresponding USGS discharge. The augmented dataset, which combines historical and synthetic data, is then used to train a Long Short-Term Memory (LSTM) model to forecast streamflow at various lead times. Additionally, we incorporate a wavelet transform (WT) into the model to decompose observed discharge data and identify underlying trends. Model performance is tested across 24 basins in Southeast Texas, focusing on extreme conditions during Hurricane Harvey. Results indicate that data augmentation improves the model's performance, increasing the average Nash–Sutcliffe Efficiency (NSE) over the 24 basins by approximately 10%, 5%, 19%, and 38% for lead times of one to four days, respectively. These findings demonstrate the model's robustness and applicability in real-world scenarios, highlighting its potential as an effective tool for decision-makers in risk management during extreme events.
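
As context for the reported skill scores, the Nash–Sutcliffe Efficiency compares the forecast error variance to the variance of the observed discharge. With observed discharge Q_t^{obs}, simulated discharge Q_t^{sim}, and the observed mean over the evaluation period (notation introduced here for illustration, not taken from the paper), the standard definition is

\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T}\left(Q_t^{\mathrm{obs}} - Q_t^{\mathrm{sim}}\right)^2}{\sum_{t=1}^{T}\left(Q_t^{\mathrm{obs}} - \bar{Q}^{\mathrm{obs}}\right)^2}

An NSE of 1 corresponds to a perfect forecast, while values at or below 0 indicate performance no better than predicting the observed mean discharge.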