Classification of the quality of groundwater used in agriculture through machine learning with imbalanced data
Groundwater is essential for agriculture in arid coastal zones, where overexploitation and seawater intrusion threaten both its quality and its availability. We used hydrochemical indices and machine learning (ML) models to assess whether groundwater from the Caplina aquifer (southern Peru) is suitable for irrigation. The Irrigation Water Quality Index (IWQI) was calculated from seven major ions: Na⁺, Ca²⁺, Mg²⁺, K⁺, Cl⁻, SO₄²⁻, and HCO₃⁻. The index was also used to train supervised classification algorithms in a binary scheme (critical and non-critical zones). XGBoost was the best-performing model of those evaluated, with an F1 score of 0.897, a ROC-AUC of 0.968, and an accuracy of 0.927 under leave-one-out cross-validation (n = 41). Sensitivity analysis identified Na⁺, Ca²⁺, Mg²⁺, and K⁺ as the most informative predictors, a pattern consistent with ion exchange acting alongside salinization. Model accuracy remained stable when the number of predictors was reduced from seven to four ions, while monitoring costs fell by up to 43%, which indicates that the hydrological monitoring network can be optimized. Combining data-driven modeling with geochemical analysis revealed early signs of salinization in the coastal zone of the estuary. These results demonstrate the value of ML-based approaches as early warning systems for groundwater exploitation in resource-limited areas.
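The evaluation protocol summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the 12/29 class split is hypothetical, and a nearest-centroid classifier stands in for XGBoost so that the example needs only the standard library. All function names are illustrative. The last helper shows that dropping three of seven ions is consistent with the reported saving of up to 43%, assuming cost scales with the number of ions analyzed.

```python
import random


def loocv_predict(X, y, fit, predict):
    """Leave-one-out CV: hold out each sample once, train on the rest."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        preds.append(predict(model, X[i]))
    return preds


def fit_centroids(X, y):
    """Stand-in classifier (NOT XGBoost): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Assign the class whose centroid is closest in squared distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (critical) class; 0.0 if no true positives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predictor_reduction(n_full, n_reduced):
    """Fraction of predictors removed (assumes equal cost per ion)."""
    return (n_full - n_reduced) / n_full


# Synthetic, imbalanced data: n = 41 as in the abstract; the 12/29 split
# and the feature values are hypothetical placeholders for four cations.
rng = random.Random(0)
X, y = [], []
for label, n, centre in ((1, 12, 8.0), (0, 29, 3.0)):
    for _ in range(n):
        X.append([rng.gauss(centre, 1.5) for _ in range(4)])
        y.append(label)

preds = loocv_predict(X, y, fit_centroids, predict_centroid)
print("LOOCV accuracy:", round(accuracy(y, preds), 3))
print("LOOCV F1 (critical class):", round(f1_score(y, preds), 3))
print("Predictor reduction 7 -> 4:", round(100 * predictor_reduction(7, 4), 1), "%")
```

With imbalanced classes, F1 on the critical class is usually more informative than accuracy alone, which is why both are computed here. To approximate the reported setup, the stand-in `fit`/`predict` pair could be replaced with a gradient-boosted classifier.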