Prediction of Optimum Water Content of soil using advanced machine learning methods: A comparative study of RF, SVM, and ANN models
DOI: https://doi.org/10.15625/2615-9783/23484
Keywords: Optimum Water Content (OWC), advanced machine learning methods, RF, SVM, ANN, soil compaction, Partial Dependence Plot (PDP)
Abstract
In construction, adequate soil compaction is essential for the strength and stability of geotechnical structures, and the Optimum Water Content (OWC) is a critical compaction parameter. Traditional laboratory methods for determining the OWC are accurate but time-consuming and resource-intensive. This study investigates the potential of three advanced machine learning methods, Random Forest (RF), Support Vector Machines (SVM), and an Artificial Neural Network with a Multilayer Perceptron architecture (ANN-MLP), to predict the OWC of soil using a curated dataset of over 214 soil samples collected from the Van Don - Mong Cai expressway construction project (Vietnam). The models were developed using input factors such as specific gravity, grain size distribution, organic content, and Atterberg limits. Of the three approaches, the RF model performed best (R² = 0.84, RMSE = 1.07%, MAE = 0.78%), ahead of SVM (R² = 0.63, RMSE = 1.65%, MAE = 1.17%) and ANN-MLP (R² = 0.44, RMSE = 2.02%, MAE = 1.61%). Partial Dependence Plot (PDP) analysis further identified fines content, plasticity indices, and organic matter as the factors with the greatest influence on the model's predictions. The findings indicate that the RF model offers an accurate and efficient tool for estimating the OWC of soil, with the potential to reduce reliance on extensive laboratory testing and support faster, data-driven geotechnical decision-making.
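The three scores quoted in the abstract (R², RMSE, MAE) can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions, with R² taken as the coefficient of determination; the abstract does not state which R² definition the authors used. The function names and the OWC values below are hypothetical and are not taken from the study's dataset.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (% water content)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the target (% water content)."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Hypothetical OWC values in percent, for illustration only (not study data).
owc_measured = [12.0, 14.0, 16.0, 18.0]
owc_predicted = [13.0, 14.0, 15.0, 18.0]

print(f"R2   = {r2_score(owc_measured, owc_predicted):.2f}")
print(f"RMSE = {rmse(owc_measured, owc_predicted):.2f}%")
print(f"MAE  = {mae(owc_measured, owc_predicted):.2f}%")
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE does, which is consistent with every model in the abstract reporting an RMSE larger than its MAE.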
