Document Type: Original Research Paper


1 School of Mining Engineering, University of the Witwatersrand, Johannesburg, South Africa

2 Sibanye-Stillwater Digital Mining Laboratory (DigiMine), Wits Mining Institute (WMI), University of the Witwatersrand, Johannesburg, South Africa

3 School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa

4 Wits Institute of Data Science, University of the Witwatersrand, Johannesburg, South Africa


The distribution of stream sediments is widely regarded as an important tool for early-stage, regional-scale mineral exploration. However, collecting stream sediment samples is both time-consuming and costly. Advances in satellite remote sensing have made spectral reflectance a suitable alternative for mapping geochemical elements. In this research work, 407 surface stream sediment samples from Central Wales, analysed for zinc (Zn) and lead (Pb), are used. Five machine learning models, namely Support Vector Regression (SVR), Generalized Linear Model (GLM), Deep Neural Network (DNN), Decision Tree (DT), and Random Forest (RF) regression, are applied to predict the Zn and Pb concentrations from Sentinel-2 multi-spectral satellite images. The results obtained at 10 m spatial resolution show that Zn is best predicted by RF, with significant R2 values of 0.74 (p < 0.01) and 0.7 (p < 0.01) for training and testing, respectively. For Pb, the best prediction is made by SVR, with significant R2 values of 0.72 (p < 0.01) and 0.64 (p < 0.01) for training and testing, respectively. Overall, SVR and RF outperform the other machine learning models, achieving the highest testing R2 values.
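The training and testing R2 values reported above can be illustrated with a minimal sketch. It only shows how a coefficient of determination is computed separately on a training set and a held-out testing set. Everything in it is an assumption made for illustration: the data are synthetic, the 70/30 split and the seed are arbitrary, the helper names (`r2_score`, `fit_line`, `holdout_r2`) are hypothetical, and a one-variable least-squares line stands in for the five models used in the study.

```python
import random


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    A simple stand-in regressor, not one of the study's models.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def holdout_r2(x, y, test_fraction=0.3, seed=0):
    """Shuffle, split into training and testing sets, return (train R2, test R2).

    The split fraction is an assumption; the abstract does not state one.
    """
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    intercept, slope = fit_line([x[i] for i in train_idx],
                                [y[i] for i in train_idx])
    scores = []
    for part in (train_idx, test_idx):
        obs = [y[i] for i in part]
        pred = [intercept + slope * x[i] for i in part]
        scores.append(r2_score(obs, pred))
    return tuple(scores)


# Synthetic, made-up data (NOT the study's samples): one reflectance-like
# predictor and a noisy concentration-like response. The count of 407 only
# mirrors the number of samples mentioned in the abstract.
rng = random.Random(42)
reflectance = [rng.uniform(0.05, 0.40) for _ in range(407)]
concentration = [50.0 + 300.0 * r + rng.gauss(0.0, 15.0) for r in reflectance]

train_r2, test_r2 = holdout_r2(reflectance, concentration)
print(f"training R2 = {train_r2:.2f}, testing R2 = {test_r2:.2f}")
```

Comparing the two numbers is the point of the exercise: a testing R2 well below the training R2 suggests that a model has fitted noise in the training samples rather than a relationship that generalizes.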

