Machine learning in geology

July 29th, 2019 | Mining in Focus

In recent years there has been a great deal of excitement in geological and mining circles about machine learning and its best-known subfield, deep learning. Nicolaas C. Steenkamp and Breton Scott investigate this phenomenon.

Over the past few years there has been increasing interest in Machine Learning (ML), and in deep learning in particular, especially among exploration geologists. Because of the large volume of complex information on the subject, it has become challenging to understand whether, and where, this tool fits into consulting businesses or mining operations.

ML has been around since the 1980s, but commercial use has only gained traction in the last four years thanks to affordable, high-performance Graphics Processing Units (GPUs). Easy access to cloud computing also makes complex computations fast and cheap enough for anyone outside a research organisation to run.

Poor quality and blurry core tray photos will be useless for any image classification and consequently it would be a waste of time. Image credit: Leon Louw

ML has also become one of the main topics when discussing the fourth industrial revolution. Concerns have, however, also been expressed that this will contribute to even further job losses in the mining sector due to the shift towards automation and mechanisation.

Why the hype?

The question remains: is there substance to the hype? According to the Gartner Hype Cycle, ML is still near the ‘peak of inflated expectations’. It is currently on the downward slope towards the ‘trough of disillusionment’ and is not expected to reach the ‘plateau of productivity’ for another two to five years. It should also be noted that ML is not the same as Artificial Intelligence (AI). ML focuses on data processing techniques that solve a specific problem for an AI system or a human. AI combines different techniques, including ML, to solve complex, multifaceted problems that only humans can solve today.

The main consideration when undertaking an ML-based geological project is whether the available data are of the right type, representativeness and quality to solve the problem. For example, poor quality, blurry core tray photos are useless for image classification, and working with them would be a waste of time. The old saying ‘garbage in, garbage out’ holds true for ML.

Care should also be taken to select the correct method, as some techniques are only suitable for particular types of data.

Tabular data is referred to as ‘structured data’ in this type of application. The most common geoscience examples are core logs, multi-element assays and other geochemistry results from assays, grab samples and stockpile samples, captured in Excel spreadsheets or SQL databases. Other types of applicable geological data include LIDAR point clouds, hyperspectral images and seismic sections, to name a few.

Along with predicting the presence of metals in rocks, physical properties combined with machine learning have the potential to classify lithologies, characterise hydrothermal alteration, and estimate exploration vectors and geotechnical information in the drill core. Image credit: IMD library

Common challenges in developing ML databases relate mainly to source and format. The data tend to be at multiple resolutions in space and time, with varying degrees of noise, incompleteness and uncertainty. Many ML methods also rest on assumptions, such as that variables are independent and identically distributed, which geoscience data often violate. A prime example is structural geophysics, where variables are related to each other in space and time unless there is a discontinuity, such as a fault, across which autocorrelation ceases to persist. Cognisance of the spatio-temporal autocorrelation in geoscience data collected in continuous media is crucial for the effective modelling of geophysical phenomena.
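
One practical consequence of this autocorrelation is that neighbouring samples from the same drill hole tend to resemble each other, so a purely random split of samples can make a model look better than it really is. A minimal sketch of one common remedy, holding out whole drill holes for validation, might look like the following. The hole IDs, depths and assay values are invented purely for illustration, and `leave_one_hole_out` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Hypothetical samples: (drill hole ID, depth in metres, assay value).
samples = [
    ("DH001", 10.0, 0.2), ("DH001", 11.0, 0.3), ("DH001", 12.0, 0.3),
    ("DH002", 10.0, 1.1), ("DH002", 11.0, 1.4),
    ("DH003", 10.0, 0.1), ("DH003", 11.0, 0.1), ("DH003", 12.0, 0.2),
]


def leave_one_hole_out(rows):
    """Yield (held_out_hole, train_rows, validation_rows), one fold per hole."""
    by_hole = defaultdict(list)
    for row in rows:
        by_hole[row[0]].append(row)
    for hole in sorted(by_hole):
        # Training rows come only from the other holes, so no sample in the
        # validation set has an autocorrelated neighbour in the training set.
        train = [r for r in rows if r[0] != hole]
        yield hole, train, by_hole[hole]


folds = list(leave_one_hole_out(samples))
for hole, train, valid in folds:
    print(f"hold out {hole}: {len(train)} training rows, {len(valid)} validation rows")
```

Splitting by hole is only one option; where holes are closely spaced, grouping by spatial block or by fault-bounded domain may be more appropriate.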

A holistic interpretation

ML has the ability to link, combine and process different types of data together, making it easier to synthesise a holistic interpretation. A simple application is to import photographs of a core tray or a mining face. The ML model will then attempt to crop the core out of the tray or read the core blocks.

In the case of the face photograph, ML will attempt to domain ore versus waste zones. In a supervised learning approach, the operator shows the ML model the most likely solution using training and validation datasets. The operator can then assess the accuracy of the solution and improve the model in a variety of ways to increase the prediction accuracy. A simple improvement may involve examining the labelled data for mistakes and correcting the labels. The main challenges with supervised learning are the initially small sample size and the lack of established standards.
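
The supervised workflow described above can be sketched in a few lines of plain Python. This is an illustration only: the brightness and saturation features, their values and the labels are all invented, and the nearest-centroid classifier is a deliberately simple stand-in for whatever model a real project would use. The helper names `fit_centroids`, `predict` and `accuracy` are hypothetical.

```python
import math

# Hypothetical labelled patches from a face photograph:
# (mean brightness, mean saturation) with a label assigned by a geologist.
training = [
    ((0.82, 0.10), "waste"), ((0.78, 0.15), "waste"), ((0.85, 0.12), "waste"),
    ((0.35, 0.60), "ore"), ((0.40, 0.55), "ore"), ((0.30, 0.65), "ore"),
]
# Labelled patches kept aside to check the model; never used for fitting.
validation = [
    ((0.80, 0.14), "waste"), ((0.38, 0.58), "ore"),
    ((0.75, 0.20), "waste"), ((0.45, 0.50), "ore"),
]


def fit_centroids(rows):
    """'Train' by averaging the feature vectors of each labelled class."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label of the closest class centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the geologist's label."""
    hits = sum(predict(centroids, f) == label for f, label in rows)
    return hits / len(rows)


centroids = fit_centroids(training)
val_accuracy = accuracy(centroids, validation)
print(f"validation accuracy: {val_accuracy:.2f}")
```

If the validation accuracy is poor, re-checking the labels in `training`, as suggested above, is often a cheaper first step than changing the model.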

There are two main approaches: the first is simple neural networks, and the second is deep learning networks. A simple way to explain the difference is that the simple neural network approach will entail trying to do many different computations simultaneously. The deep learning network, on the other hand, breaks up the computations into separate steps with each layer learning something different based on the output of the previous layer. This hierarchy of representations seems to enable deep learning to predict better on new data than the simple neural network.
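
To make the idea of layers building on one another concrete, the sketch below passes a single input through two small layers in plain Python. The weights are made-up numbers rather than learned values, and `dense` and `forward` are hypothetical helper names; the point is only that each layer takes the previous layer's output as its input.

```python
def dense(inputs, weights, biases):
    """One fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Run the input through each layer in turn, keeping every intermediate output."""
    activations = [inputs]
    for weights, biases in layers:
        # Each layer only ever sees the output of the layer before it.
        activations.append(dense(activations[-1], weights, biases))
    return activations


# Illustrative (not learned) weights: a 2-to-2 layer followed by a 2-to-1 layer.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]

activations = forward([2.0, 1.0], layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {values}")
```

Adding more entries to `layers` makes the network deeper without changing `forward`; in a real network the weights would be learned from training data rather than typed in.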

Another approach, the one behind ‘deep fakes’, is the generative adversarial network (GAN), which uses a double network system. The generator tries to fool another network, the discriminator. The discriminator penalises the generator for obviously fake information, so the generator learns to produce better and better fake information in each cycle. This application is currently being developed extensively by the oil and gas industry to evaluate seismic sections and velocity models.
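
The contest between generator and discriminator described above is usually written as a single objective in the literature on generative adversarial networks (GANs). The formulation below is the standard textbook one, not something specific to the seismic applications mentioned here, and the symbols are introduced only for this illustration: x is a real sample (for example, a real seismic section), z is random noise, G(z) is the generator's fake sample, and D(.) is the discriminator's estimate of the probability that its input is real.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make this value large by scoring real samples near 1 and fakes near 0, while the generator tries to make it small by producing fakes that the discriminator scores as real.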

Other industry applications currently in development deal with:

  • Classifying various characteristics using spatial data obtained from GIS systems;
  • Using neural networks or support vector machines (SVMs) to analyse temporal signals, such as those from seismometers, to predict phenomena such as earthquakes and tsunamis;
  • Predicting landslides using seismic data;
  • Mineral exploration using remote sensing data, where several ML algorithms, such as decision trees and neural networks, have worked well; and
  • Subsurface characterisation using various acoustic signals, where some form of ML is used for specific problems such as detecting types of minerals, folds and fracturing.

Case study

Technology will have a significant impact on the way exploration will be done in future. Image credit: Tanzanian Royalty

Compared with highly visible mineralisation, such as massive, semi-massive and disseminated mineralisation (for example, base metal mineralisation), it is harder to develop an ML process for trace mineralisation. A project was undertaken to identify gold mineralisation in core using geophysical results. Gold distribution in drill core is not homogeneous and is subject to high local variability (the nugget effect), which makes ore body modelling difficult. Because the presence of gold in rocks is usually associated with specific rock formations (for example, banded iron formation or intrusive rocks), alteration and the presence of veins, information on rock composition is critical to the prediction of gold mineralisation.

The input data was derived from neutron activation and natural gamma measurements. The team used a hand-held XRF to measure the variability of the major elements. Six machine learning algorithms were used to predict the presence of mineralisation. Results indicated that integrating a set of rock physical properties, measured at closely spaced intervals along the drill core, with ensemble machine learning algorithms allows the detection of gold-bearing intervals with an adequate rate of success. In future, this type of tool will help geologists select sound intervals for assay sampling, which in turn could increase the reserve, and help them model more continuous ore bodies over the entire life of a mine.
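
The article does not name the six algorithms used, so the sketch below does not reproduce the study. It only illustrates the general idea of an ensemble: several weak predictors, each looking at one measured property, vote on whether an interval is gold-bearing. The property names, values and labels are invented, and `fit_stump`, `stump_predict` and `ensemble_predict` are hypothetical helpers.

```python
from statistics import mean

# Hypothetical drill-core intervals. Features, in order:
# natural gamma (cps), XRF iron (%), XRF potassium (%).
# Label: 1 if the assay later showed gold, 0 if not.
intervals = [
    ((120.0, 8.0, 3.1), 1),
    ((115.0, 7.5, 2.9), 1),
    ((130.0, 9.0, 3.4), 1),
    ((60.0, 3.0, 1.2), 0),
    ((70.0, 4.0, 1.0), 0),
    ((65.0, 8.5, 1.1), 0),  # barren interval with misleadingly high iron
]


def fit_stump(rows, feature):
    """Fit a one-feature threshold rule at the midpoint of the two class means."""
    pos = mean(x[feature] for x, y in rows if y == 1)
    neg = mean(x[feature] for x, y in rows if y == 0)
    return feature, (pos + neg) / 2, pos > neg


def stump_predict(stump, x):
    """Return 1 if the value falls on the gold-bearing side of the threshold."""
    feature, threshold, high_is_gold = stump
    return int((x[feature] > threshold) == high_is_gold)


def ensemble_predict(stumps, x):
    """Majority vote across all stumps."""
    votes = sum(stump_predict(s, x) for s in stumps)
    return int(votes * 2 > len(stumps))


stumps = [fit_stump(intervals, f) for f in range(3)]

new_interval = (110.0, 5.0, 2.8)
individual_votes = [stump_predict(s, new_interval) for s in stumps]
print("individual votes:", individual_votes)
print("ensemble prediction:", ensemble_predict(stumps, new_interval))
```

For this invented interval the iron-based rule disagrees with the other two, and the majority vote overrides it. That tolerance of a single misleading predictor is the main reason ensembles are attractive for noisy data such as gold assays.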

Along with predicting the presence of metals in rocks, physical properties combined with machine learning have the potential to classify lithologies, characterise hydrothermal alteration, and estimate exploration vectors and geotechnical information in the drill core. The success rate of predictions should increase as more data are collected. This method should be applied from the very beginning of the exploration stage (which means starting from the discovery hole) so that the initial model can be trained and continuously updated with new drill holes.

About the Authors

Breton Scott has over two decades of post-qualification experience in the mining and project engineering industry. He has been involved in a variety of activities ranging from mine operations, project management, mining and rock engineering, and mineral asset valuations to due diligence, EPCM contracts and related feasibility studies.

Nicolaas C. Steenkamp has a decade and a half of post-qualification experience in the geological and geotechnical industry. He has been involved in a variety of activities ranging from exploration, geochemistry, and geological and geotechnical desktop studies to due diligence, EPCM contracts and related feasibility studies.

Bowline Professional Services offers a wide range of geological, mining and industry related services.