GlobalML ∘ Machine learning ∘ Computer vision ∘ Data analysis ∘ Artificial intelligence

Portfolio Details

1. MTBF prediction for oilfield equipment

Category: Forecasting systems
Client: Major oil company
Project date: 2017

Design a methodology to forecast the Mean Time Between Failures (MTBF) of downhole equipment and establish a maintenance schedule for well operations. Additionally, develop a system to determine the technical operating limit of equipment within a well. The overall objective is to create a predictive system that estimates the MTBF of equipment based on specific parameters for a defined group.

Distribution function

Objective Formulation

Elements of the input data describing oil wells

Numerical: 'Head', 'Descent depth', 'Production', '% of water (mode)', as well as input data for the flow and pressure parameters of the oil wells.
Categorical (full binarization): 'Reason for previous stop', 'Affiliation', 'Power supply, contractor', 'Pump type', 'Operating mode', 'ESP, manufacturer', 'Protector, manufacturer', 'Submersible electric motor, manufacturer', 'Cable, manufacturer', etc.
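
The "full binarization" of categorical fields can be read as one-hot encoding: each observed category value becomes its own 0/1 column. Below is a minimal sketch under that assumption. The `binarize` helper, the sample rows, and the category values are hypothetical and not taken from the project.

```python
def binarize(rows, categorical_columns):
    """One-hot encode the given columns; all other columns pass through unchanged."""
    # Collect the sorted set of observed (non-missing) values per categorical column.
    levels = {
        col: sorted({row[col] for row in rows if row.get(col) is not None})
        for col in categorical_columns
    }
    encoded = []
    for row in rows:
        out = {key: value for key, value in row.items() if key not in levels}
        for col, values in levels.items():
            for value in values:
                # A missing category leaves every indicator of that column at 0.
                out[f"{col}={value}"] = 1 if row.get(col) == value else 0
        encoded.append(out)
    return encoded


# Hypothetical well records, for illustration only.
wells = [
    {"Head": 1500.0, "Pump type": "A", "Operating mode": "continuous"},
    {"Head": 1800.0, "Pump type": "B", "Operating mode": "periodic"},
    {"Head": 1650.0, "Pump type": "A", "Operating mode": None},
]
encoded_wells = binarize(wells, ["Pump type", "Operating mode"])
```

Encoding a missing category as all zeros is one simple way to keep records with gaps in the training set.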

Target parameter: MTBF (in days)

Issue affecting prediction: a large amount of missing data

Initial input processing

Features used from additional tables

Only pump data is used
Numerical: 'Q value', 'Left zone', 'Right zone', 'Rating F', 'Q', 'H', 'Power', 'Efficiency'
Categorical (full binarization): 'VENDOR', 'TYPERU', 'TYPEFGN'.

Model used: CatBoost

Result after initial processing (SMAPE): 0.734
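
For reference, SMAPE (symmetric mean absolute percentage error) can be computed as sketched below. Several variants of the metric exist; this sketch assumes the form that divides by the mean of the absolute actual and predicted values, giving a range of 0 to 2. Which variant was used for the figures reported here is not stated, so treat the function as illustrative.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in the range [0, 2]."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = (abs(a) + abs(p)) / 2.0
        # Treat a pair of zeros as a perfect prediction instead of dividing by zero.
        total += 0.0 if denominator == 0 else abs(a - p) / denominator
    return total / len(actual)
```

Lower values are better: a perfect forecast scores 0, and a forecast of 300 days against an actual MTBF of 100 days scores 1.0.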

Two Parameter Weibull Distribution

We used the two-parameter Weibull distribution, which describes the time to failure. Its shape coefficient k characterizes how the failure rate changes over time.
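
In standard notation, with shape k and scale λ (the symbol λ is introduced here for illustration and does not come from the project description), the density, failure rate, mean, and variance of the distribution are:

```latex
f(t; k, \lambda) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}
                   \exp\!\left[-\left(\frac{t}{\lambda}\right)^{k}\right], \qquad t \ge 0

h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}

\mathbb{E}[T] = \lambda\,\Gamma\!\left(1 + \tfrac{1}{k}\right), \qquad
\operatorname{Var}[T] = \lambda^{2}\left[\Gamma\!\left(1 + \tfrac{2}{k}\right)
                        - \Gamma\!\left(1 + \tfrac{1}{k}\right)^{2}\right]
```

A value of k < 1 corresponds to a failure rate that decreases over time, k = 1 to a constant rate, and k > 1 to a rate that increases as equipment ages.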

Adding Weibull distribution features by oilfield

For each oilfield, the parameters of the Weibull distribution (shape factor, expectation, variance) are calculated automatically from the MTBF target values in that oilfield's training sample
These parameters are then added to the training and test datasets according to each record's oilfield
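
The two steps above can be sketched as follows. This is a minimal illustration that assumes a maximum-likelihood fit solved by bisection; the fitting method actually used in the project is not stated. The function names and the `oilfield` and `mtbf_days` keys are hypothetical.

```python
import math
from collections import defaultdict


def fit_weibull(times):
    """Maximum-likelihood fit of a two-parameter Weibull; returns (shape k, scale lam)."""
    if len(times) < 2 or min(times) <= 0 or min(times) == max(times):
        raise ValueError("need at least two distinct positive failure times")
    t_max = max(times)
    scaled = [t / t_max for t in times]  # rescale to (0, 1] so powers cannot overflow
    mean_log = sum(math.log(s) for s in scaled) / len(scaled)

    def score(k):
        # Profile-likelihood equation for k; it is increasing in k and has a single root.
        weights = [s ** k for s in scaled]
        weighted_log = sum(w * math.log(s) for w, s in zip(weights, scaled))
        return weighted_log / sum(weights) - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):  # plain bisection
        mid = (lo + hi) / 2.0
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2.0
    lam = t_max * (sum(s ** k for s in scaled) / len(scaled)) ** (1.0 / k)
    return k, lam


def weibull_features(times):
    """Shape factor, expectation, and variance of the fitted distribution."""
    k, lam = fit_weibull(times)
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return {
        "weibull_shape": k,
        "weibull_mean": lam * g1,
        "weibull_var": lam ** 2 * (g2 - g1 ** 2),
    }


def add_weibull_features(train_rows, test_rows, field_key="oilfield", target_key="mtbf_days"):
    """Fit per oilfield on the training targets only, then attach to both datasets."""
    targets_by_field = defaultdict(list)
    for row in train_rows:
        targets_by_field[row[field_key]].append(row[target_key])
    features = {field: weibull_features(t) for field, t in targets_by_field.items()}

    def attach(rows):
        # Rows from an oilfield unseen in training receive no Weibull columns.
        return [{**row, **features.get(row[field_key], {})} for row in rows]

    return attach(train_rows), attach(test_rows)
```

Fitting on the training targets only keeps test-set MTBF values from leaking into the features.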

Result with Weibull features (SMAPE): 0.732

Result after tuning CatBoost parameters (SMAPE): 0.726

2. Predicting failure of electric traction motors in locomotives

Category: Forecasting systems
Client: Industrial company
Project date: 2019

Using Data to Train Predictive Failure Models

Application of predictive failure models

The developed models produce the following forecasts:

Probability of failure during an average of 36 hours of continuous locomotive operation, 1 day after the forecast date, using data from the 6 days before the forecast date
The same, 7 days after the forecast date, using data from the preceding 7 days
The same, 14 days after the forecast date, using data from the preceding 16 days
The same, 30 days after the forecast date, using data from the preceding 30 days
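
Each forecast above pairs a look-back window with a forecast horizon. The sketch below shows one way such training samples could be assembled from daily data. It rests on simplifying assumptions: one aggregated value per day, integer day indices, and a label defined as "a failure is recorded on the target day". The (horizon, look-back) pairs come from the list above; everything else is hypothetical.

```python
# (horizon in days, look-back in days) pairs from the four forecasts listed above.
FORECAST_CONFIGS = [(1, 6), (7, 7), (14, 16), (30, 30)]


def build_samples(daily_values, failure_days, lookback_days, horizon_days):
    """Return (forecast_day, history, label) tuples for one locomotive.

    daily_values: list of per-day feature values, indexed by day number.
    failure_days: set of day numbers on which a failure was recorded.
    """
    samples = []
    n_days = len(daily_values)
    # The forecast day needs a full look-back window behind it
    # and an observed target day ahead of it.
    for forecast_day in range(lookback_days, n_days - horizon_days):
        history = daily_values[forecast_day - lookback_days:forecast_day]
        target_day = forecast_day + horizon_days
        label = 1 if target_day in failure_days else 0
        samples.append((forecast_day, history, label))
    return samples
```

Longer horizons leave fewer usable forecast days per locomotive, which adds to the data fragmentation problem.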

Analysis of the provided data revealed the following characteristics that affect forecast quality:

Data fragmentation
Outliers in data
Noise
Sample bias towards failures

Below are the ROC AUC metrics for assessing the quality of the forecast models:

Forecast in 1 day: ROC AUC 0.69
Forecast in 7 days: ROC AUC 0.65
Forecast in 14 days: ROC AUC 0.63
Forecast in 30 days: ROC AUC 0.59
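
ROC AUC is the probability that a randomly chosen failure case receives a higher predicted score than a randomly chosen non-failure case, so 0.5 corresponds to random ranking and 1.0 to perfect ranking. A minimal rank-based sketch of the metric is shown below for reference. It is illustrative only, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney) formula, with ties given average ranks."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Assign 1-based ranks by ascending score; tied scores share their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for index in order[i:j + 1]:
            ranks[index] = average_rank
        i = j + 1

    positive_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (positive_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

Because it depends only on the ranking of scores, the metric is less sensitive to class imbalance than plain accuracy.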

3. Predicting shutdowns in peroxide-grade polypropylene production

Category: NLP
Client: CIS countries petrochemical company
Project date: 2019

Description: In the production of peroxide-grade polypropylene, the final stage involves cutting the granulate. A common issue arises when agglomerates stick to the knives and clog the space between the die and the knives. As a result, the knives gradually move away from the die, which degrades the process and eventually shuts down the equipment, causing significant production losses. The degradation can be monitored indirectly by observing agglomerates on the vibrating screen. A wealth of telemetry data is available, offering the potential to predict process degradation in advance, typically within an hour. Extruder downtime data is also recorded. The available telemetry spans an entire year.

General task: Utilize this data to develop and deploy a predictive system capable of anticipating equipment shutdowns.

As part of the pilot project, we successfully implemented and trained LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) models.
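
Recurrent models such as LSTM and GRU are trained on fixed-length sequences of readings, each paired with a label. The sketch below shows one way the telemetry and downtime records could be turned into such pairs. It assumes regularly sampled readings and a label meaning "a shutdown starts within the next `horizon` samples". The function name, parameters, and labelling rule are illustrative assumptions, not the preprocessing used in the pilot.

```python
from bisect import bisect_right


def make_sequences(readings, shutdown_indices, window, horizon):
    """Build (sequence, label) pairs from regularly sampled telemetry.

    readings: list of feature vectors, one per sampling step.
    shutdown_indices: sample indices at which a shutdown starts.
    window: number of consecutive readings in each input sequence.
    horizon: how many samples ahead a shutdown still counts as a positive.
    """
    shutdowns = sorted(shutdown_indices)
    sequences, labels = [], []
    for end in range(window, len(readings) + 1):
        last = end - 1  # index of the most recent reading in the window
        # Find the first shutdown strictly after the last observed reading.
        position = bisect_right(shutdowns, last)
        upcoming = position < len(shutdowns) and shutdowns[position] - last <= horizon
        sequences.append(readings[end - window:end])
        labels.append(1 if upcoming else 0)
    return sequences, labels
```

With one reading per minute, `horizon=60` would correspond to the one-hour lead time mentioned in the description. Windows that overlap the downtime itself are not filtered out here and would likely need separate handling.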

Obtained results (LSTM):

Training accuracy: 0.97
Test accuracy: 0.89

Obtained results (GRU):

Training accuracy: 1.000
Test accuracy: 0.87

These results can be further enhanced through continued collaboration with domain specialists and by implementing more advanced data preprocessing techniques.