1. MTBF prediction for oilfield equipment
- Category: Forecasting systems
- Client: Major oil company
- Project date: 2017
Design a methodology to forecast the Mean Time Between Failures (MTBF) of downhole equipment and establish a maintenance schedule for well operations. Additionally, develop a system to determine the technical operating limit of equipment within a well. The overall objective is to create a predictive system that estimates the MTBF of equipment based on specific parameters for a defined group.
Distribution function
Objective Formulation
Elements of the input data describing oil wells
- Numerical: 'Head', 'Descent depth', 'Production', '% of water (mode)', as well as flow and pressure parameters of the oil wells.
- Categorical (full binarization): 'Reason for previous stop', 'Affiliation', 'Power supply, contractor', 'Pump type', 'Operating mode', 'ESP, manufacturer', 'Protector, manufacturer', 'Submersible electric motor, manufacturer', 'Cable, manufacturer', etc.
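As an illustration of the categorical handling, here is a minimal sketch that assumes "full binarization" means one-hot encoding: each category value becomes its own 0/1 column. The helper name `one_hot`, the sample values, and the choice to encode a missing value as all zeros are assumptions for the example, not details from the project.

```python
def one_hot(rows, column):
    """Replace `column` with one 0/1 indicator column per observed category."""
    categories = sorted({row[column] for row in rows if row[column] is not None})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            # A missing value (None) matches no category, so all indicators stay 0
            new_row[f"{column}={cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded


# Hypothetical records; only the column names come from the feature list above
rows = [
    {"Head": 1200.0, "Pump type": "A"},
    {"Head": 900.0, "Pump type": "B"},
    {"Head": 1000.0, "Pump type": None},
]
encoded = one_hot(rows, "Pump type")
```

Encoding missing categories as all zeros is one simple way to keep incomplete records in the training set; other strategies, such as a dedicated "missing" category, are equally plausible.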
Target parameter: MTBF (in days)
Issue affecting prediction: a large share of missing data
Initial input processing
Features used from additional tables
- Only pump data is used
- Numerical: 'Q value', 'Left zone', 'Right zone', 'Rating F', 'Q', 'H', 'Power', 'Efficiency'
- Categorical (full binarization): 'VENDOR', 'TYPERU', 'TYPEFGN'.
Model used: CatBoost
Result after initial processing (SMAPE): 0.734
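For reference, SMAPE (symmetric mean absolute percentage error) can be computed as in the sketch below. Several SMAPE variants exist, and the source does not say which one was used. This version assumes the common definition whose denominator is the mean of the absolute actual and predicted values, which ranges from 0 to 2; the variant without the halved denominator ranges from 0 to 1.

```python
def smape(actual, predicted):
    """SMAPE with denominator (|y| + |y_hat|) / 2; result lies in [0, 2]."""
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        denom = (abs(y) + abs(y_hat)) / 2
        # When both values are zero the error is zero by convention
        total += abs(y - y_hat) / denom if denom else 0.0
    return total / len(actual)


# Illustrative MTBF values in days (not project data)
score = smape([100, 200], [110, 180])
```

Lower values are better, and 0 means every prediction matches the actual value exactly.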
Two Parameter Weibull Distribution
We used a two-parameter Weibull distribution, which describes the time to failure; its coefficient k characterizes how the failure rate changes over time.
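For readers unfamiliar with the distribution, the standard two-parameter form is given below. The scale symbol \lambda is our notation; the text above names only k.

```latex
f(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1} e^{-(t/\lambda)^{k}}, \qquad t \ge 0,
\qquad
h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}
```

Here f(t) is the density of the time to failure and h(t) is the failure (hazard) rate. For k < 1 the failure rate decreases over time, for k = 1 it is constant (the exponential case), and for k > 1 it increases, which corresponds to wear-out.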
Adding Weibull distribution features by oilfield
- For each oilfield, the parameters of the Weibull distribution (shape factor, expectation, variance) are calculated automatically from the MTBF target values in that oilfield's training sample
- These parameters are added to the training and test datasets according to the oilfield of each record
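The two steps above can be sketched as follows. This is a minimal standard-library illustration, not the project's code: the maximum-likelihood fit by bisection, the field names `oilfield` and `mtbf_days`, the feature names, and the search bounds are all assumptions. In practice a library routine such as SciPy's `weibull_min.fit` could replace `fit_weibull`.

```python
import math
import random
from collections import defaultdict


def fit_weibull(times, iters=100):
    """Maximum-likelihood fit of a two-parameter Weibull; returns (k, lam)."""
    top = max(times)
    scaled = [t / top for t in times]  # rescaling leaves k unchanged, avoids overflow
    logs = [math.log(s) for s in scaled]
    mean_log = sum(logs) / len(logs)

    def score(k):
        # Likelihood equation for the shape; increasing in k, zero at the MLE
        powered = [s ** k for s in scaled]
        weighted = sum(p * l for p, l in zip(powered, logs))
        return weighted / sum(powered) - 1.0 / k - mean_log

    lo, hi = 0.05, 50.0  # assumed search bounds for the shape factor
    for _ in range(iters):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = (lo + hi) / 2
    lam = top * (sum(s ** k for s in scaled) / len(scaled)) ** (1.0 / k)
    return k, lam


def weibull_features(times):
    """Shape factor, expectation and variance of the fitted distribution."""
    k, lam = fit_weibull(times)
    g1, g2 = math.gamma(1 + 1 / k), math.gamma(1 + 2 / k)
    return {"wb_shape": k, "wb_mean": lam * g1, "wb_var": lam ** 2 * (g2 - g1 ** 2)}


def add_weibull_features(train, test, field="oilfield", target="mtbf_days"):
    by_field = defaultdict(list)
    for row in train:  # fit on training targets only, so no test labels leak in
        by_field[row[field]].append(row[target])
    feats = {name: weibull_features(vals) for name, vals in by_field.items()}
    for row in train + test:
        row.update(feats.get(row[field], {}))  # unseen oilfields get no features
    return feats


# Synthetic data for one hypothetical oilfield "A" (true shape 1.5, scale 400)
random.seed(0)
train = [{"oilfield": "A", "mtbf_days": random.weibullvariate(400, 1.5)}
         for _ in range(5000)]
test = [{"oilfield": "A"}]
feats = add_weibull_features(train, test)
```

Fitting only on the training targets and then copying the resulting values into the test rows keeps the test set free of target leakage.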
Result with Weibull features (SMAPE): 0.732
Result after adjusting CatBoost parameters (SMAPE): 0.726
2. Predicting failure of electric traction motors in locomotives
- Category: Forecasting systems
- Client: Industrial company
- Project date: 2019
Using Data to Train Predictive Failure Models
Application of predictive failure models
The developed models allow for making the following predictions:
- Probability of failure during an average of 36 hours of continuous locomotive operation, 1 day after the forecast date, using data from the 6 days preceding the forecast date
- The same, 7 days after the forecast date, using data from the preceding 7 days
- The same, 14 days after the forecast date, using data from the preceding 16 days
- The same, 30 days after the forecast date, using data from the preceding 30 days
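One possible way to turn these horizon and history settings into training samples is sketched below. It assumes that the target is whether a failure falls inside a 36-hour interval starting at the forecast horizon; this is our reading of the description, not a confirmed detail. The function name, the data layout, and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta

# (horizon_days, history_days) pairs taken from the list above
HORIZONS = [(1, 6), (7, 7), (14, 16), (30, 30)]
TARGET_WINDOW = timedelta(hours=36)  # assumed length of the target interval


def make_sample(readings, failures, forecast_time, horizon_days, history_days):
    """readings: list of (timestamp, value); failures: list of timestamps."""
    hist_start = forecast_time - timedelta(days=history_days)
    target_start = forecast_time + timedelta(days=horizon_days)
    target_end = target_start + TARGET_WINDOW
    # Model input: telemetry from the history window before the forecast time
    history = [v for t, v in readings if hist_start <= t < forecast_time]
    # Label: 1 if any failure falls inside the target interval
    label = int(any(target_start <= f < target_end for f in failures))
    return history, label


# Hypothetical hourly readings for 10 days and a single failure event
start = datetime(2019, 1, 1)
readings = [(start + timedelta(hours=h), float(h)) for h in range(240)]
failures = [datetime(2019, 1, 11, 12)]
now = datetime(2019, 1, 10)

history_1d, label_1d = make_sample(readings, failures, now, *HORIZONS[0])
history_7d, label_7d = make_sample(readings, failures, now, *HORIZONS[1])
```

Each horizon would then get its own model trained on samples built this way.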
While analyzing the provided data, we found the following characteristics that affect the final forecast results:
- Data fragmentation
- Outliers in data
- Noise
- Sample bias towards failures
Below are the ROC AUC scores used to assess the quality of the forecast models:
- Forecast in 1 day: ROC AUC 0.69
- Forecast in 7 days: ROC AUC 0.65
- Forecast in 14 days: ROC AUC 0.63
- Forecast in 30 days: ROC AUC 0.59
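To help interpret these scores: ROC AUC is the probability that a randomly chosen failure case receives a higher predicted score than a randomly chosen non-failure case, so 0.5 corresponds to random guessing and 1.0 to perfect ranking. The sketch below computes it directly from that definition. It is illustrative only, and the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = failure) and predicted failure probabilities
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; library implementations such as scikit-learn's `roc_auc_score` use a faster rank-based computation.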
3. Predicting shutdowns in peroxide-grade polypropylene production
- Category: NLP
- Client: CIS countries petrochemical company
- Project date: 2019
Description: In the production of peroxide-grade polypropylene, the final stage involves cutting the granulate. A common problem is that agglomerates stick to the knives and clog the space between the die and the knives. As a result, the knives gradually move away from the die, which degrades the process and eventually shuts down the equipment, causing significant production losses. The degradation can be monitored indirectly by observing the presence of agglomerates on the vibrating screen. A full year of telemetry data is available, which makes it possible to predict process degradation in advance, typically within an hour. Extruder downtime data is also recorded.
General task: Utilize this data to develop and deploy a predictive system capable of anticipating equipment shutdowns.
As part of the pilot project, we successfully implemented and trained LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) models.
Obtained results (LSTM):
- Training accuracy: 0.97
- Test accuracy: 0.89
Obtained results (GRU):
- Training accuracy: 1.000
- Test accuracy: 0.87
These results can be further enhanced through continued collaboration with domain specialists and by implementing more advanced data preprocessing techniques.