Realizing Nilfisk's vision of detecting backorders with an Early Warning System.

CLIENT
- Nilfisk
SERVICES USED
- Data Engineering
- Data Science

Challenge

We worked with Nilfisk to develop an Early Warning System (EWS) to reduce backorders, a business risk that was identified through data insights.

Nilfisk had laid the groundwork with its first common-format storage service (a so-called "data lake"). The first "bathing bridge" into the lake, that is, its first use case, was to be an Early Warning System in which many common data sources would indicate the likelihood of backorders in production and their impact on Nilfisk.

Solution

Algorithm

The models have been implemented using a machine learning algorithm called a Gradient Boosting Machine (GBM). Random forest, artificial neural nets and support vector machines were also considered as potential solutions; however, GBM was found to offer superior performance.
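To illustrate the idea behind a GBM, the following is a minimal sketch in plain Python, not the implementation used in the project. It assumes a single hypothetical feature (days of stock cover), made-up toy data, one-split decision stumps as weak learners and a logistic loss. Each round fits a stump to the current prediction errors and adds a small correction to the score.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) of the best least-squares split."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    return best[1:]

def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.3):
    """Gradient boosting for binary labels (1 = backorder) with logistic loss."""
    rate = sum(ys) / len(ys)
    base_score = math.log(rate / (1 - rate))  # start from the overall log-odds
    scores = [base_score] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss: label minus predicted probability.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        scores = [s + learning_rate * (lm if x <= t else rm)
                  for x, s in zip(xs, scores)]
    return base_score, stumps, learning_rate

def predict_proba(model, x):
    """Predicted probability of a backorder for feature value x."""
    base_score, stumps, learning_rate = model
    score = base_score + sum(learning_rate * (lm if x <= t else rm)
                             for t, lm, rm in stumps)
    return sigmoid(score)

# Hypothetical toy data: days of stock cover and whether a backorder occurred.
stock_cover_days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
had_backorder = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
model = fit_gbm(stock_cover_days, had_backorder)
```

Under these assumptions, `predict_proba(model, 1)` comes out higher than `predict_proba(model, 10)`, because low stock cover coincides with backorders in the toy data.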

Classification Model

The classification model achieved an accuracy of around 86%. The model mostly predicted non-backorder outcomes, because that is the most likely outcome. For the purposes of the EWS, however, the more relevant statistic is whether it assigns a significantly higher probability of backorder to the plant-material combinations that eventually had a backorder. Here, FALSE is a backorder and TRUE a non-backorder outcome. The distribution of probabilities for the backorder outcomes lies significantly below that of the non-backorder outcomes.
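The difference between headline accuracy and the separation of predicted probabilities can be sketched as follows. The records are invented for illustration and are not Nilfisk data. The sketch assumes the model outputs a backorder probability per plant-material combination, which may differ from the labelling convention used in the project.

```python
from statistics import median

# Hypothetical records: (had_backorder, predicted_backorder_probability).
records = [
    (False, 0.05), (False, 0.10), (False, 0.08), (False, 0.20), (False, 0.12),
    (False, 0.15), (False, 0.07), (False, 0.30), (True, 0.45), (True, 0.35),
]

def accuracy(records, cutoff=0.5):
    """Share of records where thresholding the probability matches the outcome."""
    correct = sum((prob >= cutoff) == actual for actual, prob in records)
    return correct / len(records)

def median_probability_by_outcome(records):
    """Median predicted backorder probability for backorders and non-backorders."""
    backorder = [p for actual, p in records if actual]
    no_backorder = [p for actual, p in records if not actual]
    return median(backorder), median(no_backorder)

acc = accuracy(records)
median_backorder, median_no_backorder = median_probability_by_outcome(records)
```

In this toy data the model never crosses the 0.5 cutoff, so it always predicts "no backorder". It still scores well on accuracy because backorders are rare. The two medians show that it nevertheless ranks the real backorders as riskier, which is the property an early warning system relies on.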

Validation

The EWS models have been validated using 8-fold cross-validation, repeated 4 times. This means that the whole set of training data has been split into 8 parts (folds) and the model has been trained in 8 different runs. In each run, 7 of the 8 folds have been used to train the algorithm and the remaining fold has been used to compute the achieved accuracy/fit. With cross-validation, all the data is used for both training (7 times) and validation (once, when not used for training). This process has been repeated 4 times, each with a new random cut into eighths, to avoid any random selection effects. The purpose of the cross-validation is to ensure that the models hold beyond the data they have been trained on (i.e. to prevent “overfitting”).
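The splitting scheme described above can be sketched in plain Python. This is a minimal illustration, not the project's code. The function name, the sample count of 40 and the fixed seed are assumptions chosen for the example.

```python
import random

def repeated_kfold_indices(n_samples, n_folds=8, n_repeats=4, seed=0):
    """Yield (train_indices, validation_indices) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random cut into folds for each repeat
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for held_out in range(n_folds):
            validation = folds[held_out]
            # Train on the other 7 folds, validate on the held-out one.
            train = [i for k, fold in enumerate(folds) if k != held_out
                     for i in fold]
            yield train, validation

splits = list(repeated_kfold_indices(n_samples=40))
```

With 8 folds and 4 repeats this yields 32 train/validation pairs. Within each repeat, every sample appears in the validation fold exactly once and in the training set 7 times. A model would be trained and scored once per pair, and the scores averaged.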

Results

The model successfully produced data-driven early warnings. In doing so, it provided:

- advice on problems in the supply chain, supply, components, etc.
- root cause analysis, with an overview of causes and (possibly) their interactions
- an overview and prioritization of focus areas with high impact.

Quantitative business results were significant, but are not open for disclosure.