Anomaly detection: Important steps and how to choose the right method for your asset management
September 21, 2021
Here at Yodiwo, our business partners frequently ask us which anomaly detection method is best suited to asset management. Having been facility managers for a long time, they recognise the importance of monitoring, identifying, recording and analysing abnormal behaviour in building assets, and of being informed about it in a timely manner. It is therefore only natural that they are looking for the most efficient and autonomous anomaly detection systems, which will help them make their business more competitive in every possible way.
Over the past decades, traditional, non-automated supervised learning anomaly detection methods relied on a relatively small amount of measurement data, collected while the facility equipment was operating. A dedicated team analysed these metrics manually to draw conclusions about the health of the machines and to decide on an action plan for repair and maintenance. Unfortunately, these tools focused primarily on repairing damage that had already occurred; they could not help O&M (Operations & Maintenance) technicians foresee future malfunctions, nor help AMs (Asset Managers) discover patterns of energy inefficiency.
Today, supervised learning techniques have been improved by incorporating AI (artificial intelligence) capabilities. Nevertheless, in many cases they still cannot be considered adequate, not only because there are usually thousands of metrics available for analysis, which demands many labour hours and staff, but also because modern business needs have changed. Reducing operational costs and optimizing processes in real time are so critical to business viability that facility managers are constantly under pressure. The answer to their problems came with the development of user-friendly, semi-supervised and unsupervised anomaly detection systems, which can provide the information needed to make the right data-driven decisions.
Semi-supervised learning methods use both “labelled” data (marked as anomalous, defective or outlying) and “unlabelled” data. They work well where collections of verified anomalous data are available and can be used, together with uncategorised data, to train the autonomous decision-making system.
In most cases, however, no such input-to-output mapping is available beforehand. This is why most facility managers ask for smart asset management platforms with unsupervised anomaly detection capabilities, where unlabelled data from the real application is used to build a unique, facility-specific anomaly detection system.
For data to be categorised as anomalous, they should occur rarely and show significantly abnormal characteristics. Anomalies can be either univariate (where the observation concerns a single characteristic or attribute) or multivariate (where multiple features seem anomalous when considered together, even if none of them holds an unusual value individually). Today, as the complexity of facilities increases and the metrics per system component are usually multi-dimensional, anomaly detection systems should be designed to deal with multivariate, intercorrelated anomalies.
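To make the univariate/multivariate distinction concrete, here is a minimal sketch using the Mahalanobis distance, which measures how far a point lies from the mean of the data while accounting for the correlation between features. This technique is not named in the article and is chosen only for illustration; the function name, the readings (outdoor temperature in °C paired with cooling power in kW) and the query points are all hypothetical.

```python
import statistics

def mahalanobis_2d(points, query):
    """Mahalanobis distance of `query` from a set of 2-D points.

    Uses the sample covariance matrix of `points`, inverted in closed form.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = query[0] - mx, query[1] - my
    # Quadratic form d^T * inverse(covariance) * d for the 2x2 case.
    d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d_squared ** 0.5

# Hypothetical history: (outdoor temperature, cooling power), strongly correlated.
history = [(20, 2.0), (22, 2.5), (24, 2.9), (26, 3.6),
           (28, 4.0), (30, 4.4), (32, 5.1), (34, 5.5)]

normal_query = (33, 5.2)  # hot day, high cooling power: follows the pattern
odd_query = (32, 2.4)     # hot day, low cooling power: each value alone is ordinary

d_normal = mahalanobis_2d(history, normal_query)
d_odd = mahalanobis_2d(history, odd_query)
print(f"normal: {d_normal:.2f}, odd: {d_odd:.2f}")
```

Both values of the second query lie well inside the ranges seen in the history, so a check on each feature separately would pass it; only the combination breaks the learned relationship and produces a large distance.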
Anomalies can be further divided into:
1. Point Outliers: individual values that fall far away from the rest of the data collection
2. Contextual Outliers: values that are abnormal only in a specific context (for example, a given time of day or operating mode) and may otherwise be dismissed as noise
3. Collective Outliers: groups of related data points that are anomalous as a whole, even if the individual points are not
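As a rough illustration of a contextual outlier, the sketch below computes a Z-score for each reading against the other readings that share its context, instead of against the whole dataset. The context labels ("day", "night"), the power readings in kW and the function name are hypothetical, and the threshold of 3 standard deviations is a common convention rather than a recommendation from this article.

```python
import statistics
from collections import defaultdict

def contextual_outliers(records, threshold=3.0):
    """Return indices of (context, value) records whose value deviates from
    the mean of its own context by more than `threshold` standard deviations."""
    groups = defaultdict(list)
    for context, value in records:
        groups[context].append(value)
    # Per-context mean and population standard deviation.
    stats = {
        context: (statistics.fmean(values), statistics.pstdev(values))
        for context, values in groups.items()
    }
    flagged = []
    for index, (context, value) in enumerate(records):
        mean, std = stats[context]
        if std > 0 and abs(value - mean) / std > threshold:
            flagged.append(index)
    return flagged

# Hypothetical power readings: about 5 kW is normal by day, about 1 kW by night.
day = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 5.0, 5.1]
night = [1.0, 1.1, 0.9, 1.0, 1.2, 0.9, 1.1, 1.0, 1.0, 5.0, 1.1, 0.9]
records = [("day", v) for v in day] + [("night", v) for v in night]

flagged = contextual_outliers(records)
print(flagged)
```

The night-time reading of 5.0 kW is the one that stands out, although the same value is perfectly ordinary among the daytime readings. In practice the context would come from timestamps, schedules or operating modes rather than a hand-written label.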
Since we are talking about complex systems, all inputs, processes and outputs are interwoven. Therefore, the quality of the data should be considered just as important as the credibility of the categorisation and prediction models.
Most frequently, pre-processing steps such as normalization or first-level filtering are used to improve the quality of the data. Further steps may then be applied to convert the data into an appropriate form before they are given as input to the machine learning algorithms.
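The two pre-processing steps mentioned above could look roughly like the sketch below: a sliding median filter as one possible form of first-level filtering, followed by min-max normalization to the range [0, 1]. The function names and sample readings are hypothetical, and the sketch assumes that isolated single-sample spikes are sensor glitches; if such spikes are themselves the anomalies of interest, this kind of filter would hide them and should not be applied before detection.

```python
import statistics

def median_filter(values, window=3):
    """Replace each value with the median of a window centred on it.

    The window is truncated at both ends of the series; `window` should be odd.
    """
    half = window // 2
    filtered = []
    for i in range(len(values)):
        start = max(0, i - half)
        end = min(len(values), i + half + 1)
        filtered.append(statistics.median(values[start:end]))
    return filtered

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant series: nothing to scale
    return [(v - low) / (high - low) for v in values]

# Hypothetical raw sensor readings with a single-sample glitch (55.0).
raw = [10.0, 10.2, 55.0, 10.1, 10.3, 10.2]

smoothed = median_filter(raw)
scaled = min_max_normalize(smoothed)
print(smoothed)
print(scaled)
```

Normalization matters most when several metrics with different units feed the same model, since otherwise the metric with the largest numeric range tends to dominate any distance calculation.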
Although it may seem that an ideal facility management platform would use only entirely unsupervised processes, there must be room for flexibility. The selected anomaly detection system should also allow datasets to be fed in manually when needed, to optimize training.
As for the machine learning algorithm itself, there are many methods to choose from. Below is a selection of such anomaly detection techniques:
- Statistical anomaly detection methods, such as Z-score or Interquartile Range (IQR)
- Tree-based anomaly detection methods, such as Isolation Forest or Random Forest
- Clustering-based anomaly detection methods, such as Cluster Based Local Outlier Factor (CBLOF), Histogram-Based Outlier Detection (HBOS) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- Classification-based anomaly detection methods, such as Bayesian Networks, Rule-Based Methods or Artificial Neural Networks
- One-class learning based methods, such as state-of-the-art Autoencoders
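The statistical methods at the top of this list are simple enough to sketch with the Python standard library alone. The example below is only an illustration: the function names and temperature readings are hypothetical, and the thresholds (3 standard deviations, 1.5 times the IQR) are common conventions rather than values taken from this article.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical: nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical room temperature readings in degrees Celsius with one abnormal value.
readings = [21.0, 21.4, 20.8, 21.1, 21.3, 20.9, 21.2, 21.0, 21.1, 20.7,
            21.5, 21.0, 20.9, 21.2, 21.1, 35.0, 21.0, 21.3, 20.8, 21.1]

z_flagged = zscore_outliers(readings)
iqr_flagged = iqr_outliers(readings)
print(z_flagged, iqr_flagged)
```

One caveat with the Z-score: the outlier itself inflates the mean and standard deviation, so on very short series even an extreme value may not reach the threshold. The IQR rule relies on quartiles and is less sensitive to this effect.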
So, which one is the best?
It really depends on the specific characteristics and the complexity of the facility. If the applications can be described by simple stochastic models or binary decision trees, then a simple Z-score or Isolation Forest method could be sufficient. On the other hand, if the datasets present a high level of grouping, then a clustering-based technique should be more suitable. Moreover, if there is a large volume of multi-dimensional, intercorrelated data, a more elaborate method, such as Autoencoders, would be preferable.
In recent years, the development of IoT platforms has enabled facility managers to get real-time access to sensors, equipment, digital systems and control operations. Valuable data can be continuously collected and analysed, to move beyond traditional reactive approaches and provide reliable, predictive analysis of assets. Every single aspect of business activity can be duly measured and all critical KPIs can be meticulously monitored.
Therefore, asset managers are in a position to confidently advise facility owners on how to run their buildings optimally, both technically and financially.
At Yodiwo, we are passionate about employing elaborate anomaly detection methods as part of our IoT platform (YodiFEMP), to maximize each application’s performance and enhance existing asset management practices. Should you need any suggestions on how to tackle the critical issue of anomaly detection, we’d be happy to discuss your ideas and concerns during a 30-minute free consultation call.