[3] Anomaly detection finds application in many domains including cybersecurity, medicine, machine vision, statistics, neuroscience, law enforcement and financial fraud to name only a few.
Anomalies were initially searched for clear rejection or omission from the data to aid statistical analysis, for example to compute the mean or standard deviation.
[4] By the late 1970s and early 1980s, the analysis of these logs was primarily used retrospectively to investigate incidents, as the volume of data made it impractical for real-time monitoring.
The affordability of digital storage eventually led to audit logs being analyzed online, with specialized programs being developed to sift through the data.
[4] As the field has continued to develop, the focus has shifted to creating solutions that can be efficiently implemented across large and complex network environments, adapting to the ever-growing variety of security threats and the dynamic nature of modern computing infrastructures.
[13][14] With the advent of deep learning technologies, methods using Convolutional Neural Networks (CNNs) and Simple Recurrent Units (SRUs) have shown significant promise in identifying unusual activities or behaviors in video data.
[13] These models can process and analyze extensive video feeds in real-time, recognizing patterns that deviate from the norm, which may indicate potential security threats or safety violations.
[17] Detection anomalies can help identify and pre-empt potential performance degradations or system failures, thus maintaining productivity and business process effectiveness.
[21] Aljameel et al. propose an advanced machine learning-based model for detecting minor leaks in oil and gas pipelines, a task traditional methods may miss.
Some of the popular techniques are mentioned below and are broken down into categories: Also referred to as frequency-based or counting-based, the simplest non-parametric anomaly detection method is to build a histogram with the training data or a set of known normal instances, and if a test point does not fall in any of the histogram bins mark it as anomalous, or assign an anomaly score to test data based on the height of the bin it falls in.
Instances in low probability areas of the distribution are then considered anomalies[25].Histogram-based Outlier Score (HBOS) uses value histograms and assumes feature independence for fast predictions.
[55] Dynamic networks, such as those representing financial systems, social media interactions, and transportation infrastructure, are subject to constant change, making anomaly detection within them a complex task.