Detecting anomalies in log files using the Damerau-Levenshtein distance metric

In recent decades, significant research effort has been put into developing solutions to support automated or semi-automated analysis of log files. A large number of algorithms appeared based on neural networks. This paper introduces a new approach to anomaly detection in log files that does not rely on neural networks. The building blocks of our approach have been well-known in machine learning for a long time. The author proposes to use a weighted Damerau-Levenshtein distance metric to quantify the similarity between log sequences. The author introduces a kNN-based algorithm for semi-supervised log anomaly detection, and an HDBSCAN-based solution for the unsupervised problem. For the latter, he extends the algorithm by incorporating a manual feedback mechanism, enabling human domain experts to modify sequence labels when necessary.

TY - JOURAU - Horvath, GaborAU - Mészáros, AndrásAU - Charaf, KamelAU - Szilágyi, PéterPY - 2026/01/10SP - T1 - Detecting anomalies in log files using the Damerau-Levenshtein distance metricVL - 40DO - 10.1007/s10618-025-01182-8JO - Data Mining and Knowledge DiscoveryER -

For full paper: https://www.researchgate.net/publication/399652190_Detecting_anomalies_in_log_files_using_the_Damerau-Levenshtein_distance_metric

log-parsers-transform-log-lines-to-log-templates

f1-scores-of-the-semi-supervised-algorithm-on-hdfs-as-the-function-of-k-and-the-weighted

f1-scores-of-the-semi-supervised-algorithm-on-bgl-as-the-function-of-k-and-the-weighted

f1-scores-of-the-semi-supervised-algorithm-on-pbs-mom-as-the-function-of-k-and-the

comparison-of-the-semi-supervised-results-with-other-solutions-error-bars-show-95

Call us Zalo
Loading...