Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows

Source: arXiv
Abstract

This study investigates the efficacy of machine learning models in network security threat detection through the critical lens of partial versus complete flow information, addressing a common gap between research settings and real-time operational needs. We systematically evaluate how a standard benchmark model, Random Forest, performs under varying training and testing conditions (complete/complete, partial/partial, complete/partial), quantifying the performance impact when dealing with the incomplete data typical in real-time environments. Our findings demonstrate a significant performance difference, with precision and recall dropping by up to 30% under certain conditions when models trained on complete flows are tested against partial flows. The study also reveals that, for the evaluated dataset and model, a minimum threshold around 7 packets in the test set appears necessary for maintaining reliable detection rates, providing valuable, quantified insights for developing more realistic real-time detection strategies.
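The distinction between complete and partial flows can be illustrated with a minimal sketch. It is not the authors' pipeline: the flow representation (a list of timestamp and size pairs), the helper names `truncate_flow` and `flow_features`, and the chosen statistics are all assumptions made for illustration. The cutoff of 7 packets mirrors the threshold mentioned in the abstract.

```python
from statistics import mean

# Assumption: a flow is a list of (timestamp_seconds, size_bytes) packets.
# The helper names and feature set below are hypothetical, not from the paper.

def truncate_flow(packets, max_packets=None):
    """Return the first max_packets packets; None keeps the complete flow."""
    if max_packets is None:
        return list(packets)
    return list(packets[:max_packets])

def flow_features(packets):
    """Compute simple per-flow statistics from the packets seen so far."""
    if not packets:
        raise ValueError("flow has no packets")
    sizes = [size for _, size in packets]
    times = [ts for ts, _ in packets]
    return {
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "duration": times[-1] - times[0],
    }

# Made-up example flow of 10 packets.
flow = [(0.00, 60), (0.01, 60), (0.02, 52), (0.10, 1500), (0.11, 1500),
        (0.12, 1500), (0.30, 400), (0.90, 52), (1.50, 52), (1.51, 52)]

complete = flow_features(truncate_flow(flow))                 # all 10 packets
partial = flow_features(truncate_flow(flow, max_packets=7))   # first 7 only
```

Features computed from the truncated flow differ from those of the complete flow (shorter duration, fewer bytes), which is one plausible reason a model trained on complete-flow statistics could degrade when it is scored on partial flows.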

Adrian Pekar, Richard Jozsa

Computing technology, computer technology

Adrian Pekar, Richard Jozsa. Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows [EB/OL]. (2025-06-30) [2025-07-16]. https://arxiv.org/abs/2407.02856.