Data preprocessing for anomaly based network intrusion detection : a review
Data preprocessing is widely recognized as an important stage in anomaly detection. This paper reviews the data preprocessing techniques used by anomaly-based network intrusion detection systems (NIDS), concentrating on which aspects of the network traffic are analyzed, and what feature construction and selection methods have been used. Motivation for the paper comes from the large impact data preprocessing has on the accuracy and capability of anomaly-based NIDS. The review finds that many NIDS limit their view of network traffic to the TCP/IP packet headers. Time-based statistics can be derived from these headers to detect network scans, network worm behavior, and denial of service attacks. A number of other NIDS perform deeper inspection of request packets to detect attacks against network services and network applications. More recent approaches analyze full service responses to detect attacks targeting clients. The review covers a wide range of NIDS, highlighting which classes of attack are detectable by each of these approaches.
Data preprocessing is found to predominantly rely on expert domain knowledge for identifying the most relevant parts of network traffic and for constructing the initial candidate set of traffic features. On the other hand, automated methods have been widely used for feature extraction to reduce data dimensionality, and feature selection to find the most relevant subset of features from this candidate set. The review shows a trend toward deeper packet inspection to construct more relevant features through targeted content parsing. These context sensitive features are required to detect current attacks.
Impact and interest:
Citation counts are sourced monthly from and citation databases.
Citations counts from theindexing service can be viewed at the linked Google Scholar™ search.
|Item Type:||Journal Article|
|Subjects:||Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Pattern Recognition and Data Mining (080109)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > COMPUTER SOFTWARE (080300) > Computer System Security (080303)
|Divisions:||Past > Institutes > Information Security Institute|
|Copyright Owner:||Copyright 2011 Elsevier Advanced technology|
|Deposited On:||15 Sep 2011 05:04|
|Last Modified:||13 Dec 2011 00:02|
Repository Staff Only: item control page