Abstract:
Big Data is an active business across the world. With the growing size of data
come many challenges connected with handling and ensuring the security of
huge data. In this paper, we propose a Network Intrusion Detection System (NIDS)
model based on a Random Forests (RF) classifier for anomaly detection in the collected
network traffic. In order to decrease the computational time connected with the
bulk of captured data, we utilize the Hadoop, MapReduce and Spark frameworks, which
have proven to be among the most efficient and fault-tolerant systems. We use the
NSL-KDD Cup 99 dataset to perform experimental analysis and Non-dominated
Sorting Genetic Algorithm-II (NSGA-II) for feature selection over this dataset.