From 7d4dcdbd4e71498ecca2519ed576f3ff1343f26c Mon Sep 17 00:00:00 2001 From: "Sridhar K. N. Rao" Date: Thu, 26 Aug 2021 20:27:57 +0530 Subject: [PATCH] This patch adds 2 research studies. Rename reserachstudies Signed-off-by: Sridhar K. N. Rao Change-Id: Ifca6e0ab820b7ebb3e533b2046f6632288be0029 --- research-studies/ml-problems-techniques-nfv.md | 66 ++++++++++++++++++++++++++ research-studies/oss-projects-nfv.md | 16 +++++++ 2 files changed, 82 insertions(+) create mode 100755 research-studies/ml-problems-techniques-nfv.md create mode 100755 research-studies/oss-projects-nfv.md diff --git a/research-studies/ml-problems-techniques-nfv.md b/research-studies/ml-problems-techniques-nfv.md new file mode 100755 index 0000000..63a4ba9 --- /dev/null +++ b/research-studies/ml-problems-techniques-nfv.md @@ -0,0 +1,66 @@ +# Machine Learning Problems and Techniques - State of Art + +| Work | Methods/Algorithms | Data | Inferences | Implementation Details (if any) | Gaps (scope for novelty) | +| ---------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Auto Scaling of Containers: the Impact of Relative and Absolute Metrics. (Emiliano Casalicchio, et al. 2017 \[5\]) | Pearson Correlation | Metrics: Relative CPU utilization,
Absolute CPU utilization.
Events: CPU Intensive workload. | Pearson correlation is restricted to linear relationships between variables.

Time-consuming method.

Unlike Pearson, Spearman's correlation is not restricted to linear relationships. | | Most works in the literature use Pearson correlation. | +| Correlating Events with Time Series for Incident Diagnosis.

(Chen Luo, et al., ACM-2014 \[6\]) | Pearson Correlation,
J-Measure /correlation, | Metrics: CPU usage,
Memory Usage,
Disk Transfer Rate.
Event: CPU Intensive program,
Memory Intensive program,
Disk Intensive Program. | | +| Event Correlation for Log Analysis in the Cloud
(Meera G, et al, IEEE-2016 \[7\]) | Attribute-based correlation,
Conjunctive Correlation,
Disjunctive correlation. | Logs from Openstack. | | +| Causality Inference for Failure in NFV
(Dan Kushnir, et al, IEEE-2016, \[8\]) | Pearson Correlation. | Host high CPU consumption. | | +| A Fault Correlation approach to Detect Performance Anomalies in Virtual Network Function Chains
(Domenico Cotroneo, et al, IEEE-2017 \[9\]) | Pearson Correlation. | Metrics: CPU utilization,
Memory utilization.
Event: Sudden Workload surges | | +| Event Correlation for Intrusion Detection Systems. 2015 | Not Mentioned | | | +| Correlating multiple Events and Data in an Ethernet Network 2017 | | | +| Grey System Correlation-based Feature Selection for Time Series Forecasting, 2018 | | | +| An Outlier Detection Algorithm Based on Cross-Correlation Analysis for Time Series Dataset, 2018 | Pearson correlation | | | +| A Deep Learning Approach to VNF Resource Prediction using Correlation between VNFs NetSoft 2019 | Pearson Correlation,
J-Measure /correlation, | VNF, vCPU, vRAM, SFC History | | +| Enhanced Network Anomaly Detection Based on Deep Neural Networks, 2015 | k-NN, Decision Tree, Autoencoders | NSL-KDD | SVM, DT, KNN, and Random Forest algorithms are commonly used in the existing work.

Challenges: with high-dimensional data it is difficult to estimate distributions, and building a training set with all anomalous traffic properly labelled is the main issue. Other challenges are increasing network complexity, model scalability and transferability, and feature extraction.
| | Many researchers have used machine learning techniques for network management. The literature survey shows that machine learning (classification) techniques have been used to classify network anomalies. | +| Statistical-based Anomaly Detection for NFV Services
(Georgios Xilouris, et al , 2016-IEEE \[1\]) | Multivariate outlier detection methods-
Multiple Linear Regression,
Mahalanobis Distance. | CPU load,
Memory Utilization,
Traffic Processed. | | +| Using Machine Learning to Detect Noisy Neighbours in 5G Networks.

(Udi Margolin, et al. 2016.\[4\]) | Classification method-
Support Vector Machine,
Random Forest. | CPU Utilization of the server VM,
In-bound network traffic of the server VM,
Out-bound n/w traffic of the server VM,
CPU utilization of noisy neighbours. | | | +| Anomaly Detection and Root Cause Localization in Virtual Network Functions 2016 | Random Forest | IMS Clearwater: CPU consumption, memory leaks, anomalous number of disk accesses, network anomaly, heavy workload | | RNN and LSTM are commonly used in the existing work. Existing work shows that an LSTM RNN can detect patterns of malicious traffic efficiently. | +| Towards Black-Box Anomaly Detection in Virtual Network Functions

(Carla Sauvanaud, et al. 2017-IEEE \[2\]) | Classification method-
Random Forest | CPU Consumption, Misuse of Memory, Disk Access,
Packet Loss, Network Latency. | | +| Evaluating Machine Learning Algorithms for Anomaly Detection in Clouds.
(Anton Gulenko, et al. 2017-IEEE \[3\]) | Classification methods-
Support Vector Machine,
Decision Trees,
Random Forest. | CPU Utilization, RAM Utilization,
Hard Disk access in bytes and access time, Disk space per partition, Network utilization in bytes and packets, Number of Processes. | | | +| A multi-layer perceptron approach for flow-based anomaly detection, 2017 | MLP-Multilayer perceptrons, J48 | KDD99 Dataset | RNN and LSTM are commonly used in the existing work. Existing work shows that an LSTM RNN can detect patterns of malicious traffic efficiently. | | Scope of Novelty: Most existing work uses RNN techniques to classify network anomalies. There is scope for implementing new variants of LSTM and encoder-decoder models to improve on the existing results in terms of accuracy. | +| DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning, 2017 | LSTM, RNN | HDFS log data set, OpenStack log data set | https://github.com/wuyifan18/DeepLog | +| An Empirical Evaluation of Deep Learning for Network Anomaly Detection, 2018 | Fully Connected Network (FCN), Variational AutoEncoder (VAE), and Long Short-Term Memory with Sequence to Sequence (LSTM Seq2Seq) | NSL-KDD | [https://github.com/dcstamuc/NetworkAnomaly](https://github.com/dcstamuc/NetworkAnomaly) | A combination of a CNN layer with LSTM can achieve good results, since CNNs are better at feature extraction than LSTMs. | +| Self Adaptive Deep Learning Based System for Anomaly Detection in 5G network, 2018 | SVM, Restricted Boltzmann Machine, | Publicly available CTU dataset. | | | +| Network Traffic Anomaly Detection Using Recurrent Neural Networks, 2018 | RNN, LSTM | ISCX IDS dataset which contains seven days of network activity with normal traffic and four distinct attack patterns. 
| [https://github.com/benradford/replication\_arxiv\_1803\_10769](https://github.com/benradford/replication_arxiv_1803_10769) | A CNN can extract effective features from the data, and the extracted features are given as input to the LSTM to identify the temporal dependencies of the change of the postures during the instance sequence; this method can effectively improve accuracy. | +| Investigating Adversarial Attacks against Network Intrusion Detection Systems in SDNs. IEEE 2019 | Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbour (KNN) | KDD99 Dataset | [https://github.com/sdshayward/AdversarialSDN-Hydra](https://github.com/sdshayward/AdversarialSDN-Hydra) | | +| A Machine Learning based SLA-Aware VNF Anomaly Detection Method in Virtual Networks 2020 | XGBoost, Distributed Random Forest | VNFs in an OpenStack-based NFVI environment | | | +| HitAnomaly: Hierarchical Transformers for Anomaly Detection in System Log, 2020 | Encoder-Decoder, LSTM | Experiments on three datasets: the HDFS dataset, the BGL dataset and the OpenStack dataset | | | +| Proactive VNF Scaling with Heterogeneous Cloud Resources: Fusing Long Short-Term Memory Prediction and Cooperative Allocation. 2020 | LSTM, RNN | CPU, Memory | | | +| Anomaly detection for electricity
consumption in cloud computing:
framework, methods, applications, and
challenges, 2020 | Bayesian Network, SVM, KNN, DNN, RNN | Smart Grid Data | | | +| An Accuracy Network Anomaly Detection Method Based on Ensemble Model, 2021 | Tree-based models: XGBoost, LightGBM and CatBoost | | | | +| ENAD: An Ensemble Framework for Unsupervised Network Anomaly Detection, 2021 | Autoencoders & GAN | Public datasets: UNSW-NB15 and CICIDS2017 | | | +| Network Traffic Anomaly Detection Using Recurrent Neural Networks, 2021 | RNN, LSTM | ISCX IDS dataset which contains seven days of network activity with normal traffic and four distinct attack patterns. | [https://github.com/benradford/replication\_arxiv\_1803\_10769](https://github.com/benradford/replication_arxiv_1803_10769) | | +| SymPerf: Predicting Network Function Performance Felix Rath, et al | SVM, Decision Trees. | CPU cycles. | The ARIMA model is used in existing work and is well suited to short-term time series forecasting. In ARIMA, the current value of the time series is expressed linearly in terms of its previous values and past forecast errors. | | | +| vNMF: Distributed Fault Detection using Clustering Approach for Network Function Virtualization : Masanori Miyazawa, et al 2015 \[10\] | Clustering methods: K-Means, SOM (Self Organizing Map) | Virtual & Physical Infra: CPU Usage, Memory Usage, Disk I/O, Network I/O (Throughput) | For long-term data, RNN and LSTM are well suited, and many existing works use LSTM for forecasting. | | | +| PreFix: Switch Failure Prediction in Data center Networks: YING LIU, et al | Log based failure prediction methods: SKSVM, HSMM, PreFix. | state changes of interfaces, configuration changes, powering down of devices, plugging in/out of a line card, operational maintenance. | LSTM yields better results for long term modeling. 
| [https://github.com/smallcowbaby/PreFix](https://github.com/smallcowbaby/PreFix) | | +| Workload Prediction using ARIMA Model and its impact on Cloud Applications using QoS 2015 | ARIMA | Real traces of requests to Wikimedia web servers. Contains the number of HTTP requests received for each resource. | [https://github.com/AnjieCheng/Wikipedia-Workload-Prediction](https://github.com/AnjieCheng/Wikipedia-Workload-Prediction) | Scope of Novelty: Most of the work has been done using RNN techniques. A novel model should be developed that performs better than the existing work with respect to accuracy. A new variant of the LSTM model could be implemented along with the transformer (attention) mechanism. | +| Predicting network attack patterns in SDN using machine learning approach. IEEE 2016 | Bayesian Network, Naive Bayes, Decision Tree | Public Dataset "LongTail" - SSH brute force attacks. | [https://github.com/Adildangui/CN2-Project](https://github.com/Adildangui/CN2-Project) | | +| Research on the prediction model of CPU utilization based on ARIMA-BP Neural Network. 2016 | ARIMA, Back Propagation Neural Network. 
| CPU metrics | | CAB-LSTM: Failure Prediction Analysis using Correlation Attention Based LSTM | +| Forecasting and anticipating SLO Breaches in Programmable Networks, IEEE 2017 | RNN, LSTM | IMS Clearwater VNF data (Ellis, Bono, Homer, Ralf) | [https://github.com/heekof/Forecasting-ANN](https://github.com/heekof/Forecasting-ANN) | | +| Forecasting and anticipating SLO Breaches in Programmable Networks, IEEE 2017 | Correlation across vPEs, LSTM, Autoencoder | Network Trouble Tickets and VNF syslogs collected from 38 vPEs (Virtualized Provider Edge Routers) | | | +| Deep Learning based Link Failure Mitigation, 2017 | RNN, LSTM | | | | +| Automated IT System Failure Prediction: A deep Learning Approach, 2018 | TF-IDF, RNN, LSTM | Data set collected from two large enterprise clusters: a web server cluster (WSC) and a mailers server cluster (MSC) | | | +| RNN based traffic prediction for pro-active scaling of the AMF. 2018 | DNN, RNN-LSTM | | | | +| A RNN LSTM based predictive autoscaling approach on private cloud. 2018 | RNN, LSTM | OpenStack environment: VM CPU and RAM utilization. | | | +| Cellular Traffic Prediction and Classification: a comparative evaluation of LSTM and ARIMA, 2019 | ARIMA, LSTM | i) time of packet arrival/departure, ii) packet length, iii) whether the packet is uplink or downlink, iv) the source IP address, v) the destination IP address, vi) the communication protocol | | | +| DeepCog: Cognitive Network Management in Sliced 5G Networks with Deep Learning, 2018 | Encoder-Decoder, | service name, service class, traffic % | [https://github.com/wnlUc3m/deepcog](https://github.com/wnlUc3m/deepcog/blob/master/DeepCog.ipynb) | | +| Predicting Computer Network Traffic: A Time Series Forecasting Approach Using DWT, ARIMA and RNN, 2018 | DWT, ARIMA and RNN | Dataset consists of internet
traffic (in bits) as recorded by an ISP and it corresponds to
transatlantic link. | | | +| Network Traffic Prediction based on Diffusion Convolutional Recurrent Neural Networks, 2019 | Diffusion Convolutional Recurrent Neural Network (DCRNN), CNN, LSTM | Backbone Abilene network, | | | +| Interactive Temporal Recurrent Convolution Network for Traffic Prediction in Data Centers, 2018 | ARIMA, GRU, CNN | Yahoo! production workloads to test the effectiveness of different models. | | | +| ST-TrafficNet: A Spatial-Temporal Deep Learning Network for Traffic Forecasting, 2020 | ARIMA, LSTM, Transformers | real-world large-scale benchmark dataset- ST-TrafficNet | | | +| Machine Learning for Performance-Aware Virtual Network Function Placement. 2020 | Decision Tree, CART (Classification and Regression Tree), BACON | Evolved Packet Core (EPC) | | | Many existing works use integer linear programming and SVR techniques for resource allocation. Markov Decision Processes and Q-Learning are also widely used in the literature. | +| Machine learning-driven Scaling and Placement of Virtual Network Functions at the Network Edges, 2019 | Multilayer Perceptron (MLP) | The dataset utilized in this work is generated from a commercial MNO in Armenia by monitoring the mobile network traffic load on 6 LTE base stations, | | +| Optimal VNF Placement via Deep Reinforcement Learning in SDN/NFV-Enabled Networks. 2019 | MDP, Markov Decision Process | | | https://github.com/AlbertoCastelo/resource-allocation-opt https://github.com/MarouenMechtri/algorithms | In RL, an agent learns to make decisions through rewards or penalties received as a result of executing one or more actions. | +| Combining Deep Reinforcement Learning With Graph Neural Networks for Optimal VNF Placement, 2020 | Deep Reinforcement Learning (DRL) | The Reject Ratio of SFC Requests, Influence of Topology Change, Time Consumption | Integer linear programming (ILP) is used in some of the existing work. Machine learning models like SVM, Decision Tree and Q-Learning are used in the literature. 
| | Learning from past experiences is a valuable ability to adapt to variations in the environment (e.g. variation in traffic type, network configuration, etc.) | +| COLAP: A predictive framework for service function chain placement in a multi-cloud environment, 2018 | Support Vector Regression, Random Search | Integrated capacity, vCPUs, Memory, Storage | https://github.com/MarouenMechtri/algorithms | | +| End-to-End Performance-Based Autonomous VNF Placement With Adopted Reinforcement Learning, 2020 | MDP, Markov Decision Process | Normalized CPU load, memory utilization, number of cores, bandwidth of link, network utilization, latency b/w two nodes, packet loss b/w nodes, jitter b/w nodes. | | [https://github.com/CN-UPB/NFVdeep](https://github.com/CN-UPB/NFVdeep) | There is scope for developing a Reinforcement Learning based model that can minimize cost and also improve VNF placement. | +| Deep Q-Learning for Dynamic Reliability Aware NFV-Based Service Provisioning, 2019 | Q-Learning, Deep Q-Learning | | | https://github.com/farismismar/Deep-Q-Learning-SON-Perf-Improvement | | +| Machine Learning Based Method for Predictions of Virtual Network Function Resource Demands, NetSoft 2019 | LSTM | CPU, Memory, Disk, SFC type, SFC History, | RNN and LSTM models are commonly used in existing work | https://github.com/CN-UPB/ml-for-resource-allocation/blob/master/README.md | | +| Novel Approaches for VNF Requirement Prediction using DNN and LSTM, 2019 | DNN, LSTM, Bi-LSTM, CNN-LSTM | Internet Traffic Data Sample, Time, Traffic, Time Stamp | | | +| A long short term memory recurrent neural network framework for Network Traffic Matrix Prediction, 2017 | RNN, LSTM | Real Data from GEANT Network | | | +| NeuTM: A Neural Network Based Framework for Traffic Matrix Prediction in SDN, 2017 | RNN, LSTM | Real Data from GEANT Network | | | +| Data Synthesis based on Generative Adversarial Networks, 2018 | GAN | Time series Text Data | Generative Adversarial Network and LSTM 
are used in Existing work | https://github.com/mahmoodm2/tableGAN | Most of the data generation is done for images and videos. There is a huge scope for generating synthetic time series data using GAN. | +| Efficient GAN based Anomaly Detection, 2018 | GAN | Time Series Data, KDD Data | https://github.com/houssamzenati/Efficient-GAN-Anomaly-Detection | +| Text Generation With LSTM Recurrent Neural Networks in Python with Keras, 2019 | LSTM | KDD Dataset | https://www.tensorflow.org/text/tutorials/text\_generation | +| Prediction Method of Multiple Related Time Series Based on Generative Adversarial Networks | GRU, LSTM, SVR, MTSGAN | Web Traffic Dataset, NOAA China Dataset | https://github.com/Luoyonghong/Multivariate-Time-Series-Imputation-with-Generative-Adversarial-Networks | \ No newline at end of file diff --git a/research-studies/oss-projects-nfv.md b/research-studies/oss-projects-nfv.md new file mode 100755 index 0000000..8ba16ac --- /dev/null +++ b/research-studies/oss-projects-nfv.md @@ -0,0 +1,16 @@ +# OSS Projects related to AI/ML for NFV Usecases + +| Project | Description | Analytical Use Cases | +| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------- | +| PNDA | Open source Platform for Network Data Analytics.
Aggregates data like logs, metrics and network telemetry.
Efficiently distributes data with publish and subscribe model.
Manages lifecycle of applications that process and analyse data. \[11\] | Predictive analytics.
Path Anomaly detection using in-band OAM.
Service Assurance. | +| SNAS | Streaming Network Analytics System is a framework to collect, track and access tens of millions of routing objects in real time. \[13\] | Predictive analysis. | +| DCAE/Holmes
(ONAP) | DCAE is the umbrella name for a number of components collectively fulfilling the role of Data Collection, Analytics, and Events generation for ONAP.
The Holmes project provides alarm correlation and analysis for cloud infrastructure and services, including hosts, VIMs, VNFs and NSs. | Event Correlation
Root Cause analysis | +| Vitrage
(OpenStack) | Vitrage is the OpenStack RCA (Root Cause Analysis) service for organizing, analysing and expanding OpenStack alarms & events, yielding insights regarding the root cause of problems and deducing their existence before they are directly detected. \[15\] | Root Cause Analysis. | +| Acumos AI | Acumos AI is a platform and open source framework that makes it easy to build, share, and deploy AI apps. Acumos standardizes the infrastructure stack and components required to run an out-of-the-box general AI environment. This frees data scientists and model trainers to focus on their core competencies and accelerates innovation. | Traffic Prediction, | +| EDL (Elastic Deep Learning) | EDL optimizes the global utilization of the cluster running deep learning jobs and the waiting time of job submitters. It includes two parts: a Kubernetes controller for the elastic scheduling of distributed deep learning jobs, and a fault-tolerable deep learning framework. | Fault Tolerance | +| AI Explainability 360 | AI Explainability 360 is an open source toolkit that can help users better understand the ways that machine learning models predict labels using a wide variety of techniques throughout the AI application lifecycle. | Predictive analysis. | +| ONNX: Open Neural Network Exchange | With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them. | [Machine Translation, Image Classification](https://github.com/onnx/models#machine_translation) | +| Pyro | Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend. 
Pyro enables flexible and expressive deep probabilistic modeling, unifying the best of modern deep learning and Bayesian modeling. | Deep Generative Models, Time Series (PyTorch) | +| Horovod | Horovod, a distributed training framework for TensorFlow, Keras and PyTorch, improves speed, scale and resource allocation in machine learning training activities. Uber uses Horovod for self-driving vehicles, fraud detection, and trip forecasting. It is also being used by Alibaba, Amazon and NVIDIA. | Fraud Detection, Trip Forecasting | +| Ludwig | Ludwig is a toolbox built on top of TensorFlow that allows users to train and test deep learning models without the need to write code. All you need to provide is your data, a list of fields to use as inputs, and a list of fields to use as outputs; Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict on new data. | | +| Angel ML | Angel is a high-performance distributed machine learning platform. It is tuned for performance with big data from Tencent and has a wide range of applicability and stability, demonstrating increasing advantage in handling higher-dimension models. | Angel offers several deployment options such as Docker, Yarn and Kubernetes, | -- 2.16.6