SWaT
SWaT is a key asset for researchers aiming at the design of safe and secure cyber-physical systems (CPS). The testbed consists of a modern six-stage water treatment process that closely mimics a real-world treatment plant. Stage 1 of the physical process takes in raw water, followed by chemical dosing (Stage 2), filtering through an Ultrafiltration (UF) system (Stage 3), dechlorination using UV lamps (Stage 4), and feeding into a Reverse Osmosis (RO) system (Stage 5). A backwash process (Stage 6) cleans the UF membranes using the RO permeate. The SWaT dataset covers 11 days of continuous operation, of which 7 days' worth of data was collected under normal operation and 4 days' worth of data was collected under attack scenarios.
The dataset contains 79 features with a total of 14,996 data samples.
Data preprocessing:
We encode the categorical variables using LabelEncoder.
We normalize the dataset using MinMaxScaler.
We split the dataset into a training set and a validation set.
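A minimal preprocessing sketch in Python using pandas and scikit-learn. The file path, the "Normal/Attack" label column name, and the 80/20 split ratio are assumptions for illustration, not details taken from this report:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split

df = pd.read_csv("swat_dataset.csv")  # hypothetical path to the SWaT CSV export

# Encode categorical columns (e.g., actuator states, attack labels) with LabelEncoder.
for col in df.select_dtypes(include="object").columns:
    df[col] = LabelEncoder().fit_transform(df[col])

# Separate features and label; "Normal/Attack" as the label column is an assumption.
y = df["Normal/Attack"].values
X = df.drop(columns=["Normal/Attack"]).values

# Scale all features into [0, 1] with MinMaxScaler.
X = MinMaxScaler().fit_transform(X)

# Split into training and validation sets; 80/20 without shuffling (to preserve
# the time order) is an assumption.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, shuffle=False)
```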
Training Process
Considering the time-series nature of the SWaT data, we choose an LSTM as the training model.
Our model uses one LSTM layer, one dropout layer, and one fully connected layer; we use the sigmoid function as the output activation.
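A sketch of this architecture in PyTorch; the hidden size and dropout rate below are assumptions, since the section does not specify them:

```python
import torch
import torch.nn as nn

class SwatLSTM(nn.Module):
    """One LSTM layer, one dropout layer, one fully connected layer, sigmoid output."""
    def __init__(self, n_features, hidden_size=64, dropout=0.2):  # sizes are assumptions
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        out = self.dropout(out[:, -1, :])    # keep the last time step
        return torch.sigmoid(self.fc(out))   # probability of the positive (attack) class
```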
Dataloader:
We pack the data into a TensorDataset and divide it into batches with a batch size of 60.
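A possible DataLoader setup, assuming the preprocessed arrays from the sketch above; treating each record as a length-1 sequence is a simplifying assumption:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Pack the preprocessed arrays into TensorDatasets and batch them (batch size 60).
train_ds = TensorDataset(torch.tensor(X_train, dtype=torch.float32).unsqueeze(1),
                         torch.tensor(y_train, dtype=torch.float32))
val_ds   = TensorDataset(torch.tensor(X_val, dtype=torch.float32).unsqueeze(1),
                         torch.tensor(y_val, dtype=torch.float32))

train_loader = DataLoader(train_ds, batch_size=60, shuffle=False)
val_loader   = DataLoader(val_ds, batch_size=60, shuffle=False)
```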
For the training:
Number of epochs: 20
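A minimal training loop for 20 epochs, reusing the SwatLSTM model and the loaders from the sketches above; the loss function, optimizer, and learning rate are assumptions, since the section does not state them:

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = SwatLSTM(n_features=X_train.shape[1])
criterion = nn.BCELoss()                              # binary cross-entropy (assumption)
optimizer = optim.Adam(model.parameters(), lr=1e-3)   # optimizer and lr are assumptions

for epoch in range(20):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        pred = model(xb).squeeze(-1)
        loss = criterion(pred, yb)
        loss.backward()
        optimizer.step()

    # Validation accuracy at the end of each epoch (threshold 0.5).
    model.eval()
    with torch.no_grad():
        correct = total = 0
        for xb, yb in val_loader:
            pred = (model(xb).squeeze(-1) > 0.5).float()
            correct += (pred == yb).sum().item()
            total += yb.numel()
    print(f"epoch {epoch + 1}: val accuracy = {correct / total:.3f}")
```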
Evaluation: After training for 20 epochs, the final accuracy reaches 0.95 on the validation set and 0.933 on the training set.
Streamlining continual model updating:
To address concept drift, we introduce a continual model update framework.
The model update strategy is inspired by the concept of online learning. The framework is composed of two components.
Runtime Profiler:
This component logs and profiles the training time of each model update. Given the dataset size, we can predict the approximate training time using linear regression.
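A sketch of such a profiler; the class and method names are hypothetical, and scikit-learn's LinearRegression stands in for the regression model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

class RuntimeProfiler:
    """Logs (dataset size, training time) pairs and fits a linear model to predict
    the training time of future updates from the amount of data."""
    def __init__(self):
        self.sizes, self.times = [], []
        self.reg = LinearRegression()

    def log(self, dataset_size, training_time):
        self.sizes.append(dataset_size)
        self.times.append(training_time)
        if len(self.sizes) >= 2:  # need at least two points to fit a line
            self.reg.fit(np.array(self.sizes).reshape(-1, 1), np.array(self.times))

    def predict(self, dataset_size):
        if len(self.sizes) < 2:
            return self.times[-1] if self.times else 0.0  # fallback before fitting
        return float(self.reg.predict(np.array([[dataset_size]]))[0])
```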
Update Controller:
The Update Controller decides when to perform a model update.
To find the most suitable timing for a model update, we calculate the data incorporation latency. Data incorporation latency indicates how quickly data is incorporated into the model; we measure the delay between the arrival of each sample and the time at which that sample is incorporated.
Suppose m data samples arrive in sequence at times a_1, ..., a_m. The system performs n model updates, where the i-th update starts at time s_i and takes t_i time to complete. Let D_i be the set of data samples incorporated by the i-th update, i.e., the samples that arrived after the (i-1)-th update started and before the i-th: D_i = {k | s_(i-1) < a_k ≤ s_i}.
Since all samples in D_i get incorporated when the i-th update completes at time s_i + t_i, the cumulative latency of the i-th update is L_i = Σ_{k in D_i} ((s_i + t_i) − a_k).
Summing L_i over all n updates, we obtain the data incorporation latency L = Σ_{i=1..n} L_i.
To minimize the training cost, we propose a cost-aware policy for fast data incorporation at low training cost. We introduce a "knob" parameter w: for every unit of training cost the system spends, it expects the data incorporation latency to be reduced by w. In this model, the latency L_i and the cost τ_i are "exchangeable" and are hence unified into one objective, which we call the latency-cost sum, i.e., Σ_{i=1..n} (L_i + w·τ_i).
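A small sketch of how the data incorporation latency and the latency-cost sum can be computed for a given update schedule; the function name and the example numbers are purely illustrative, and using training time as the cost τ_i is an assumption:

```python
def latency_cost_sum(arrivals, starts, durations, costs, w):
    """arrivals: sorted sample arrival times a_1..a_m
       starts / durations: start time s_i and training time t_i of each update
       costs: training cost tau_i of each update
       w: knob trading latency for training cost"""
    total = 0.0
    prev_start = float("-inf")
    for s_i, t_i, tau_i in zip(starts, durations, costs):
        # D_i: samples arriving after the previous update started and up to this one.
        D_i = [a for a in arrivals if prev_start < a <= s_i]
        L_i = sum((s_i + t_i) - a for a in D_i)   # cumulative latency of the i-th update
        total += L_i + w * tau_i                  # latency-cost sum contribution
        prev_start = s_i
    return total

# Example: 5 samples and 2 updates; prints 18.0 (latency 9 + 5, cost 2 weighted by w = 2).
print(latency_cost_sum(arrivals=[1, 2, 3, 6, 7],
                       starts=[4, 8], durations=[1, 1],
                       costs=[1, 1], w=2.0))
```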
To obtain the maximum latency reduction, we develop a cost-aware model update strategy.