PAPERmaking! Vol6 Nr2 2020

LEE AND SEO

3 of 19

gle framework. These algorithms have been employed as a single classifier or as a combination of multiple methods, sometimes called an ensemble, to improve classification performance. 12 Although ensemble-based classifiers are known as prominent algorithms for time series classification tasks, 13 they require substantial computation for training, which may not be suitable for a large dataset. Meanwhile, Tan et al 14 report that the nearest neighbor classifier based on the Euclidean distance is a fast and promising classification algorithm for big datasets.

Recently, MSTS data have gained great attention, and many researchers have proposed new methods to solve multistream-based problems. Orsenigo and Vercellis 15 describe a classification method based on a temporal extension of discrete SVMs with the notions of warping distance and a softened variable margin in the set of multivariate input sequences. Weng and Shen 16 implement a new approach for MSTS classification: the eigenvectors of the row-row and column-column covariance matrices of MSTS samples are calculated to extract features, and a 1-NN classifier is used for the classification. The authors show that distance-based methods with 1-NNs are an effective way to classify MSTS. Other algorithms have also been used to deal with MSTS. Zhang et al 17 address the challenges of MSTS data by presenting a real-time multiple-profile sensor-based process monitoring system.

Feature extraction is considered one of the popular techniques for MSTS classification. Rodríguez and Alonso 18 use a boosting algorithm to generate new features, and an SVM is applied to these metafeatures. Kadous and Sammut 19 seek to generate classifiers that are comprehensible and accurate with metafeatures; the authors describe applications to sign language recognition and electrocardiogram signal classification.
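The 1-NN classifiers recurring above (Tan et al 14, Weng and Shen 16) reduce to a simple nearest-neighbor search under the Euclidean distance. The following is a minimal illustrative sketch, not any cited author's implementation; the function name and toy data are ours:

```python
import numpy as np

def one_nn_euclidean(train_X, train_y, query):
    """Classify a time series by the label of its nearest training
    neighbor under the Euclidean distance (equal-length series)."""
    dists = np.sqrt(((train_X - query) ** 2).sum(axis=1))
    return train_y[int(np.argmin(dists))]

# toy example: two classes of short series
train_X = np.array([[0.0, 0.1, 0.2],
                    [0.0, 0.2, 0.4],
                    [1.0, 0.9, 0.8]])
train_y = np.array([0, 0, 1])
print(one_nn_euclidean(train_X, train_y, np.array([0.9, 0.9, 0.9])))  # -> 1
```

Because there is no training phase beyond storing the data, the cost per query is linear in the training set size, which is why this baseline remains attractive for large datasets.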
Li et al 20 suggest feature vector selection approaches for MSTS classification using singular value decomposition. Profile monitoring with the principal component analysis (PCA) method is another way to manage MSTS. Kim et al 21 develop a method to detect profile changes in multistream tonnage signals for forging process monitoring and to classify fault patterns, while Chang and Yadama 22 propose a statistical process control framework that monitors nonlinear profiles to identify mean shifts in a profile with discrete wavelet transformation and B-splines. Paynabar et al 23 suggest a multiway extension of the PCA technique to classify multistream profile data. Grasso et al 24 suggest multiway PCA to deal with the reduction of data dimensionality and the fusion of all the sensor outputs; their article carries out two main multiway extensions of traditional PCA to handle MSTS.

Deep learning has provided prominent results for this application with the popularity of neural networks. Zheng et al 25 propose a deep learning framework for MSTS classification using features extracted by a 1-NN with dynamic time warping (DTW). Karim et al 4 utilize the long short-term memory fully convolutional network (LSTM-FCN) and attention LSTM-FCN for MSTS classification. Wang et al 5 utilize a recurrent neural network and an adaptive differential evolution algorithm for the same task. Despite its popularity, deep learning requires a large volume of data, and it is not suitable for our problem due to a lack of labeled data.

An imbalanced classification problem, in which the distribution of class labels is severely skewed, needs to be well managed because learning algorithms perform poorly in the presence of underrepresented data. This is because most algorithms assume that the distribution of the dataset is balanced.
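The failure mode behind this assumption is easy to demonstrate: on a skewed dataset, a degenerate classifier that always predicts the majority class scores high accuracy while never detecting a single minority instance. A small synthetic sketch (the 1-in-101 ratio is hypothetical, chosen only for illustration):

```python
import numpy as np

# hypothetical labels: 1 failure among 100 normal observations
y_true = np.zeros(101, dtype=int)
y_true[0] = 1

# a degenerate classifier that always predicts the majority class
y_pred = np.zeros(101, dtype=int)

accuracy = (y_true == y_pred).mean()
recall_on_failures = (y_pred[y_true == 1] == 1).mean()
print(round(accuracy, 3))   # -> 0.99
print(recall_on_failures)   # -> 0.0
```

Accuracy of about 99% with zero recall on the minority class is exactly why imbalanced problems require the dedicated techniques surveyed next.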
26 Sampling methods, which consist of oversampling and undersampling techniques, are commonly used to improve classifier accuracy by providing a balanced distribution. 27 The cost-sensitive method is an alternative for the imbalanced learning problem; it uses different cost matrices that specify the cost of misclassifying data instances. 28 However, failures in the paper machine occur so rarely that traditional techniques have difficulty in training models effectively.

Active learning is one of the most prominent methods applied to handle extremely imbalanced data. To deal with highly imbalanced classes, Attenberg et al 29 propose guided learning, an alternative technique in which the agent asks humans to find training examples representing the different classes. Kazerouni et al 30 suggest an active learning algorithm to learn a binary classifier on a highly imbalanced dataset where most data have negative labels and only a very small number are positive. Their hybrid active learning leverages an explore-exploit trade-off to improve on margin sampling; moreover, this active learning technique is combined with state-of-the-art deep learning techniques to improve performance. Fang et al 31 reformulate active learning as a reinforcement learning problem in which the policy plays the role of the active learning heuristic: an agent tries to find the data to be labeled in a validation set based on a deep Q-network. Haussmann et al, 32 however, choose a deep Bayesian neural network for both the base predictor and the policy network to effectively incorporate the input distribution.
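The margin sampling criterion that several of these active learning methods build on selects the unlabeled examples whose top two class probabilities are closest, that is, those the current model is least sure about. A minimal sketch under assumed inputs (the function name and the toy probability matrix are ours, not from the cited works):

```python
import numpy as np

def margin_sampling(probs, k):
    """Return indices of the k examples with the smallest margin
    between their two highest class probabilities (most uncertain)."""
    ordered = np.sort(probs, axis=1)          # ascending per row
    margins = ordered[:, -1] - ordered[:, -2]  # top-1 minus top-2
    return np.argsort(margins)[:k]

# hypothetical posterior probabilities from some base classifier
probs = np.array([[0.90, 0.10],
                  [0.55, 0.45],
                  [0.70, 0.30]])
print(margin_sampling(probs, 1))  # -> [1]
```

The selected examples are then sent for labeling, and the explore-exploit variants above modify exactly this selection step, for example by occasionally sampling at random instead of always taking the smallest margin.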
