TABLE 2 Confusion matrix

Each method's 2 x 2 confusion matrix is shown as one row: prediction 0 against actual 0/1 gives TN and FN; prediction 1 against actual 0/1 gives FP and TP.

Method              TN     FN    FP    TP
Autoencoder         2726   22    173   3
LSTM autoencoder    3355   19    272   8
Decision tree       1769   24    46    1
SVM                 1514   19    282   6
CL-LNN              1636   19    179   6
RD-LNN              1762   22    34    3

Abbreviations: CL-LNN, class label of the local nearest neighbor; LSTM, long short-term memory; RD-LNN, relative distance of the local nearest neighbor; SVM, support vector machine.
TABLE 3 Performance with four metrics

Item                  Autoencoder   LSTM autoencoder   Decision tree   SVM     CL-LNN   RD-LNN
Precision             0.017         0.029              0.021           0.021   0.032    0.081*
True positive rate    0.120         0.296*             0.040           0.240   0.240    0.120
False positive rate   0.060         0.075              0.024           0.157   0.099    0.019*
F-measure             0.030         0.052              0.028           0.038   0.057    0.097*

Note: The starred value indicates the best performance for each measure. Abbreviations: CL-LNN, class label of the local nearest neighbor; LSTM, long short-term memory; RD-LNN, relative distance of the local nearest neighbor; SVM, support vector machine.
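The four metrics in Table 3 follow directly from the confusion-matrix counts in Table 2. A minimal sketch (not the authors' code) that reproduces the RD-LNN column:

```python
# Metrics from a 2x2 confusion matrix laid out as in Table 2:
# prediction 0 row holds (TN, FN), prediction 1 row holds (FP, TP).

def metrics(tn: int, fn: int, fp: int, tp: int) -> dict:
    precision = tp / (tp + fp)   # fraction of predicted failures that are real
    tpr = tp / (tp + fn)         # recall/sensitivity: fraction of real failures caught
    fpr = fp / (fp + tn)         # fraction of normal samples flagged as failures
    # F-measure: harmonic mean of precision and recall
    f_measure = 2 * precision * tpr / (precision + tpr)
    return {"precision": precision, "tpr": tpr, "fpr": fpr, "f_measure": f_measure}

# RD-LNN counts from Table 2: TN=1762, FN=22, FP=34, TP=3
print(metrics(tn=1762, fn=22, fp=34, tp=3))
# -> precision ~0.081, tpr ~0.120, fpr ~0.019, f_measure ~0.097 (matches Table 3)
```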
These four metrics are used to measure the performance of a rare classification problem. The F-measure (also called the F1 score or F-score) combines precision and recall through their harmonic mean, the type of average used for rates. Based on the table, RD-LNN shows the best performance in precision, false positive rate, and F-measure among the six methods, while the LSTM autoencoder performs better only in true positive rate. Note that RD-LNN clearly outperforms the others in F-measure, which is the metric best suited to summarizing performance on a highly imbalanced dataset.

Another tool used to measure performance is the receiver operating characteristic (ROC) curve, which represents the diagnostic ability of a binary classifier and is well suited to visualizing and comparing the performance of our proposed algorithms. The true positive rate (TPR, or sensitivity) is plotted against the false positive rate (FPR, or 1 - specificity) at different threshold settings to show how well a model distinguishes the classes; a sketch of this computation follows at the end of this section. In Figure 5, the ROC curves of the six methods are compared through the area under the ROC curve (AUC), which represents the degree of separability. The LSTM autoencoder, which considers temporal features, performs better than the autoencoder, and the AUC of RD-LNN is higher than that of CL-LNN because it considers the relative distance to detect failures. The decision tree and SVM, which do not adopt the feature extraction we proposed, also perform worse than RD-LNN. Overall, RD-LNN attains the largest AUC, 0.724, leading to the same conclusion: RD-LNN outperforms the other five methods. Figure 6 summarizes the comparison in terms of F-measure and AUC: the LSTM autoencoder and RD-LNN are better than the others in AUC, while RD-LNN is the only method with outstanding performance in F-measure.

An additional experiment was conducted to choose a distance measure between Euclidean distance and DTW distance, given that the 1-NN method requires computing the distance between the target data point and every point in the training set. Comparing the two measures in Table 4, we found that DTW distance takes much longer to complete the same task than Euclidean distance, which needs only about 4.6 minutes, while the two show almost the same performance. Euclidean distance holds up well against DTW in this experiment because DTW is particularly suited to applications such as automatic speech recognition, where speaking speed varies over time; the time-series data used here is sampled at a fixed interval, so the alignment flexibility of DTW brings no benefit.
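The ROC/AUC comparison of Figure 5 requires a continuous anomaly score per test sample from each method; TPR and FPR are then traced over all thresholds. A minimal sketch with scikit-learn, using hypothetical y_true/y_score arrays in place of the actual model outputs:

```python
# Sketch of an ROC/AUC computation as in Figure 5. The labels and scores
# below are randomly generated placeholders, not the paper's data.
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)       # hypothetical 0/1 failure labels
y_score = y_true * 0.3 + rng.random(1000)    # hypothetical anomaly scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # TPR vs FPR at each threshold
print(f"AUC = {auc(fpr, tpr):.3f}")                # degree of separability
```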
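The runtime gap reported in Table 4 comes from the cost of each distance call: Euclidean distance is linear in the window length, while the classic DTW dynamic program is quadratic, and 1-NN multiplies that cost by the size of the training set. A self-contained sketch of 1-NN search under both measures (the window length, training-set size, and data are made up for illustration):

```python
# 1-NN with Euclidean vs DTW distance: DTW's O(L^2) dynamic program
# dominates the O(L) Euclidean distance when every test window is
# compared against the whole training set.
import time
import numpy as np

def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    # Classic O(len(a) * len(b)) dynamic-programming DTW.
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

rng = np.random.default_rng(1)
train = rng.random((200, 50))   # hypothetical training windows
query = rng.random(50)          # one hypothetical test window

for name, dist in [("euclidean", euclidean), ("dtw", dtw)]:
    t0 = time.perf_counter()
    nearest = min(range(len(train)), key=lambda i: dist(query, train[i]))
    print(f"{name}: nearest index {nearest}, {time.perf_counter() - t0:.3f} s")
```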