PAPERmaking! Vol6 Nr2 2020

LEE AND SEO


where $\Phi$ is the cumulative distribution function of the standard normal random variable, and $\mu_i$ and $s_i$ are the median and SD of the distances between the target instance and all training instances with label $i$. The smaller the RD-LNN value, the less likely it is that the label to be found is reliable.

Algorithm 2. RD-LNN feature extraction

Input: Multistream window instances $W_{train}$ for training and $W_{test}$ for testing, class labels of training instances $y_{train}$, index set of training data $Train$, index set of test data $Test$
Output: Numeric feature matrices $X_{train}$ with elements $x^0_{ij}$ and $x^1_{ij}$, $i \in Train$, and $X_{test}$ with elements $x^0_{ij}$ and $x^1_{ij}$, $i \in Test$

for $i \in Train$ do
    for $j \in \{1, \dots, p\}$ do
        $d_0 = d_1 = \text{NULL}$    ▷ initialize arrays to store distance values
        for $k \in Train \setminus i$ do    ▷ for all instances in training data except itself (LOO-CV)
            $d = D_{ED}(x_{ij}, x_{kj})$
            if $y_k = 0$ then append $d$ to $d_0$
            end if
            if $y_k \neq 0$ then append $d$ to $d_1$
            end if
        end for
        ▷ extract features from distances with label 0
        $d^*_0 \leftarrow \min(d_0)$    ▷ distance to the nearest neighbor with label 0
        $q_0 \leftarrow \text{interquartile}(d_0)$
        $\mu_0 \leftarrow \text{median}(q_0)$, $s_0 \leftarrow \text{stdev}(q_0)$
        $x^0_{ij} \leftarrow P(X \le d^*_0)$, where $X \sim N(\mu_0, s_0^2)$
        ▷ extract features from distances with label 1
        $d^*_1 \leftarrow \min(d_1)$    ▷ distance to the nearest neighbor with label 1
        $q_1 \leftarrow \text{interquartile}(d_1)$
        $\mu_1 \leftarrow \text{median}(q_1)$, $s_1 \leftarrow \text{stdev}(q_1)$
        $x^1_{ij} \leftarrow P(X \le d^*_1)$, where $X \sim N(\mu_1, s_1^2)$
    end for
end for
for $i \in Test$ do
    for $j \in \{1, \dots, p\}$ do
        $d_0 = d_1 = \text{NULL}$    ▷ initialize arrays to store distance values
        for $k \in Train$ do    ▷ for all instances in training data
            $d = D_{ED}(x_{ij}, x_{kj})$
            if $y_k = 0$ then append $d$ to $d_0$
            end if
            if $y_k \neq 0$ then append $d$ to $d_1$
            end if
        end for
        ▷ extract features from distances with label 0
        $d^*_0 \leftarrow \min(d_0)$    ▷ distance to the nearest neighbor with label 0
        $q_0 \leftarrow \text{interquartile}(d_0)$
        $\mu_0 \leftarrow \text{median}(q_0)$, $s_0 \leftarrow \text{stdev}(q_0)$
        $x^0_{ij} \leftarrow P(X \le d^*_0)$, where $X \sim N(\mu_0, s_0^2)$
        ▷ extract features from distances with label 1
        $d^*_1 \leftarrow \min(d_1)$    ▷ distance to the nearest neighbor with label 1
        $q_1 \leftarrow \text{interquartile}(d_1)$
        $\mu_1 \leftarrow \text{median}(q_1)$, $s_1 \leftarrow \text{stdev}(q_1)$
        $x^1_{ij} \leftarrow P(X \le d^*_1)$, where $X \sim N(\mu_1, s_1^2)$
    end for
end for

Algorithm 2 describes the procedure by which the numerical feature values are generated: for each class, the algorithm measures the probability representing the relative position of the nearest neighbor compared with the other instances with the same class label. We found that this unique feature extraction technique improves the performance of classification
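As a rough illustration of Algorithm 2, the following Python sketch computes the two RD-LNN features per instance and stream. It is not the authors' implementation and rests on several assumptions: the function names `rd_lnn` and `rd_lnn_features` are hypothetical, windows are assumed to be stored as arrays of shape (instances, streams, window length), `interquartile()` is read as "keep the distances lying between the first and third quartiles", `stdev` is taken as the sample standard deviation, and NaN is returned when too few distances are available.

```python
import math
import numpy as np

def rd_lnn(d):
    """P(X <= min(d)) for X ~ N(median(q), stdev(q)^2), q = interquartile(d)."""
    d = np.asarray(d, dtype=float)
    if d.size < 2:
        return float("nan")          # assumption: undefined with too few distances
    # Assumed reading of interquartile(): distances within [Q1, Q3].
    q1, q3 = np.percentile(d, [25, 75])
    q = d[(d >= q1) & (d <= q3)]
    if q.size < 2:
        return float("nan")
    mu = float(np.median(q))
    s = float(np.std(q, ddof=1))     # assumption: sample standard deviation
    if s == 0.0:
        return float("nan")
    z = (float(d.min()) - mu) / s    # standardized nearest-neighbor distance
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

def rd_lnn_features(W_query, W_train, y_train, leave_one_out=False):
    """Return an array of shape (n_query, p, 2) holding x^0_ij and x^1_ij.

    Set leave_one_out=True when W_query is W_train itself, so that
    instance i is excluded from its own neighbors (the LOO-CV case).
    """
    W_query = np.asarray(W_query, dtype=float)
    W_train = np.asarray(W_train, dtype=float)
    y_train = np.asarray(y_train)
    n, p = W_query.shape[0], W_query.shape[1]
    X = np.empty((n, p, 2))
    for i in range(n):
        keep = np.ones(W_train.shape[0], dtype=bool)
        if leave_one_out:
            keep[i] = False
        labels = y_train[keep]
        for j in range(p):
            # Euclidean distance from window (i, j) to every kept training window.
            d = np.linalg.norm(W_train[keep, j] - W_query[i, j], axis=1)
            X[i, j, 0] = rd_lnn(d[labels == 0])
            X[i, j, 1] = rd_lnn(d[labels != 0])
    return X
```

Under these assumptions, training features would be obtained with `rd_lnn_features(W_train, W_train, y_train, leave_one_out=True)` and test features with `rd_lnn_features(W_test, W_train, y_train)`, mirroring the two outer loops of the algorithm.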
