# Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data

1 Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data. Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; School of Control Science and Engineering, Dalian University of Technology, Dalian, Liaoning, China

2 Outline: Introduction; Problem formulations; kwta networks; Simulation results; Sorting application; Filtering application; Concluding remarks; Future works; References

3 Multiple Winners-take-all Operation The k-winners-take-all (kwta) operation selects the k largest inputs out of n inputs (1 ≤ k < n). kwta is a general rule in nature and society. It has widespread applications in data mining, machine learning, classification, clustering, computer vision, etc., and it is a common building block for many models such as ART and SOM.
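As a point of reference for the definition above, the kwta operation can be written as a direct (non-neural) selection. This is only a minimal sketch: the function name `kwta` and the tie-breaking rule (lower index wins) are assumptions for illustration, not part of the talk.

```python
def kwta(u, k):
    """Return a 0/1 list marking the k largest entries of u.

    Ties are broken in favour of the lower index (an assumption of this sketch).
    """
    if not 1 <= k < len(u):
        raise ValueError("k must satisfy 1 <= k < n")
    # Indices of the inputs, ordered from the largest value to the smallest.
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    winners = set(order[:k])
    return [1 if i in winners else 0 for i in range(len(u))]


x = kwta([0.3, 0.9, 0.1, 0.7, 0.5], k=2)  # marks the entries 0.9 and 0.7
```

Sorting costs O(n log n) sequentially, which is the kind of cost the parallel networks in the talk aim to avoid.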

4 k Winners-take-all Operation As the number of inputs increases and/or the selection must be performed in real time, parallel algorithms and hardware implementations become desirable.

5 Parallel k Winners-take-all Operation [Figure: kwta block with inputs u_1, u_2, ..., u_n and parameter k, producing outputs x_1, x_2, ..., x_n]

6 Problem Formulations "The mere formulation of a problem is far more essential than its solution, which may be merely a matter of mathematical or experimental skills. To raise new questions, new possibilities, to regard old problems from a new angle requires creative imagination and marks real advances in science." Albert Einstein

7 Problem Formulations

8 Problem Formulations (cont'd)

9 Problem Formulations (cont'd)

10 Problem Formulations (cont'd)

11 Model Selection and Redesign The kwta problem has been formulated as equivalent linear and quadratic programming problems. All existing neurodynamic optimization models for linear and quadratic programming can therefore be applied. The question now is: which model is the best in terms of model complexity and computational efficiency?
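The formulation slides survive here only as titles. For orientation, one common way to state the kwta operation as linear and quadratic programs is sketched below. The notation is an assumption of this sketch and may differ from the slides: $u_i$ are the inputs, $x_i$ the outputs, and $a > 0$ a small constant.

```latex
\begin{aligned}
\text{(LP)}\quad & \max_{x}\ \sum_{i=1}^{n} u_i x_i
  && \text{s.t. } \sum_{i=1}^{n} x_i = k,\quad 0 \le x_i \le 1,\\
\text{(QP)}\quad & \min_{x}\ \frac{a}{2}\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} u_i x_i
  && \text{s.t. } \sum_{i=1}^{n} x_i = k,\quad 0 \le x_i \le 1.
\end{aligned}
```

When the k-th and (k+1)-th largest inputs are distinct, the LP has a unique optimum with binary $x_i$, and so does the QP provided $a$ does not exceed the gap between those two inputs.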

12 QP-based Primal-Dual Network

13 QP-based Projection Network

14 LP-based Projection Network

15 QP-based Simplified Dual Net

16 LP-based Discontinuous Network

17 Discontinuous Activation Function

18 Convergence Conditions

19 Simulation Results

20 Simulation Results (cont'd)

21 Simulation Results (cont'd)

22 QP-based Discontinuous Network

23 Discontinuous Activation Function

24 Convergence Condition

25 Simulation Results

26 Simulation Results (cont'd)

27 Simulation Results (cont'd)

28 QP-based Improved Dual Network

29 Model Comparisons Model Number of layer(s) Number of neuron(s) Number of connections LP-based primal-dual network QP-based primal-dual network 4 3n + 1 6n n + 1 6n + 2 LP-based projection network 2 n + 1 2n + 2 QP-based projection network QP-based simplified dual network 2 n + 1 2n n 3n LP-based discontinuous net 1 n 2n QP-based discontinuous network QP-based improved dual network 1 n 2n 1 1 n

30 Simulation Results

31 Discrete-time Counterpart

32 Activation Function with High Gain

33 A New Model

34 Desirable Properties The kwta model with the Heaviside activation function has been proven to be globally stable and globally convergent to the kwta solutions in finite time. Lower and upper bounds on the convergence time have also been derived. The model essentially solves the dual problem of the linear programming formulation.
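The model equations are not preserved in this transcript. The sketch below assumes the single-state form $\epsilon\,\dot{y} = \sum_i x_i - k$ with $x_i = g(u_i - y)$ and $g$ the Heaviside step (taking $g(0) = 1$), which matches the output expression quoted later in the talk. The forward-Euler discretization, the step size `dt`, and all function and parameter names are assumptions for illustration only.

```python
def simulate_kwta(u, k, y0=0.0, eps=1.0, dt=0.01, max_steps=100_000):
    """Forward-Euler simulation of  eps * dy/dt = sum_i g(u_i - y) - k.

    g is the Heaviside step with g(0) = 1 (an assumption of this sketch).
    Returns the outputs x, the final state y, and the number of steps taken.
    """
    y = y0
    for step in range(max_steps):
        x = [1 if ui >= y else 0 for ui in u]  # Heaviside outputs g(u_i - y)
        error = sum(x) - k
        if error == 0:  # exactly k winners: y has reached an equilibrium
            return x, y, step
        # y rises when there are too many winners and falls when too few.
        y += (dt / eps) * error
    raise RuntimeError("no convergence; try a smaller dt")


x, y, steps = simulate_kwta([0.3, 0.9, 0.1, 0.7, 0.5], k=2)
```

Because the right-hand side is discontinuous, a fixed Euler step can jump back and forth over a narrow gap between the k-th and (k+1)-th largest inputs, so `dt / eps` should be small relative to that gap.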

35 Convergence Time As a linear system with a discontinuous bias, the convergence time of the kwta network can be computed as a function of the input vector u. The expectation and variance of the convergence time can also be computed, based on the binomial distribution, as functions of the initial states. Y. Xiao, Y. Liu, C.-S. Leung, J. P.-F. Sum, K. Ho, Analysis on the convergence time of dual neural network-based kwta, IEEE Trans. Neural Networks and Learning Systems, vol. 23, pp , J. P.-F. Sum, C.-S. Leung, K. Ho, Effect of Input Noise and Output Node Stochastic on Wang's kwta, IEEE Trans. Neural Networks and Learning Systems, vol. 24, pp , 2013.
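To illustrate how a convergence time can follow directly from the inputs, the sketch below again assumes the single-state model $\epsilon\,\dot{y} = \sum_i g(u_i - y) - k$ with $g(0) = 1$, which is not spelled out in this transcript. Between two consecutive input values the number of active outputs is constant, so $y$ moves at a constant rate and the total time is a sum of distance-over-rate terms. The function name and the use of `math.nextafter` (Python 3.9+) to step just past a threshold are choices of this sketch, not of the cited papers.

```python
import math


def convergence_time(u, k, y0, eps=1.0):
    """Time for y to reach an equilibrium of eps*dy/dt = sum_i g(u_i - y) - k.

    Assumes distinct inputs, 1 <= k < len(u), and g(s) = 1 for s >= 0.
    """
    y, t = y0, 0.0
    while True:
        count = sum(1 for ui in u if ui >= y)
        if count == k:
            return t
        if count > k:
            # y rises at rate (count - k)/eps until it passes the next input.
            target = min(ui for ui in u if ui >= y)
            t += eps * (target - y) / (count - k)
            y = math.nextafter(target, math.inf)  # just above the threshold
        else:
            # y falls at rate (k - count)/eps until it reaches the next input.
            target = max(ui for ui in u if ui < y)
            t += eps * (y - target) / (k - count)
            y = target


t_up = convergence_time([0.3, 0.9, 0.1, 0.7, 0.5], k=2, y0=0.0)
t_down = convergence_time([0.3, 0.9, 0.1, 0.7, 0.5], k=2, y0=1.0)
```

For the rising case the segments are 0.1/3, 0.2/2 and 0.2/1 time units, and for the falling case 0.1/2 and 0.2/1, all with `eps = 1`.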

36 Reformulated Problem

37 Reformulated Problem (cont'd)

38 Reformulated Problem (cont'd)

39 Simulation Results with Randomized Integer Inputs

40 Simulation Results with Low-Resolution Inputs

41 Initial State Estimation Although the state of the kwta model is guaranteed to converge globally in finite time from any initial state, prior information helps to initialize the state close to the steady state. The steady state $y \in (u_{k+1}, u_k]$, where $u_k$ denotes the k-th largest input, depends on the distribution of $u_1, u_2, \ldots, u_n$, as well as on the values of k and n.

42 Initial State Estimation (cont'd) General distribution; uniform distribution; normal distribution

43 Initial State Estimation (cont'd)

44 Uniform Distribution

45 Normal Distribution
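The estimation formulas on these slides are not preserved in the transcript. A natural estimate, sketched below purely as an assumption, places the initial state at the $(1 - k/n)$ quantile of the input distribution, since about k of n samples are expected to lie above that point. The function names are hypothetical.

```python
from statistics import NormalDist


def initial_state_uniform(a, b, n, k):
    """(1 - k/n) quantile of a uniform distribution on [a, b]."""
    return a + (b - a) * (1 - k / n)


def initial_state_normal(mu, sigma, n, k):
    """(1 - k/n) quantile of a normal distribution N(mu, sigma^2)."""
    return NormalDist(mu, sigma).inv_cdf(1 - k / n)


y0_uniform = initial_state_uniform(0.0, 1.0, n=10, k=3)
y0_normal = initial_state_normal(5.0, 2.0, n=100, k=50)
```

With k = n/2 the normal-case estimate is simply the mean, because the median of a normal distribution equals its mean.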

46 Simulation Results (convergence time) with Infinity Gain

47 Simulation Results (convergence time) with Unity Gain

48 Discrete-time Version

49 Simulation Results ($n = 10^6$, $k = n/2$)

50 Simulation Results ($n = 10^6$, $k = n/2$)

51 Monte Carlo Simulation Results

52 Monte Carlo Simulation Results

53 Estimated Complexity (uniform)

54 Estimated Complexity (normal) For data with a dimension of $10^{100}$ (1 googol), it would need about 8.44 iterations on average!

55 Histograms of Convergence Iterations

56 Histograms of Convergence Iterations

57 Histograms of Convergence Iterations

58 Histograms of Convergence Iterations

59 Sorting Operation Sorting is a fundamental process that arranges data in order according to their values. It accounts for 25% of data processing time (Knuth). For sorting large or high-dimensional data sets, parallel sorting approaches are more desirable. Numerous sorting algorithms and models have been developed, with varied efficiencies.

60 Parallel Sorting Representation For example, a permutation matrix:

61 Parallel Sorting Representation (cont'd) A modified version:

62 Logic Reversal A simple logic can be used to flip over the redundant '1' elements after the first '1' in each row; i.e.,
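The logic expression itself is missing from this transcript. One simple rule that has the described effect, sketched here as an assumption, keeps a '1' only where its left neighbour is '0'. This presumes each row of the modified matrix has the form 0...0 1...1; the function name is hypothetical.

```python
def logic_reversal(m):
    """Keep only the first 1 in each row of a 0/1 matrix (list of lists).

    Assumes every row consists of zeros followed by ones.
    """
    result = []
    for row in m:
        new_row = []
        for j, bit in enumerate(row):
            prev = row[j - 1] if j > 0 else 0
            # The output is 1 only at a 0 -> 1 transition along the row.
            new_row.append(bit & (1 - prev))
        result.append(new_row)
    return result


modified = [[0, 1, 1],
            [1, 1, 1],
            [0, 0, 1]]
permutation = logic_reversal(modified)
```

Each output element depends only on two neighbouring entries of one row, so all elements can be evaluated in parallel.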

63 Parallel Sorting based on kwta Let each kwta network compute one column of the above sorting matrix, from left to right, with k increasing from 1 to n - 1. Specifically, a WTA network with a single state variable (i.e., k = 1) is adopted to determine the largest element of the list. Next, a kwta network with k = 2 computes the second item in the list without recounting the first item.

64 Parallel Sorting based on kwta As such, the whole list of n items can be sorted using n - 1 kwta networks, without the need to compute the last item. As a result, only n - 1 neurons are needed. This is a substantial reduction in model complexity compared with analog sorting networks with $n^2$ neurons.
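The scheme can be sketched in software as follows. This is only an illustration under assumptions: a direct selection routine stands in for each kwta network, the n - 1 selections run in a sequential loop rather than in parallel, and the names `kwta` and `kwta_sort` are hypothetical.

```python
def kwta(u, k):
    """Stand-in for a kwta network: 0/1 list marking the k largest entries."""
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    winners = set(order[:k])
    return [1 if i in winners else 0 for i in range(len(u))]


def kwta_sort(u):
    """Sort u in descending order using kwta selections for k = 1..n-1."""
    n = len(u)
    ranks = [n] * n      # an item never selected for k <= n-1 is the smallest
    previous = [0] * n   # column for k - 1 (all zeros before the first column)
    for k in range(1, n):
        column = kwta(u, k)
        for i in range(n):
            # Item i first becomes a winner at k, so its rank is k.
            if column[i] == 1 and previous[i] == 0:
                ranks[i] = k
        previous = column
    ordered = [None] * n
    for i, r in enumerate(ranks):
        ordered[r - 1] = u[i]
    return ordered


result = kwta_sort([0.3, 0.9, 0.1, 0.7, 0.5])
```

The last item needs no selection of its own, which is why k only runs up to n - 1.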

65 Illustrative Example In this case, only five neurons are needed, using five kwta networks. In contrast, 36 neurons are needed in the analog sorting network (Wang, 1995).

66 Simulation Results (state variable)

67 Simulation Results (output variables)

68 Rank-order Filter Rank-order filters are nonlinear filters with many applications, including digital image processing, speech processing, coding, and digital TV. A rank-order filter works by selecting the input with a certain rank as its output. Rank-order filters require substantial processing power to implement, which limits their real-time signal processing applications.

69 Rank-order Filter Based on kwta Nevertheless, rank-order filters can benefit from parallel realizations. Specifically, a kwta network with k = r is used in parallel with another kwta network with k = r - 1 to select the input whose rank order is r.
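The two-network idea can be sketched as follows: the input of rank r is the one selected with k = r but not with k = r - 1. A direct selection routine stands in for each kwta network, and the function names, the one-dimensional signal, and the window size are assumptions for illustration.

```python
def kwta(u, k):
    """Stand-in for a kwta network: 0/1 list marking the k largest entries."""
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    winners = set(order[:k])
    return [1 if i in winners else 0 for i in range(len(u))]


def rank_select(u, r):
    """Return the r-th largest input via two kwta selections (k = r, r - 1)."""
    upper = kwta(u, r)
    lower = kwta(u, r - 1)  # all zeros when r = 1
    for ui, a, b in zip(u, upper, lower):
        if a - b == 1:      # selected with k = r but not with k = r - 1
            return ui
    raise ValueError("r must satisfy 1 <= r <= len(u)")


def median_filter(signal, window=3):
    """Sliding-window median filter built on rank_select (odd window size)."""
    half = window // 2
    out = list(signal)      # border samples are left unchanged
    for t in range(half, len(signal) - half):
        out[t] = rank_select(signal[t - half:t + half + 1], half + 1)
    return out


filtered = median_filter([1, 9, 2, 3, 8, 4, 5])
```

The median filter is the special case r = (window + 1) / 2; other ranks give, for example, minimum and maximum filters.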

70 Simulation Results (median filter)

71 Simulation Results (median filter)

72 Simulation Results (median filter)

73 Image Processing Percentage of speckle noise in image: 10%

74 Image Filtering (cont'd)

75 Image Filtering (cont'd)

76 Image Filtering (cont'd)

77 Image Filtering (cont'd)

78 Image Filtering (cont'd) Applying the median filter to the original image. [Figure panels: the original image; the original image after median filtering]

79 Color Image Filtering Percentage of speckle noise in image: 10%

80 Color Image Filtering (cont'd) Percentage of speckle noise in image: 10%

81 Color Image Filtering (cont'd)

82 Color Image Filtering (cont'd)

83 Color Image Filtering (cont'd)

84 Results & Discussion - Image Processing

85 Color Image Filtering (cont'd)

86 Color Image Filtering (cont'd)

87 Information Retrieval Efficient information retrieval from large databases is essential. Techniques for retrieving information from large data sets play a very important role now that the size of the World Wide Web possibly exceeds 30 billion pages.

88 Web Information Retrieval There are basically two parts in web information retrieval. One is calculating the weights of all the pages or data. The other is finding the k most wanted results, i.e., those with the highest weights. The second part is the top-k query, or front-page, problem.
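The two parts can be sketched end to end on a small example. Everything here is an assumption for illustration: the four-page link graph is hypothetical, a textbook PageRank power iteration stands in for the weight calculation, and a direct top-k selection stands in for the kwta network.

```python
def pagerank(links, n, damping=0.85, iters=100):
    """Plain power iteration; assumes every page has at least one outlink."""
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for src, targets in links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new[dst] += share  # each page passes its rank to its targets
        rank = new
    return rank


def top_k(weights, k):
    """Indices of the k largest weights, returned in ascending index order."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(order[:k])


# Hypothetical toy graph: page -> list of pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
weights = pagerank(links, n=4)   # part 1: weight every page
best = top_k(weights, k=2)       # part 2: top-k query on the weights
```

Only the second part is the kwta problem; the weights are treated as given inputs to the selection.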

89 A Toy Problem from Wikipedia 7 pages, 17 links. The PageRank weight of each page and link is provided.

90 Selection Results (k=3) Output vector $x = [1, 1, 0, 0, 1, 0, 0]^T$. Pages 1, 2, and 5 have the highest PageRank weights.

91 Film-director-actor-writer Network Crawled from Wikipedia under the category of English-language films: 34,279 pages, 142,426 links. Part of the square adjacency matrix is shown in the figure, where a dot in the i-th column and the j-th row indicates a directed link from the i-th page to the j-th page. The rest of the matrix is 0.

92 Selection Results (k=10) The answer to this query, $[3111, 3869, 4058, 4621, 6938, 8974, 10341, 11502, 13320, 15326]^T$, can be easily obtained from the sparse representation of the output vector with elements $x_i = g(u_i - y(t))$, of which 10 are nonzero.

93 Conclusions and Future Works The neurodynamic optimization approaches are demonstrated to be powerful for k-winners-take-all operations. k-winners-take-all neural networks provide parallel distributed computational models with guaranteed global convergence to the optimal solutions. Neurodynamic optimization approaches are more suitable for real-time applications with big data. GPU-based implementation is under way. Applications to other problems such as recommender systems are yet to be done.

94 Acknowledgments Prof. Yousheng Xia (Fuzhou University) Prof. Yunong Zhang (Sun Yat-sen University) Prof. Xiaolin Hu (Tsinghua University) Prof. Qingshan Liu (Huazhong Univ. of Sci. and Tech.) Dr. Shubao Liu (GE Global Research) Dr. Zheng Yan (Huawei Shannon Laboratory) Mr. Yunpeng Pan (Georgia Institute of Technology) Mr. Zhishan Guo (University of North Carolina) Mr. Shaofu Yang and Miss Xinyi Le (Chinese University of Hong Kong) Many projects funded by the Hong Kong Research Grants Council.

95 Q & A

