Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data


1 Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data. Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; School of Control Science and Engineering, Dalian University of Technology, Dalian, Liaoning, China
2 Outline Introduction; Problem formulations; kWTA networks; Simulation results; Sorting application; Filtering application; Concluding remarks; Future works; References
3 Multiple Winners-Take-All Operation The k-winners-take-all (kWTA) operation selects the k largest inputs out of n inputs (1 ≤ k < n). kWTA is a general rule in nature and society. It has widespread applications in data mining, machine learning, classification, clustering, computer vision, etc., and is a common building block for many models such as ART and SOM.
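As a point of reference for the definition above, here is a minimal sequential sketch of the kWTA operation in Python. The function name `kwta` and the sample inputs are illustrative assumptions, not part of the slides; it only states what the operation should return, not how the neurodynamic models compute it.

```python
import heapq

def kwta(u, k):
    """Reference kWTA: return a 0/1 list marking the k largest entries of u.

    Assumes distinct inputs; how ties would be resolved is not specified here.
    """
    n = len(u)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    # Indices of the k largest inputs.
    winners = heapq.nlargest(k, range(n), key=lambda i: u[i])
    x = [0] * n
    for i in winners:
        x[i] = 1
    return x

# Hypothetical inputs: the two largest values are 0.9 and 0.7.
x = kwta([0.9, 0.1, 0.5, 0.7, 0.3], k=2)
print(x)
```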
4 k-Winners-Take-All Operation As the number of inputs increases, or when the selection must run in real time, parallel algorithms and hardware implementations are desirable.
5 Parallel k-Winners-Take-All Operation [Figure: kWTA block with inputs u_1, u_2, ..., u_n, parameter k, and outputs x_1, x_2, ..., x_n]
6 Problem Formulations "The mere formulation of a problem is far more essential than its solution, which may be merely a matter of mathematical or experimental skills. To raise new questions, new possibilities, to regard old problems from a new angle requires creative imagination and marks real advances in science." Albert Einstein
7 Problem Formulations
8 Problem Formulations (cont'd)
9 Problem Formulations (cont'd)
10 Problem Formulations (cont'd)
11 Model Selection and Redesign The kWTA problem has been formulated as equivalent linear and quadratic programming problems. All existing neurodynamic optimization models for linear and quadratic programming can be applied. Now the question is: which is the best in terms of model complexity and computational efficiency?
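The equations on the formulation slides did not survive in this transcript. As a hedged reminder of what such formulations typically look like, the block below writes the kWTA function together with one LP and one QP form commonly used in the kWTA literature. The symbols a (a positive scaling constant) and \bar{u}_k (the k-th largest input) are assumptions introduced here; the slides' own notation may differ.

```latex
% kWTA function: x_i marks whether u_i is among the k largest inputs
x_i = f(u_i) =
\begin{cases}
1, & \text{if } u_i \in \{k \text{ largest elements of } u\},\\
0, & \text{otherwise.}
\end{cases}

% LP form (binary at the optimum when \bar{u}_k > \bar{u}_{k+1})
\min_{x} \; -\sum_{i=1}^{n} u_i x_i
\quad \text{s.t.} \quad \sum_{i=1}^{n} x_i = k, \quad 0 \le x_i \le 1, \; i = 1, \dots, n.

% QP form (same constraints; binary at the optimum when 0 < a \le \bar{u}_k - \bar{u}_{k+1})
\min_{x} \; \frac{a}{2} \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} u_i x_i
\quad \text{s.t.} \quad \sum_{i=1}^{n} x_i = k, \quad 0 \le x_i \le 1, \; i = 1, \dots, n.
```

The condition on a follows from the optimality conditions of the QP: each x_i is (u_i - \lambda)/a clipped to [0, 1] for some multiplier \lambda, so the k winners saturate at 1 and the rest at 0 only if the gap \bar{u}_k - \bar{u}_{k+1} is at least a.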
12 QP-based Primal-Dual Network
13 QP-based Projection Network
14 LP-based Projection Network
15 QP-based Simplified Dual Network
16 LP-based Discontinuous Network
17 Discontinuous Activation Function
18 Convergence Conditions
19 Simulation Results
20 Simulation Results (cont'd)
21 Simulation Results (cont'd)
22 QP-based Discontinuous Network
23 Discontinuous Activation Function
24 Convergence Condition
25 Simulation Results
26 Simulation Results (cont'd)
27 Simulation Results (cont'd)
28 QP-based Improved Dual Network
29 Model Comparisons
Model: number of layers, number of neurons, number of connections
LP-based primal-dual network and QP-based primal-dual network: 4, 3n + 1, 6n, n + 1, 6n + 2
LP-based projection network: 2, n + 1, 2n + 2
QP-based projection network and QP-based simplified dual network: 2, n + 1, 2n, n, 3n
LP-based discontinuous network: 1, n, 2n
QP-based discontinuous network: 1, n, 2n
QP-based improved dual network: 1, 1, n
30 Simulation Results
31 Discrete-time Counterpart
32 Activation Function with High Gain
33 A New Model
34 Desirable Properties The kWTA model with the Heaviside activation function has been proven to be globally stable and globally convergent to the kWTA solution in finite time. Lower and upper bounds on the convergence time have been derived. The model essentially solves the dual problem of the linear programming formulation.
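To make the finite-time convergence claim concrete, the following is a rough forward-Euler sketch of a single-state kWTA model with a Heaviside activation. It assumes the state equation eps * dy/dt = sum_i g(u_i - y) - k with outputs x_i = g(u_i - y), and takes g(z) = 1 for z >= 0; the step size, starting state, and function name are illustrative assumptions, and the fixed step must be smaller than the gap between the k-th and (k+1)-th largest inputs.

```python
def simulate_kwta(u, k, y0=0.0, step=0.01, max_steps=100_000):
    """Forward-Euler sketch of eps * dy/dt = sum_i g(u_i - y) - k.

    `step` plays the role of dt / eps (an assumption of this sketch).
    Returns the final state y and the 0/1 output list x.
    """
    y = y0
    for _ in range(max_steps):
        # Heaviside outputs x_i = g(u_i - y), with g(z) = 1 for z >= 0.
        x = [1 if ui >= y else 0 for ui in u]
        excess = sum(x) - k
        if excess == 0:
            # Exactly k winners: the state has reached its steady range.
            return y, x
        # Too many winners raises the threshold y; too few lowers it.
        y += step * excess
    raise RuntimeError("no convergence; try a smaller step")

# Hypothetical inputs with well-separated values.
y, x = simulate_kwta([0.9, 0.1, 0.5, 0.7, 0.3], k=2)
print(y, x)
```

Starting from a state above all inputs (for example `y0=1.0`) drives y downward instead and ends at the same output vector.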
35 Convergence Time As a linear system with a discontinuous bias, the convergence time of the kWTA network can be computed as a function of the input vector u. The expectation and variance of the convergence time can also be computed, based on the binomial distribution, as functions of the initial state.
Y. Xiao, Y. Liu, C.S. Leung, J. P.F. Sum, K. Ho, "Analysis on the convergence time of dual neural network-based kWTA," IEEE Trans. Neural Networks and Learning Systems, vol. 23, pp ,
J. P.F. Sum, C.S. Leung, K. Ho, "Effect of input noise and output node stochastic on Wang's kWTA," IEEE Trans. Neural Networks and Learning Systems, vol. 24, pp , 2013.
36 Reformulated Problem
37 Reformulated Problem (cont'd)
38 Reformulated Problem (cont'd)
39 Simulation Results with Randomized Integer Inputs
40 Simulation Results with Low Resolution Inputs
41 Initial State Estimation Although the state of the kWTA model is guaranteed to converge globally in finite time from any initial state, prior information helps to initialize the state close to the steady state. The steady state y ∈ (u_{k+1}, u_k], where u_k denotes the k-th largest input, depends on the distribution of u_1, u_2, ..., u_n, as well as on the values of k and n.
42 Initial State Estimation (cont'd) General distribution; uniform distribution; normal distribution
43 Initial State Estimation (cont'd)
44 Uniform Distribution
45 Normal Distribution
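The estimation formulas on these slides are not present in this transcript. One natural heuristic, given here purely as an assumption, is to start the state at the (1 - k/n) quantile of the assumed input distribution, since about k of n samples are expected to lie above it. The function names and parameter values below are hypothetical.

```python
from statistics import NormalDist

def initial_state_uniform(k, n, low, high):
    """(1 - k/n) quantile of Uniform(low, high), used as a starting state."""
    return high - (k / n) * (high - low)

def initial_state_normal(k, n, mean, std):
    """(1 - k/n) quantile of Normal(mean, std), used as a starting state."""
    # inv_cdf requires a probability strictly between 0 and 1, i.e. 1 <= k < n.
    return NormalDist(mean, std).inv_cdf(1.0 - k / n)

# Hypothetical settings: top quarter of uniform data, top half of normal data.
y0_uniform = initial_state_uniform(k=250, n=1000, low=0.0, high=1.0)
y0_normal = initial_state_normal(k=500, n=1000, mean=2.0, std=3.0)
print(y0_uniform, y0_normal)
```

For k = n/2 the normal estimate reduces to the mean, as expected for a symmetric distribution.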
46 Simulation Results (convergence time) with Infinity Gain
47 Simulation Results (convergence time) with Unity Gain
48 Discrete-time Version
49 Simulation Results (n = 10^6, k = n/2)
50 Simulation Results (n = 10^6, k = n/2)
51 Monte Carlo Simulation Results
52 Monte Carlo Simulation Results
53 Estimated Complexity (uniform)
54 Estimated Complexity (normal) For data with a dimension of 10^100 (1 googol), it would need about 8.44 iterations on average!
55 Histograms of Convergence Iterations
56 Histograms of Convergence Iterations
57 Histograms of Convergence Iterations
58 Histograms of Convergence Iterations
59 Sorting Operation Sorting is a fundamental process that arranges data in order according to their values. It accounts for 25% of data processing time (Knuth). For large or high-dimensional data sets, parallel sorting approaches are more desirable. Numerous sorting algorithms and models have been developed, with varying efficiency.
60 Parallel Sorting Representation For example, a permutation matrix:
61 Parallel Sorting Representation (cont'd) A modified version:
62 Logic Reversal A simple logic can be used to flip over the redundant '1' elements after the first '1' in each row; i.e.,
63 Parallel Sorting Based on kWTA Let each kWTA network compute one column of the above sorting matrix, from left to right, with k increasing from 1 to n - 1. Specifically, a WTA network with a single state variable (i.e., k = 1) is adopted to determine the largest element of the list. Next, a kWTA network with k = 2 computes the second item in the list without recounting the first item.
64 Parallel Sorting Based on kWTA As such, the whole list of n items can be sorted using n - 1 kWTA networks, without the need to compute the last item. As a result, only n - 1 neurons are needed, a substantial reduction in model complexity compared with analog sorting networks with n^2 neurons.
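The column-by-column scheme described above can be sketched sequentially as follows. Each kWTA call stands in for one network (they would run in parallel in the actual models), and the logic reversal is implemented by keeping only the entry that is newly selected in each column. The helper `kwta` uses `heapq.nlargest` as a stand-in for the neurodynamic model, and all names and sample data are illustrative assumptions.

```python
import heapq

def kwta(u, k):
    """Stand-in kWTA: 0/1 list marking the k largest entries of u."""
    winners = heapq.nlargest(k, range(len(u)), key=lambda i: u[i])
    x = [0] * len(u)
    for i in winners:
        x[i] = 1
    return x

def sort_descending_with_kwta(u):
    """Sort u in descending order using n - 1 kWTA outputs (distinct inputs assumed)."""
    n = len(u)
    # Columns for k = 1..n-1; the last column is all ones and needs no network.
    columns = [kwta(u, k) for k in range(1, n)] + [[1] * n]
    ranked = []
    previous = [0] * n
    for column in columns:
        # Logic reversal: drop the '1's already present in the previous column,
        # leaving a single '1' at the item whose rank equals this column's k.
        newcomer = [c & (1 - p) for c, p in zip(column, previous)]
        ranked.append(u[newcomer.index(1)])
        previous = column
    return ranked

result = sort_descending_with_kwta([3, 1, 4, 1.5, 9, 2.6])
print(result)
```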
65 Illustrative Example In this case, only five neurons are needed, using five kWTA networks. In contrast, 36 neurons are needed in the analog sorting network (Wang, 1995).
66 Simulation Results (state variable)
67 Simulation Results (output variables)
68 Rank-order Filter Rank-order filters are nonlinear filters with many applications, including digital image processing, speech processing, coding, and digital TV. A rank-order filter selects the input with a certain rank as its output. Rank-order filters require substantial processing power, which limits their use in real-time signal processing.
69 Rank-order Filter Based on kWTA Nevertheless, rank-order filters can benefit from parallel realizations. Specifically, a kWTA network with k = r is used in parallel with another kWTA network with k = r - 1 to select the input whose rank is r.
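A small sketch of the two-network idea: the input of rank r is the one selected by the kWTA output with k = r but not by the one with k = r - 1. The helper `kwta` again uses `heapq.nlargest` as a stand-in for the network, allows k = 0 for convenience, and the 1-D median filter with untouched borders is an illustrative assumption rather than the slides' experimental setup.

```python
import heapq

def kwta(u, k):
    """Stand-in kWTA: 0/1 list marking the k largest entries (k = 0 gives all zeros)."""
    winners = heapq.nlargest(k, range(len(u)), key=lambda i: u[i])
    x = [0] * len(u)
    for i in winners:
        x[i] = 1
    return x

def rank_select(u, r):
    """Return the r-th largest input as the difference of two kWTA outputs."""
    upper = kwta(u, r)       # network with k = r
    lower = kwta(u, r - 1)   # network with k = r - 1
    difference = [a - b for a, b in zip(upper, lower)]
    return u[difference.index(1)]

def median_filter(signal, width=3):
    """1-D median filter; samples without a full window are left unchanged."""
    half = width // 2
    r = (width + 1) // 2     # rank of the median in an odd-width window
    out = list(signal)
    for i in range(half, len(signal) - half):
        out[i] = rank_select(signal[i - half:i + half + 1], r)
    return out

third_largest = rank_select([5, 1, 9, 3, 7], 3)
filtered = median_filter([1, 9, 2, 3, 8, 4, 5])
print(third_largest, filtered)
```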
70 Simulation Results (median filter)
71 Simulation Results (median filter)
72 Simulation Results (median filter)
73 Image Processing Percentage of speckle noise in image: 10%
74 Image Filtering (cont'd)
75 Image Filtering (cont'd)
76 Image Filtering (cont'd)
77 Image Filtering (cont'd)
78 Image Filtering (cont'd) Applying the median filter to the original image. [Figure panels: original image; original image after median filtering]
79 Color Image Filtering Percentage of speckle noise in image: 10%
80 Color Image Filtering (cont'd) Percentage of speckle noise in image: 10%
81 Color Image Filtering (cont'd)
82 Color Image Filtering (cont'd)
83 Color Image Filtering (cont'd)
84 Results & Discussion: Image Processing
85 Color Image Filtering (cont'd)
86 Color Image Filtering (cont'd)
87 Information Retrieval Efficient information retrieval from large databases is essential. Techniques for retrieving information from large data sets play a very important role now that the size of the World Wide Web possibly exceeds 30 billion pages.
88 Web Information Retrieval There are basically two parts in web information retrieval. The first is calculating the weights of all the pages or data. The second is finding the k most wanted results, those with the highest weights; this is the top-k query, or front-page, problem.
89 A Toy Problem from Wikipedia 7 pages, 17 links. The PageRank weight of each page and link is provided.
90 Selection Results (k = 3) Output vector x = [1, 1, 0, 0, 1, 0, 0]^T. Pages 1, 2, and 5 have the highest PageRank weights.
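Reading the answer off the output vector amounts to listing the positions of its nonzero entries. A tiny sketch using the output vector reported on this slide (the variable names are illustrative, and pages are numbered from 1 as on the slide):

```python
# Output vector of the kWTA selection with k = 3, as reported on the slide.
x = [1, 1, 0, 0, 1, 0, 0]

# Pages are numbered from 1, so shift the 0-based positions by one.
selected_pages = [i + 1 for i, xi in enumerate(x) if xi == 1]
print(selected_pages)  # [1, 2, 5]
```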
91 Film-director-actor-writer Network Crawled from Wikipedia under the category of English-language films: 34,279 pages and 142,426 links. Part of the square adjacency matrix is shown in the figure, where a dot in the i-th column and the j-th row indicates a directed link from the i-th page to the j-th page. All other entries of the matrix are 0.
92 Selection Results (k = 10) The answer to this query, [3111, 3869, 4058, 4621, 6938, 8974, 10341, 11502, 13320, 15326]^T, can be easily obtained from the sparse representation of the output vector x, with x_i = g(u_i - y(t)), in which 10 of the elements are nonzero.
93 Conclusions and Future Works Neurodynamic optimization approaches are demonstrated to be powerful for k-winners-take-all operations. k-winners-take-all neural networks provide parallel distributed computational models with guaranteed global convergence to the optimal solutions. Neurodynamic optimization approaches are well suited to real-time applications with big data. GPU-based implementation is under way. Applications to other problems, such as recommender systems, are yet to be done.
94 Acknowledgments Prof. Yousheng Xia (Fuzhou University); Prof. Yunong Zhang (Sun Yat-sen University); Prof. Xiaolin Hu (Tsinghua University); Prof. Qingshan Liu (Huazhong Univ. of Sci. and Tech.); Dr. Shubao Liu (GE Global Research); Dr. Zheng Yan (Huawei Shannon Laboratory); Mr. Yunpeng Pan (Georgia Institute of Technology); Mr. Zhishan Guo (University of North Carolina); Mr. Shaofu Yang and Miss Xinyi Le (Chinese University of Hong Kong). Many projects funded by the Hong Kong Research Grants Council.
95 Q & A