Multidimensional Particle Swarm Optimization for Machine Learning
Research output: Book/Report › Doctoral thesis › Collection of Articles
Publisher: Tampere University of Technology
Number of pages: 69
Publication status: Published - 24 Feb 2017
Publication type: G5 Doctoral dissertation (article)
Name: Tampere University of Technology. Publication
Multiswarm versions of MD-PSO and FGBF are introduced to perform dynamic optimization tasks. In dynamic optimization, the search space slowly changes. The locations of optima move and a former local optimum may transform into a global optimum and vice versa. We exploit multiple swarms to track different optima.
To apply MD-PSO to clustering tasks, two key questions need to be answered: 1) How should the particles be encoded to represent different data partitions? 2) How should the fitness of the particles be evaluated so that it reflects the quality of the solutions proposed by the particle positions? The second question is considered especially carefully in this thesis. An extensive comparison of Clustering Validity Indices (CVIs) commonly used as fitness functions in Particle Swarm Clustering (PSC) is conducted. Furthermore, a novel approach to fitness evaluation, namely Fitness Evaluation with Computational Centroids (FECC), is introduced. FECC gives the same fitness to all particle positions that lead to the same data partition. Therefore, it may save some computational effort and, above all, it can significantly improve the results obtained by using any of the best-performing CVIs as the PSC fitness function.
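The FECC property described above can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the thesis's actual definition: the particle's centroids are used only to induce a data partition, "computational centroids" are then taken as the means of the points in each cluster, and the CVI is evaluated on those. All function names are hypothetical, and the mean quantization error is used here only as a stand-in CVI.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def partition(data, centroids):
    """Data partition induced by the centroids: one cluster label per point."""
    return [nearest(p, centroids) for p in data]

def computational_centroids(data, labels, k):
    """Mean of the points assigned to each cluster; empty clusters are dropped."""
    result = []
    for c in range(k):
        members = [p for p, label in zip(data, labels) if label == c]
        if members:
            result.append(tuple(sum(x) / len(members) for x in zip(*members)))
    return result

def quantization_error(data, centroids):
    """Stand-in CVI (assumption): mean distance from each point to its nearest centroid."""
    return sum(math.dist(p, centroids[nearest(p, centroids)]) for p in data) / len(data)

def fecc_fitness(data, particle_centroids, cvi=quantization_error):
    """Evaluate the CVI on computational centroids instead of the particle's own centroids."""
    labels = partition(data, particle_centroids)
    computed = computational_centroids(data, labels, len(particle_centroids))
    return cvi(data, computed)

data = [(0, 0), (0, 1), (10, 0), (10, 1)]
particle_a = [(1, 0.5), (9, 0.5)]   # two different particle positions ...
particle_b = [(3, 2), (8, -1)]      # ... that induce the same data partition
print(fecc_fitness(data, particle_a), fecc_fitness(data, particle_b))
print(quantization_error(data, particle_a), quantization_error(data, particle_b))
```

In this sketch the two particles receive different fitness values when the CVI is computed directly on their positions, but the same value under the FECC-style evaluation, because both lead to the same partition of the data.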
MD-PSO can also be used to evolve different neural networks. The results of training Multilayer Perceptrons (MLPs) using the common Backpropagation (BP) algorithm and a global technique based on PSO are compared. The pros and cons of BP and (MD-)PSO in MLP training are discussed. For training Radial Basis Function Neural Networks (RBFNNs), a novel technique based on class-specific clustering of the training samples is introduced. The proposed approach is compared to the common input and input-output clustering approaches and the benefits of using the class-specific approach are experimentally demonstrated. With the class-specific approach, the training complexity is reduced, while the classification performance of the trained RBFNNs may be improved.
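The idea of class-specific clustering can be sketched as follows: the training samples are grouped by class label and each group is clustered separately, so that the resulting centroids can serve as RBF centers. This is only an illustration under stated assumptions. Plain k-means with a fixed number of clusters per class is substituted for whatever clustering method the thesis uses, and the names `kmeans`, `class_specific_centers`, and `k_per_class` are hypothetical.

```python
import math
from collections import defaultdict

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; deterministic start from the first k points (assumption)."""
    centers = [points[i] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(p)
        # Recompute each center as the mean of its group; keep the old center if the group is empty.
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers

def class_specific_centers(samples, labels, k_per_class):
    """Cluster the samples of each class separately; returns {class label: list of centers}."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    return {y: kmeans(pts, min(k_per_class, len(pts))) for y, pts in by_class.items()}

samples = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = ["a", "a", "b", "b"]
centers = class_specific_centers(samples, labels, k_per_class=1)
print(centers)
```

Each clustering run in this sketch sees only the samples of a single class, which is one way to read the reduced training complexity mentioned above.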
The Collective Network of Binary Classifiers (CNBC) is an evolutionary semantic classifier consisting of several Networks of Binary Classifiers (NBCs), each trained to recognize a certain semantic class. The NBCs in turn consist of several Binary Classifiers (BCs), each trained for a certain feature type. Thanks to its topology and the use of MD-PSO as its evolution technique, incremental training can easily be applied to add new training items, classes, and/or features.
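The topology described above (one NBC per semantic class, one BC per feature type inside each NBC) can be sketched as a small data structure. The sketch models only the structure, not the MD-PSO-based evolution of the classifiers; the class and method names, the `trained` flag, and the example class and feature names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class BinaryClassifier:
    """BC: handles one feature type within one NBC."""
    feature_type: str
    trained: bool = False

@dataclass
class NBC:
    """Network of Binary Classifiers: one semantic class, one BC per feature type."""
    semantic_class: str
    bcs: dict = field(default_factory=dict)

@dataclass
class CNBC:
    """Collective network: one NBC per semantic class."""
    nbcs: dict = field(default_factory=dict)
    feature_types: list = field(default_factory=list)

    def add_feature(self, feature_type):
        """Add a new, untrained BC for this feature type to every existing NBC."""
        if feature_type in self.feature_types:
            return
        self.feature_types.append(feature_type)
        for nbc in self.nbcs.values():
            nbc.bcs[feature_type] = BinaryClassifier(feature_type)

    def add_class(self, semantic_class):
        """Add a new NBC with one untrained BC per known feature type."""
        if semantic_class in self.nbcs:
            return
        bcs = {f: BinaryClassifier(f) for f in self.feature_types}
        self.nbcs[semantic_class] = NBC(semantic_class, bcs)

net = CNBC()
net.add_feature("color")
net.add_class("beach")
net.nbcs["beach"].bcs["color"].trained = True  # pretend this BC has been evolved
net.add_feature("texture")                     # new feature: new BCs only
net.add_class("forest")                        # new class: new NBC only
```

In this sketch, adding a class or a feature type creates new untrained units and leaves the existing ones in place, which is why the topology lends itself to incremental training. Whether and how existing units are further evolved after such an addition is not modeled here.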
In feature synthesis, the objective is to exploit ground truth information to transform the original low-level features into more discriminative ones. To learn an efficient synthesis for a dataset, only a fraction of the data needs to be labeled. The learned synthesis can then be applied to unlabeled data to improve classification or retrieval results. In this thesis, two different feature synthesis techniques are introduced. In the first one, MD-PSO is directly used to find proper arithmetic operations to be applied to the elements of the original low-level feature vectors. In the second approach, feature synthesis is carried out using one-against-all perceptrons. In the latter technique, the best results were obtained when MD-PSO was used to train the perceptrons.
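To make the first technique more concrete, the following sketch shows what applying a learned synthesis to a feature vector could look like. The encoding is an assumption made for illustration only: a synthesis rule is a list of `(i, op, j)` triples, each producing one output feature from two elements of the original vector. The abstract does not specify the actual encoding or operator set, and the search for a good rule (the part performed by MD-PSO) is not shown.

```python
import operator

# Hypothetical operator set; the thesis may use different or additional operations.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def synthesize(features, rule):
    """Apply a synthesis rule to one feature vector.

    rule is a list of (i, op, j) triples; each triple yields one output
    feature computed as features[i] op features[j].
    """
    return [OPS[op](features[i], features[j]) for i, op, j in rule]

rule = [(0, "+", 1), (2, "*", 1), (2, "-", 0)]
synthesized = synthesize([1.0, 2.0, 3.0], rule)
print(synthesized)
```

Once such a rule has been learned from the labeled fraction of the data, the same function can be applied unchanged to unlabeled feature vectors.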
In all the mentioned applications excluding MLP training, MD-PSO is used together with FGBF. Overall, MD-PSO and FGBF are indeed versatile tools in machine learning. However, computational limitations constrain their use in currently emerging machine learning systems operating on Big Data. Therefore, in the future, it will be necessary to divide complex tasks into smaller subproblems for which the use of MD-PSO and FGBF is feasible, and to conquer the large problems by solving these subproblems. Several applications discussed in this thesis already exploit this divide-and-conquer model.