A new approach to unsupervised learning
Evolving technologies have brought about an explosion of information in recent years, but the question of how such information might be effectively harvested, archived, and analyzed remains a monumental challenge, for the processing of such information is often fraught with the need for conceptual interpretation: a relatively simple task for humans, yet an arduous one for computers.
Inspired by the relative success of existing popular research on self-organizing neural networks for data clustering and feature extraction, Unsupervised Learning: A Dynamic Approach presents information within the family of generative, self-organizing maps, such as the self-organizing tree map (SOTM) and the more advanced self-organizing hierarchical variance map (SOHVM). It covers a series of pertinent, real-world applications with regard to the processing of multimedia data, from its role in generic image processing techniques, such as the automated modeling and removal of impulse noise in digital images, to problems in digital asset management and its various roles in feature extraction, visual enhancement, segmentation, and analysis of microbiological image data.
Self-organization concepts and applications discussed include:
Distance metrics for unsupervised clustering
Synaptic self-amplification and competition
Image retrieval
Impulse noise removal
Microbiological image analysis
Unsupervised Learning: A Dynamic Approach introduces a new family of unsupervised algorithms that have a basis in self-organization, making it an invaluable resource for researchers, engineers, and scientists who want to create systems that effectively model oppressive volumes of data with little or no user intervention.
Acknowledgments
1 Introduction
1.1 Part I: The Self-Organizing Method
1.2 Part II: Dynamic Self-Organization for Image Filtering and Multimedia Retrieval
1.3 Part III: Dynamic Self-Organization for Image Segmentation and Visualization
1.4 Future Directions
2 Unsupervised Learning
2.1 Introduction
2.2 Unsupervised Clustering
2.3 Distance Metrics for Unsupervised Clustering
2.4 Unsupervised Learning Approaches
2.4.1 Partitioning and Cluster Membership
2.4.2 Iterative Mean-Squared Error Approaches
2.4.3 Mixture Decomposition Approaches
2.4.4 Agglomerative Hierarchical Approaches
2.4.5 Graph-Theoretic Approaches
2.4.6 Evolutionary Approaches
2.4.7 Neural Network Approaches
2.5 Assessing Cluster Quality and Validity
2.5.1 Cost Function-Based Cluster Validity Indices
2.5.2 Density-Based Cluster Validity Indices
2.5.3 Geometric-Based Cluster Validity Indices
3 Self-Organization
3.1 Introduction
3.2 Principles of Self-Organization
3.2.1 Synaptic Self-Amplification and Competition
3.2.2 Cooperation
3.2.3 Knowledge Through Redundancy
3.3 Fundamental Architectures
3.3.1 Adaptive Resonance Theory
3.3.2 Self-Organizing Map
3.4 Other Fixed Architectures for Self-Organization
3.4.1 Neural Gas
3.4.2 Hierarchical Feature Map
3.5 Emerging Architectures for Self-Organization
3.5.1 Dynamic Hierarchical Architectures
3.5.2 Nonstationary Architectures
3.5.3 Hybrid Architectures
3.6 Conclusion
4 Self-Organizing Tree Map
4.1 Introduction
4.2 Architecture
4.3 Competitive Learning
4.4 Algorithm
4.5 Evolution
4.5.1 Dynamic Topology
4.5.2 Classification Capability
4.6 Practical Considerations, Extensions, and Refinements
4.6.1 The Hierarchical Control Function
4.6.2 Learning, Timing, and Convergence
4.6.3 Feature Normalization
4.6.4 Stop Criteria
4.7 Conclusions
5 Self-Organization in Impulse Noise Removal
5.1 Introduction
5.2 Review of Traditional Median-Type Filters
5.3 The Noise-Exclusive Adaptive Filtering
5.3.1 Feature Selection and Impulse Detection
5.3.2 Noise Removal Filters
5.4 Experimental Results
5.5 Detection-Guided Restoration and Real-Time Processing
5.5.1 Introduction
5.5.2 Iterative Filtering
5.5.3 Recursive Filtering
5.5.4 Real-Time Processing of Impulse Corrupted TV Pictures
5.5.5 Analysis of the Processing Time
5.6 Conclusions
6 Self-Organization in Image Retrieval
6.1 Retrieval of Visual Information
6.2 Visual Feature Descriptor
6.2.1 Color Histogram and Color Moment Descriptors
6.2.2 Wavelet Moment and Gabor Texture Descriptors
6.2.3 Fourier and Moment-based Shape Descriptors
6.2.4 Feature Normalization and Selection
6.3 User-Assisted Retrieval
6.3.1 Radial Basis Function Method
6.4 Self-Organization for Pseudo Relevance Feedback
6.5 Directed Self-Organization
6.5.1 Algorithm
6.6 Optimizing Self-Organization for Retrieval
6.6.1 Genetic Principles
6.6.2 System Architecture
6.6.3 Genetic Algorithm for Feature Weight Detection
6.7 Retrieval Performance
6.7.1 Directed Self-Organization
6.7.2 Genetic Algorithm Weight Detection
6.8 Summary
7 The Self-Organizing Hierarchical Variance Map
7.1 An Intuitive Basis
7.2 Model Formulation and Breakdown
7.2.1 Topology Extraction via Competitive Hebbian Learning
7.2.2 Local Variance via Hebbian Maximal Eigenfilters
7.2.3 Global and Local Variance Interplay for Map Growth and Termination
7.3 Algorithm
7.3.1 Initialization, Continuation, and Presentation
7.3.2 Updating Network Parameters
7.3.3 Vigilance Evaluation and Map Growth
7.3.4 Topology Adaptation
7.3.5 Node Adaptation
7.3.6 Optional Tuning Stage
7.4 Simulations and Evaluation
7.4.1 Observations of Evolution and Partitioning
7.4.2 Visual Comparisons with Popular Mean-Squared Error Architectures
7.4.3 Visual Comparison Against Growing Neural Gas
7.4.4 Comparing Hierarchical with Tree-Based Methods
7.5 Tests on Self-Determination and the Optional Tuning Stage
7.6 Cluster Validity Analysis on Synthetic and UCI Data
7.6.1 Performance vs. Popular Clustering Methods
7.6.2 IRIS Dataset
7.6.3 WINE Dataset
7.7 Summary
8 Microbiological Image Analysis Using Self-Organization
8.1 Image Analysis in the Biosciences
8.1.1 Segmentation: The Common Denominator
8.1.2 Semi-supervised versus Unsupervised Analysis
8.1.3 Confocal Microscopy and Its Modalities
8.2 Image Analysis Tasks Considered
8.2.1 Visualising Chromosomes During Mitosis
8.2.2 Segmenting Heterogeneous Biofilms
8.3 Microbiological Image Segmentation
8.3.1 Effects of Feature Space Definition
8.3.2 Fixed Weighting of Feature Space
8.3.3 Dynamic Feature Fusion During Learning
8.4 Image Segmentation Using Hierarchical Self-Organization
8.4.1 Gray-Level Segmentation of Chromosomes
8.4.2 Automated Multilevel Thresholding of Biofilm
8.4.3 Multidimensional Feature Segmentation
8.5 Harvesting Topologies to Facilitate Visualization
8.5.1 Topology Aware Opacity and Gray-Level Assignment
8.5.2 Visualization of Chromosomes During Mitosis
8.6 Summary
9 Closing Remarks and Future Directions
9.1 Summary of Main Findings
9.1.1 Dynamic Self-Organization: Effective Models for Efficient Feature Space Parsing
9.1.2 Improved Stability, Integrity, and Efficiency
9.1.3 Adaptive Topologies Promote Consistency and Uncover Relationships
9.1.4 Online Selection of Class Number
9.1.5 Topologies Represent a Useful Backbone for Visualization or Analysis
9.2 Future Directions
9.2.1 Dynamic Navigation for Information Repositories
9.2.2 Interactive Knowledge-Assisted Visualization
9.2.3 Temporal Data Analysis Using Trajectories
Appendix A
A.1 Global and Local Consistency Error
References
Index