# ECE471-571 Lecture 1 - Introduction

ECE471-571 Pattern Recognition Lecture 12 Unsupervised Learning (Clustering) Hairong Qi, Gonzalez Family Professor Electrical Engineering and Computer Science University of Tennessee, Knoxville http://www.eecs.utk.edu/faculty/qi Email: [email protected] Pattern Classification Statistical Approach Supervised Basic concepts: Baysian decision rule (MPP, LR, Discri.)

Non-Statistical Approach Unsupervised Basic concepts: Distance Agglomerative method Parameter estimate (ML, BL) k-means Non-Parametric learning (kNN) Winner-takes-all LDF (Perceptron)

Kohonen maps NN (BP) Mean-shift Decision-tree Syntactic approach Support Vector Machine Deep Learning (DL) Dimensionality Reduction FLD, PCA

Performance Evaluation ROC curve (TP, TN, FN, FP) cross validation Stochastic Methods local opt (GD) global opt (SA, GA) Classifier Fusion majority voting NB, BKS Review - Bayes Decision Rule P (w j | x)= Maximum Posterior

Probability Likelihood Ratio Discriminant Function p(x | w j )P (w j ) p(x) For a given x, if P(w1|x) > P(w2|x), then x belongs to class 1, otherw If , then x belongs to class 1, otherwise, 2. The classifier will assign a feature vector x to class w i if gi (x)> gj (x) Case 1: Minimum Euclidean Distance (Linear Machine), i=2I

Case 2: Minimum Mahalanobis Distance (Linear Machine), i = Case 3: Quadratic classifier , i = arbitrary Non-parametric kNN For a given x, if k1/k > k2/k, then x belongs to class 1, otherwise 2 Estimate Gaussian (Maximum Likelihood Estimation, MLE), Two-modal Gaussian Dimensionality reduction Performance evaluation ROC curve 3

Unsupervised Learning Whats unknown? In the training set, which class does each sample belong to? For the problem in general, how many classes (clusters) is appropriate? 4 Clustering Algorithm Agglomerative clustering Step1: assign each data point in the training set to a separate cluster Step2: merge the two closest clusters Step3: repeat step2 until you get the number of

clusters you want or the appropriate cluster number The result is highly dependent on the measure of cluster distance 5 Distance from a Point to a Cluster Euclidean distance deuc (x, A)= x - m A City block distance Squared Mahalanobis distance T

dmah (x, A)=(x - m A ) -A1 (x - m A ) 6 Distance between Clusters The centroid distance dmean (A, B)= m A - m B Nearest neighbor measure dmin (A, B)=min a,b deuc (a, b) for a A, b B Furthest neighbor measure dmax (A, B)=max a,b deuc (a, b) for a A, b B 7

Example dmax dmin C C B B A A

8 Minimum Spanning Tree Step1: compute all edges in the graph Step2: sort the edges by length Step3: beginning with the shortest edge, for each edge between nodes u and v, perform the following operations: Step3.1: A = find(u) (A is the cluster where u is in) Step3.2: B = find(v) (B is the cluster where v is in) Step3.3: if (A!=B) C=union(A, B), and erase sets A and B 9

Comparison of Shape of Clusters dmin tends to choose clusters which are ?? dmax tends to choose clusters which are ?? 10 Example 11 The k-means Algorithm Step1: Begin with an arbitrary assignment of samples to clusters or begin with an arbitrary set of cluster centers and assign samples to nearest clusters

Step2: Compute the sample mean of each cluster Step3: Reassign each sample to the cluster with the nearest mean Step4: If the classification of all samples has not changed, stop; else go to step 2. 12 Winner-take-all Approach Begin with an arbitrary set of cluster centers wi For each sample x, find the nearest cluster center w, which is called the winner. Modify w using wnew = wold + (x - wold) is known as a learning parameter. Typical values of this parameter are small, on the order of 0.01.

13 Winner-take-all x x-w w 14 *Kohonen Feature Maps (NN) An extension of the winner-take-all algorithm. Also called self-organizing feature maps A problem-dependent topological distance is assumed to exist between each pair of the cluster centers

When the winning cluster center is updated, so are its neighbors in the sense of this topological distance. 15 *SOM A Demo x x-w w w w w

w 16 SOM The Algorithm The winning cluster center and its neighbors are trained based on the following formula wrk+1 =wrk + (k)F(k)(x - wrk ) ware the cluster centers r gware k the coordinate of r

the cluster centers Learning Rate min kmax (as k inc, dec, (k)= max gw winneris the coordinate of more stable) max the winner g - g wr w winner F(k)=exp 2

2 2 The closer (topological closeness) the

neighbor, the more it will be affected. 17 SOM - An Example http://www.ai-junkie.com/ann/som/som1.html 18 *Mean Shift Clustering Originally proposed in 1975 by Fukunaga and Hostetler for mode detection Chengs generalization to solve clustering problems in 1995 Non-parametric no prior knowledge needed about

number of clusters Key parameter: window size Challenging issue: How to determine the right window size? Slow convergence 19 Mean Shift Clustering 1. Initialization: Choose a window/kernel of size h, e.g., a flat kernel, and apply the window on each data point, x 2. Mean calculation: Within each window centered at x, compute the mean of data, where x is the set of points enclosed within window h 3. Mean shift: Shift the window to the mean, i.e., x=m(x), where the

difference m(x)-x is referred to as the mean shift. 4. If ||m(x)-x||>, go back to step 2. 20 Original | RGB plot k=10, 24, 64 (kmeans) Init (random | uniform), k=6 Mean shift (h=30 14 clusters) Mean shift (h=20 35 clusters) 21

## Recently Viewed Presentations

• HISTORICAL GEOLOGY What does Historical Geology deal with? Earth's origin and evolution Changes in the distribution of lands and seas, Growth and reduction of mountains, and Succession of animals and plants through time Why is Historical Geology important? The present...
• Hamlet's Soliloquy in Act 1 Scene 2. Find the quotations in Hamlet's soliloquy that mean the following: He thinks of life as foul and tedious. He thinks of his father's powerful love for his mother. He remembers how passionately his...
• Lamiaceae: Mint Family. Square stalks, simple opposite leaves. Common Constiuent: Aromatic/Volatile Oils (anti-microbial, promote sweating) 5 sepals (fused so only the tips are separate), 5 fused petals (asymmetrical and irregular sizes, but usually 2 petals up, 3 petals down.
• How EM Waves are Produced. ... Light is a major energy source for our planet. Plants use it to make food. Animals eat these plants. ... Television has color because each "dot" has a group of red, green, and blue...
• Constantia Arial Calibri Wingdings 2 Times New Roman Flow 1_Flow 2_Flow 3_Flow "Exploring Electricity" Agenda Squishy Circuits Additional Activities with Clay Batteries-Voltaic Piles Voltaic cells, Penny activities Batteries- Lemon Cells EKI Electronic Kits Additional information for Mr. Circuit Thank you
• GENERAL FINANCE INFORMATION. A W9 and an ISU Non-Cash Prize or Award Documentation form are required for all gift cards and/or prizes such as iPads, mini fridges, giftbaskets, etc
• To synthesise the findings an insights workshop was held. This workshop was attended by front line service providers including Woden community services, Victim Support, Relationships Australia, Everyman, Legal Aid, DVCS, West Belconnen Child and Family Centre, Toora, YWCA. From the...
• Special Case of International Relations or Special Actor in International Relations? EU teacher training 9 July 2010, UNC Chapel Hill ... EU institutions, and the external world A Special Case of IR? (continued) Constructivist Approach: EU foreign policy is rooted...