PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Thursday, August 18, 2022

[FIXED] How to get the optimal number of clusters from the clusGap function as an output?

 August 18, 2022     cluster-analysis, cluster-computing, list, output, r     No comments   

Issue

I have a data frame with 2 variables and I want to use the clusGap function to find the number of clusters that would be optimal to use. This code has a similar result:

library(cluster)    
x <- as.vector(runif(100, 0, 1))
y <- as.vector(runif(100, 0, 1))
df <- data.frame(x, y)
gap_stat <- clusGap(df, FUN = kmeans, nstart = n,
                    K.max = 10, B = 50)
gap_stat

Result:

Clustering Gap statistic ["clusGap"] from call:
clusGap(x = df, FUNcluster = kmeans, K.max = 10, B = 50, nstart = n)
B=50 simulated reference sets, k = 1..10; spaceH0="scaledPCA"
 --> Number of clusters (method 'firstSEmax', SE.factor=1): 1
          logW   E.logW           gap     SE.sim
 [1,] 2.569315 2.584217  0.0149021144 0.03210076
 [2,] 2.285049 2.284537 -0.0005116382 0.03231529
 [3,] 2.053193 2.033653 -0.0195399122 0.03282376
 [4,] 1.839085 1.835590 -0.0034952935 0.03443303
 [5,] 1.691219 1.708479  0.0172603348 0.03419994
 [6,] 1.585084 1.597277  0.0121935992 0.03440672
 [7,] 1.504763 1.496853 -0.0079104306 0.03422321
 [8,] 1.416176 1.405903 -0.0102731340 0.03371149
 [9,] 1.333721 1.323658 -0.0100626869 0.03245958
[10,] 1.253199 1.250366 -0.0028330498 0.03034140

As you can see in line 4, the optimal number of clusters is 1. I would like the function to have 1 as an output. I need the optimal number of outputs to be an object in the environment, such as n is 1.


Solution

Typically such information is somewhere directly inside the object, like gap_stat$nc. To look for it str(gap_stat) would typically suffice.

In this case, however, the above strategy isn't enough. But the fact that you can see your number of interest in the output, means that print.clusGap (because the class of gap_stat is clusGap) will show how to obtain this number. So, inspecting cluster:::print.clusGap leads to

maxSE(f = gap_stat$Tab[, "gap"], SE.f = gap_stat$Tab[, "SE.sim"])
# [1] 1


Answered By - Julius Vainora
Answer Checked By - Candace Johnson (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing