PHPFixing

Thursday, October 6, 2022

[FIXED] What does nearZeroVar do in R?

Tags: r, r-caret, statistics, variance

Issue

I have a rather large dataset from which I would like to exclude columns with very low variance, which is why I would like to use the function nearZeroVar. However, I have some trouble understanding what freqCut and uniqueCut do and how they influence each other. I have already read the R documentation for the function, but that does not really help me with this. If anyone could explain it to me, I would be very thankful!


Solution

A variable that barely varies behaves like a constant and contributes little to prediction. Such a variable has a variance close to zero, hence the name of the function.

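The two parameters do not influence each other; each covers one symptom of a near-zero-variance variable. freqCut is the cutoff for the ratio of the most common value's frequency to the second most common value's frequency, and uniqueCut is the cutoff for the percentage of distinct values out of the number of rows. A column is excluded only if it fails both criteria: its frequency ratio is above freqCut and its percentage of unique values is at or below uniqueCut.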
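To make the "both criteria" rule concrete, here is a minimal base-R sketch of the decision for a single column. It is an approximation written for illustration, not caret's source code; the helper name flag_near_zero_var is hypothetical, and only the parameter names freqCut and uniqueCut and their defaults are taken from nearZeroVar.

```r
# Illustrative sketch of the flagging rule for one column (hypothetical helper,
# not caret's implementation).
flag_near_zero_var <- function(x, freqCut = 95/5, uniqueCut = 10) {
  n_total <- length(x)
  counts <- table(x)                      # table() ignores NA by default
  if (length(counts) <= 1) return(TRUE)   # constant column: always flagged
  counts <- sort(counts, decreasing = TRUE)
  freq_ratio <- counts[[1]] / counts[[2]]          # most common vs. second most common
  percent_unique <- 100 * length(counts) / n_total # distinct values as a percentage
  freq_ratio > freqCut && percent_unique <= uniqueCut
}

col2 <- rep(c(1, 2), c(8, 1))  # eight 1s and one 2

flag_near_zero_var(rep(1, 9))                          # TRUE: constant
flag_near_zero_var(col2)                               # FALSE with the defaults
flag_near_zero_var(col2, freqCut = 7)                  # FALSE: only one criterion fails
flag_near_zero_var(col2, uniqueCut = 30)               # FALSE: only one criterion fails
flag_near_zero_var(col2, freqCut = 7, uniqueCut = 30)  # TRUE: both criteria fail
```

Relaxing just one of the two cutoffs leaves the column in place, which is why the thresholds are usually adjusted together.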

Let's use an example:

library(caret)  # nearZeroVar() comes from the caret package
mat = cbind(1,rep(c(1,2),c(8,1)),rep(1:3,3),1:9)
mat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1    2    2
 [3,]    1    1    3    3
 [4,]    1    1    1    4
 [5,]    1    1    2    5
 [6,]    1    1    3    6
 [7,]    1    1    1    7
 [8,]    1    1    2    8
 [9,]    1    2    3    9

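With the defaults, freqCut = 95/5 (the most common value must be more than 19 times as frequent as the second most common) and uniqueCut = 10 (at most 10% unique values), only the 1st column is taken out. It is constant, and constant columns are always flagged: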

nearZeroVar(mat)
[1] 1

Now look at the 2nd column: the ratio of the most common value to the second most common is 8/1 = 8, and it has 2 unique values in 9 rows, i.e. 2/9 = 22% unique. For it to be filtered out, freqCut has to be below 8 and uniqueCut has to be above 22, so you need to change both settings:

nearZeroVar(mat,freqCut=7/1,uniqueCut=30)
[1] 1 2
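The numbers quoted for column 2 can be checked by hand. The sketch below recomputes the two quantities for every column of mat in base R; the helper column_metrics is a hypothetical name, and returning 0 as the ratio for a constant column is an assumption made here for convenience. nearZeroVar also accepts saveMetrics = TRUE, which reports the frequency ratio and percentage of unique values per column directly.

```r
# Rebuild the example matrix so this block runs on its own.
mat <- cbind(1, rep(c(1, 2), c(8, 1)), rep(1:3, 3), 1:9)

# Hypothetical helper: frequency ratio and percent unique for one column.
column_metrics <- function(x) {
  counts <- table(x)
  ratio <- 0  # assumption: report 0 when there is no second value to compare with
  if (length(counts) > 1) {
    counts <- sort(counts, decreasing = TRUE)
    ratio <- counts[[1]] / counts[[2]]
  }
  c(freqRatio = ratio, percentUnique = 100 * length(counts) / length(x))
}

# One row per column of mat, one column per metric.
metrics <- t(apply(mat, 2, column_metrics))
round(metrics, 1)
# Column 2 has freqRatio 8 and percentUnique of about 22.2, so it is flagged
# only when freqCut < 8 and uniqueCut >= 22.2.
```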

Lastly, columns 3 and 4 are ones you most likely should not filter out. Column 3 is only flagged once we set nonsensical thresholds (column 4, with 100% unique values, still passes):

nearZeroVar(mat,freqCut=0.1,uniqueCut=50)
[1] 1 2 3
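Since the goal in the question is to exclude the flagged columns, here is a minimal sketch of that last step in base R. The index vector is copied from the nearZeroVar(mat,freqCut=7/1,uniqueCut=30) output shown earlier, and drop_columns is a hypothetical helper. The guard matters because negative indexing with an empty vector selects no columns at all rather than keeping all of them.

```r
# Rebuild the example matrix so this block runs on its own.
mat <- cbind(1, rep(c(1, 2), c(8, 1)), rep(1:3, 3), 1:9)

# Indices as returned by nearZeroVar(mat, freqCut = 7/1, uniqueCut = 30) in the example.
flagged <- c(1, 2)

# Hypothetical helper: drop the flagged columns, leaving x untouched if none are flagged.
drop_columns <- function(x, idx) {
  if (length(idx) == 0) return(x)  # x[, -integer(0)] would return zero columns
  x[, -idx, drop = FALSE]          # drop = FALSE keeps the result a matrix
}

filtered <- drop_columns(mat, flagged)
dim(filtered)  # 9 rows, 2 columns remain
```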


Answered By - StupidWolf
Answer Checked By - Senaida (PHPFixing Volunteer)