PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Tuesday, November 1, 2022

[FIXED] How to group data using two columns in r

 November 01, 2022     data-manipulation, dataframe, group-by, pivot, r     No comments   

Issue

I have sample data

s_name <- c("AL", "AL", "CA", "CA", "WI", "WI", "NJ", "NJ", "UT", "UT")
n_unit <- c(40, 30, 150, 110, 45, 80, 70, 40, 50, 90)
li_unit <- c(30, 30, 70, 40, 15, 80, 50, 40, 45, 45)
pv_lvl <- c("High", "Very High", "Medium", "Low", "Very Low", "Medium", "Very High", "low", "Very Low", "High")

mydata <- as.data.frame(cbind(s_name, n_unit, li_unit, pv_lvl))
mydata$n_unit <- as.numeric(mydata$n_unit)
mydata$li_unit <- as.numeric(mydata$li_unit)

mydata$per_li = mydata$li_unit/mydata$n_unit*100

print(mydata)

What I am trying to generate is a table that shows the percent of li_unit in each type of pv_lvl grouped by s_name.
Something like this:

s_name Very Low    Low    Medium   High    Very High
AL     0.00        0.00   0.00     75.00   100.00 
CA     0.00        36.36  46.67    0.00    0.00
WI     33.33       0.00   100.00   0.00    0.00 
NJ     100.00      0.00   0.00     0.00    71.43
UT     90.00       0.00   0.00     50.00   0.00

I tried the group_by function but did not get the desired result.
Thanks for your time and help!


Solution

Here's the standard pivot_wider way:

library(tidyr)
mydata$pv_lvl <- factor(mydata$pv_lvl, levels = c("Very Low", "Low", "Medium", "High", "Very High"))

pivot_wider(mydata, s_name, 
            names_from = pv_lvl, 
            values_from = per_li, 
            values_fill = 0,
            names_sort = TRUE)

output

# A tibble: 5 × 6
  s_name `Very Low`   Low Medium  High `Very High`
  <chr>       <dbl> <dbl>  <dbl> <dbl>       <dbl>
1 AL            0     0      0      75       100  
2 CA            0    36.4   46.7     0         0  
3 WI           33.3   0    100       0         0  
4 NJ            0   100      0       0        71.4
5 UT           90     0      0      50         0  

Note that you don't have to have to use cbind to create the dataframe, mydata <- data.frame(s_name, n_unit, li_unit, pv_lvl) is enough. Also, I created a factor out of pv_lvl so that you can sort them as desired using names_sort in pivot_wider.



Answered By - Maël
Answer Checked By - Cary Denson (PHPFixing Admin)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing