PHPFixing

Thursday, October 6, 2022

[FIXED] How to partition into equal sum subsets in R?

data-wrangling, r, statistics

Issue

I have a dataset with a column, X1, of various values. I would like to order this dataset by the value of X1 and then partition it into K subsets with equal sums. How can this be accomplished in R? I am able to find quartiles for X1 and append the quartile groupings as a new column to the dataset; however, quartiles are not quite what I'm looking for. Thank you in advance!

df <- data.frame(replicate(10, sample(0:1000, 1000, replace = TRUE)))
df <- within(df, quartile <- as.integer(cut(X1, quantile(X1, probs = 0:4 / 4), include.lowest = TRUE)))

Solution

Here's a rough solution (call set.seed(47) before generating the data if you want to reproduce it exactly). I calculate each row's proportion of the total sum, take the cumulative sum of those proportions, and then cut the result into the desired number of buckets.

library(dplyr)

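# Note: cut() with a single number for `breaks` creates that many intervals,
# so `n_groups - 1` yields 9 bins here; use `breaks = n_groups` for exactly 10.
n_groups <- 10

df %>% arrange(X1) %>%
  mutate(
    prop = X1 / sum(X1),    # each row's share of the total
    cprop = cumsum(prop),   # running share after sorting by X1
    bins = cut(cprop, breaks = n_groups - 1)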
  ) %>%
  group_by(bins) %>% 
  summarize(
    group_n = n(),
    group_sum = sum(X1)
  )
# # A tibble: 9 × 3
#   bins           group_n group_sum
#   <fct>            <int>     <int>
# 1 (-0.001,0.111]     322     54959
# 2 (0.111,0.222]      141     54867
# 3 (0.222,0.333]      111     55186
# 4 (0.333,0.444]       92     55074
# 5 (0.444,0.556]       80     54976
# 6 (0.556,0.667]       71     54574
# 7 (0.667,0.778]       66     55531
# 8 (0.778,0.889]       60     54731
# 9 (0.889,1]           57     55397

This could of course be simplified: you don't need to keep the extra columns. Just mutate(bins = cut(cumsum(X1 / sum(X1)), breaks = n_groups - 1)) will add the bins column to the original data (and no other columns). The group_by() %>% summarize() step is only there to diagnose the result.
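
As a minimal sketch of that simplified form, the snippet below adds a single bin column to the sorted data. It makes two assumptions that are not in the answer above: it passes breaks = n_groups (rather than n_groups - 1) so that the number of bins matches n_groups, and it uses labels = FALSE so the bins come back as integer codes instead of interval labels. The names df_binned, bin, and target are illustrative only.

```r
library(dplyr)

set.seed(47)  # seed mentioned in the answer; any seed works
df <- data.frame(replicate(10, sample(0:1000, 1000, replace = TRUE)))

n_groups <- 10

df_binned <- df %>%
  arrange(X1) %>%
  mutate(
    # Assumption: breaks = n_groups gives exactly n_groups intervals;
    # labels = FALSE returns integer codes 1..n_groups
    bin = cut(cumsum(X1 / sum(X1)), breaks = n_groups, labels = FALSE)
  )

# Diagnostic only: each group's sum should be close to sum(X1) / n_groups
group_check <- df_binned %>%
  group_by(bin) %>%
  summarize(group_n = n(), group_sum = sum(X1)) %>%
  mutate(target = sum(df$X1) / n_groups)

group_check
```

Because the rows are sorted by X1 before the cumulative sum is taken, the low-numbered bins hold many small values and the high-numbered bins hold few large ones. The sums are only approximately equal, since bin boundaries can only fall between rows.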



Answered By - Gregor Thomas
Answer Checked By - David Marino (PHPFixing Volunteer)