Wednesday, April 27, 2022

[FIXED] Why do I get length warning when I use a function through dplyr::mutate, when the function works well standalone?

Issue

I have been trying to add a new column to an existing data frame using a function that depends on both a numeric ('double') vector and a column of that data frame. A reproducible example:

library('tidyverse')
set.seed(123)
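# b: 50 sorted values drawn without replacement from 20, 20.5, ..., 50
b <- sort(sample(seq(20, 50, by = 0.5), size = 50))
# f: number of entries of b that fall in the interval [a, a + 5)
f <- function(a) sum((b >= a) & (b < a + 5), na.rm = TRUE)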

x <- c(21, 23, 27, 31, 37, 39)
y <- c(23, 26, 29, 32, 39, 45)
XY <- data.frame(x, y)

XY %>% mutate(c= f(x))

In my actual problem, b has length 4321, and XY$x and XY$y each have length 180. When I call the function f on its own with various single values, I get results without any problem. As soon as I use it inside mutate, however, I not only get the length warning,

Warning message:
“Problem with `mutate()` input `c`.
ℹ longer object length is not a multiple of shorter object length

but the results in the mutated column c aren't accurate either. My guess is that the length of b is causing this problem, but the function f only has to count the entries of b that satisfy the given conditions, so why do I get a warning about lengths? I'd like to understand what is going on with the warning and how to work around it.


Solution

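The function is not vectorized over its argument a: inside mutate(), f receives the entire column x at once, so b is compared against all of x with recycling instead of against one value at a time. You need to apply it rowwise: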

library(dplyr)
XY %>% rowwise() %>% mutate(c = f(x))
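
To see where the warning comes from, and how the same fix looks without rowwise(), here is a minimal base R sketch. It is an illustration under stated assumptions, not part of the original answer: b is replaced by a small deterministic vector (an assumption, so the counts are easy to check by hand), and f_vec is a hypothetical name introduced only for this example.

```r
# Deterministic stand-in for b (assumption: the question's b is random).
# 21 values from 20 to 30 in steps of 0.5, then 20 values from 31 to 50.
b <- c(seq(20, 30, by = 0.5), seq(31, 50, by = 1))   # length 41
f <- function(a) sum((b >= a) & (b < a + 5), na.rm = TRUE)

x <- c(21, 23, 27, 31, 37, 39)

# Scalar input: every element of b is compared against the same a.
f(21)

# Vector input (what mutate() passes): b (length 41) and x (length 6) are
# compared element by element with recycling. 41 is not a multiple of 6,
# hence the warning, and sum() collapses everything to ONE number, which
# mutate() would then repeat in every row.
recycled <- suppressWarnings(f(x))
length(recycled)   # 1

# Applying f to one element of x at a time gives one count per row.
counts <- vapply(x, f, integer(1))
counts

# Vectorize() wraps f so that it loops over its argument internally.
f_vec <- Vectorize(f)   # hypothetical name, used only in this sketch
f_vec(x)
```

With dplyr loaded, either form should work inside a plain mutate() without rowwise(), for example XY %>% mutate(c = sapply(x, f)) or XY %>% mutate(c = f_vec(x)). All of these still loop over the rows under the hood; they only differ in where the loop is written.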


Answered By - Ronak Shah
Answer Checked By - Mary Flores (PHPFixing Volunteer)
