Tuesday, November 15, 2022

[FIXED] Why is this conditional statement in R dplyr generating an error message when it is generating the correct output and there are no NAs?

November 15, 2022 error-handling, if-statement, na, r

Issue

Please refer to Mikko Marttila's answer below, where he highlights the core issue with a better example. You don't need to waste your time going through all this OP gibberish.

I am working on a function with a for-loop and have broken it down into steps, as this is my first ever for-loop.

The first section of code below, shown at the very bottom, generates a data frame called nCode and it is fine and produces no errors (leave the for-loop at i in 1:1 !!, just run the code without changes).

But when I run this second bit of code simulating the beginning of the 2nd loop run on the nCode data frame, it outputs fine but I get the error message "Problem with mutate() column concat_2. i concat_2 = ifelse(...). i NAs introduced by coercion". I can't see what's wrong with the ifelse(), it looks legit to me. Here's that second bit of code (to run after the first section of code is run):

i = 2
reSeq_prior <- str_c("reSeq_",i-1)
concat_col <- str_c("concat_",i)

nCode <- if(i==1){
  nCode %>% mutate(!! concat_col:= as.numeric(paste0(seqBase,".",grpRnk)))} else {
    nCode %>% mutate(!! concat_col:= ifelse(
      !!rlang::sym(reSeq_prior)%%1 > 0, 
      !!rlang::sym(reSeq_prior),
      as.numeric(paste0(!!rlang::sym(reSeq_prior),".",grpRnk))
      )
    )
  }
nCode

Here's the good output I get when running these two sections of code (I am resisting my urge to use suppressWarnings(), I'd rather understand the problem):

> nCode
# A tibble: 15 x 10
   Name  Group nmCnt seqBase grpRnk concat_1 alloc_1 merge_1 reSeq_1 concat_2
   <chr> <dbl> <int>   <int>  <dbl>    <dbl>   <dbl>   <dbl>   <dbl>    <dbl>
 1 R         0     1       1      0      1       1       1       1        1  
 2 R         0     2       2      0      2       2.1     2.1     2.1      2.1
 3 X         0     1       1      0      1       1       1       1        1  
 4 X         1     2       2      1      2.1     2.1     2.1     2.1      2.1
 5 X         1     3       2      2      2.2     2.2     2.2     2.2      2.2
 6 X         0     4       3      0      3      NA       3       3        3  
 7 X         0     5       4      0      4      NA       4       4        4  
 8 X         0     6       5      0      5      NA       5       5        5  
 9 B         0     1       1      0      1       1       1       1        1  
10 R         0     3       3      0      3       2.2     2.2     2.2      2.2
11 R         2     4       4      1      4.1    NA       4       3        3.1
12 R         2     5       4      2      4.2    NA       4       3        3.2
13 X         3     7       6      1      6.1    NA       6       6        6.1
14 X         3     8       6      2      6.2    NA       6       6        6.2
15 X         3     9       6      3      6.3    NA       6       6        6.3

First section of code:

library(dplyr)
library(tidyr)   # provides fill(), used in the pipeline below
library(stringr)

myDF1 <-
  data.frame(
    Name = c("R","R","X","X","X","X","X","X","B","R","R","R","X","X","X"),
    Group = c(0,0,0,1,1,0,0,0,0,0,2,2,3,3,3)
  )

nCode <-  myDF1 %>%
  group_by(Name) %>%
  mutate(nmCnt = row_number()) %>%
  ungroup() %>%
  mutate(seqBase = ifelse(Group == 0 | Group != lag(Group), nmCnt,0)) %>%
  mutate(seqBase = na_if(seqBase, 0)) %>%
  group_by(Name) %>%
  fill(seqBase) %>%
  mutate(seqBase = match(seqBase, unique(seqBase))) %>%
  ungroup %>%
  mutate(grpRnk = ifelse(Group > 0, sapply(1:n(), function(x) sum(Name[1:x]==Name[x] & Group[1:x] == Group[x])),0))
  
loopCntr <- nrow(unique(myDF1[myDF1$Group!=0,]))
  
for(i in 1:1) {
  
  reSeq_prior <- str_c("reSeq_",i-1)
    
  concat_col <- str_c("concat_",i)

  nCode <- if(i==1){
    nCode %>% mutate(!! concat_col:= as.numeric(paste0(seqBase,".",grpRnk)))} else {
      nCode %>% mutate(!! concat_col:= as.numeric(paste0(!!rlang::sym(reSeq_prior),".",grpRnk)))
    }
  
    index <- filter(nCode, Group !=0) %>%
      select(all_of(concat_col)) %>%
      distinct() %>%
      mutate(truncInd = trunc(get(concat_col))) %>%
      group_by(truncInd) %>%
      mutate(cumGrp = cur_group_id()) %>%
      ungroup() %>%
      select(-truncInd,cumGrp,concat_col)
  
  # below inserts a 1 in 1st row of index if lowest element count group is >= 2  
    index <- if(ifelse(loopCntr > 0, min(index[[concat_col]]), Inf) >= 2){
      tmp <- data.frame(cumGrp = 1, concat = 1)
      names(tmp)[2] <- concat_col
      rbind(tmp,index)
    }else{index}
    
    nCode <- nCode %>%
      mutate(alloc = index[[concat_col]][index$cumGrp==1][nmCnt]) %>%
      mutate(merge = ifelse(is.na(alloc),seqBase,alloc)) %>%
      group_by(Name) %>%
      mutate(reSeq = match(trunc(merge), unique(trunc(merge)))) %>%
      mutate(reSeq = (reSeq + round(merge%%1 * 10,0)/10)) %>%
      ungroup()
    
    nCode <- nCode %>%
      rename_with(~ str_c(.x, "_", i), c("alloc", "merge", "reSeq"))
   
  }

Solution

Both branches are evaluated in ifelse(). The warnings are generated from NAs in the no branch, even though the final result will include the value from the yes branch.

Here’s a simplified example:

a <- c(1.1, 2.1, 1)
b <- c(0, 2, 1)

ifelse(a != trunc(a), a, as.numeric(paste0(a, ".", b)))
#> Warning in ifelse(a != trunc(a), a, as.numeric(paste0(a, ".", b))): NAs
#> introduced by coercion
#> [1] 1.1 2.1 1.1

Which is essentially equivalent to:

y <- a
n <- as.numeric(paste0(a, ".", b))
#> Warning: NAs introduced by coercion

ifelse(a != trunc(a), y, n)
#> [1] 1.1 2.1 1.1

To avoid the warning, write code that won’t generate warnings in either branch:

ifelse(a != trunc(a), a, a + b / 10)
#> [1] 1.1 2.1 1.1
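
To see why the no branch produces NAs in the first place, it can help to look at the intermediate strings. The following is a small sketch using the same a and b as above; the names no_strings, no_values and test are only illustrative and do not appear in the original answer.

```r
a <- c(1.1, 2.1, 1)
b <- c(0, 2, 1)

# Strings built for the "no" branch, for every element, before any test is applied
no_strings <- paste0(a, ".", b)
no_strings
#> [1] "1.1.0" "2.1.2" "1.1"

# A string with two decimal points is not a valid number, so it becomes NA.
# suppressWarnings() is used here only to keep the demonstration quiet.
no_values <- suppressWarnings(as.numeric(no_strings))
is.na(no_values)
#> [1]  TRUE  TRUE FALSE

# ifelse() takes from no_values only where the test is FALSE (the third element),
# so the NAs never reach the result even though they triggered the warning
test <- a != trunc(a)
ifelse(test, a, no_values)
#> [1] 1.1 2.1 1.1
```

The NAs sit exactly at the positions where the yes branch is chosen, which is why the output looks correct while the warning still appears.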

Answered By - Mikko Marttila

Answer Checked By - Mildred Charles (PHPFixing Admin)

[FIXED] How to filter across any columns for na and empty value but not 0

October 28, 2022 filter, is-empty, na, r

Issue

Here's my dataframe:

df1 = structure(list(item = c("HY04SB", "GSP8Y1", "8OK8N6", "V2RIP7", 
"51H9V8", "", "5C45YN", "PM271I", "4WVDD9"), weird = c("", "", 
"", "v1b+m#|1", "f%nfw+j<", "[3-qzg76", "13k{-ftr", "", "sywf|*l!"
), simple = c(14661746L, NA, 88171210L, NA, 0L, 35586016L, NA, 
0L, 23761616L), code = c("WX&}Awx:65Dgn9A3", "0", "7jcP!&EAJFT=4=Xv", 
"}7p92w~STX>2M5TP", "", "EvEH+hV=}6X,aS'Q", "", "r*C'U9LA\"tr$p_X;", 
"0")), class = "data.frame", row.names = c(NA, -9L))

> df1 
    item    weird   simple             code
1 HY04SB          14661746 WX&}Awx:65Dgn9A3
2 GSP8Y1                NA                0
3 8OK8N6          88171210 7jcP!&EAJFT=4=Xv
4 V2RIP7 v1b+m#|1       NA }7p92w~STX>2M5TP
5 51H9V8 f%nfw+j<        0                 
6        [3-qzg76 35586016 EvEH+hV=}6X,aS'Q
7 5C45YN 13k{-ftr       NA                 
8 PM271I                 0 r*C'U9LA"tr$p_X;
9 4WVDD9 sywf|*l! 23761616                0

I'm not sure how to make the syntax below work so that it retrieves any rows that contain an empty value or NA but not 0:

df1_na_empty_but_not_zero <- df1 %>% 
          filter(if_any(item: code %in% c(~ is.na(.), ~ !.x == "0", ~.x == "")))

Expected output:

item    weird   simple             code
1 HY04SB          14661746 WX&}Awx:65Dgn9A3
3 8OK8N6          88171210 7jcP!&EAJFT=4=Xv
4 V2RIP7 v1b+m#|1       NA }7p92w~STX>2M5TP
6        [3-qzg76 35586016 EvEH+hV=}6X,aS'Q
7 5C45YN 13k{-ftr       NA

Can someone help please? Thanks.

Solution

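You need to set 2 conditions: if_any() keeps rows where at least one column is empty or NA, and if_all() drops rows where any column equals 0: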

library(dplyr)

df1 %>%
  filter(if_any(item:code, ~ .x == "" | is.na(.x)),
         if_all(item:code, ~ .x != 0  | is.na(.x)))

#     item    weird   simple             code
# 1 HY04SB          14661746 WX&}Awx:65Dgn9A3
# 2 8OK8N6          88171210 7jcP!&EAJFT=4=Xv
# 3 V2RIP7 v1b+m#|1       NA }7p92w~STX>2M5TP
# 4        [3-qzg76 35586016 EvEH+hV=}6X,aS'Q
# 5 5C45YN 13k{-ftr       NA
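
If the if_any()/if_all() pair is hard to read, the same two conditions can be spelled out with base R logical matrices. This is only a sketch on a small made-up data frame; toy, is_blank, is_zero and keep are hypothetical names that are not part of the answer, and the dplyr code above remains the actual solution.

```r
# Made-up data: each row exercises a different combination of "", NA and 0
toy <- data.frame(
  item   = c("A", "B", "", "D", "E"),
  simple = c(1L, NA, 5L, 0L, NA),
  code   = c("x", "0", "y", "", "z"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a cell is an empty string or NA
is_blank <- sapply(toy, function(col) is.na(col) | col == "")

# Logical matrix: TRUE where a cell is a non-NA zero (numeric 0 or the string "0")
is_zero <- sapply(toy, function(col) !is.na(col) & col == 0)

# Keep rows with at least one blank cell and no zero cell
keep <- rowSums(is_blank) > 0 & rowSums(is_zero) == 0
which(keep)
#> [1] 3 5

toy[keep, ]
```

Rows 2 and 4 contain a blank cell but also a zero, so they are dropped; rows 3 and 5 are the only ones that satisfy both conditions.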

Answered By - Darren Tsai

Answer Checked By - Robin (PHPFixing Admin)

[FIXED] How do I replace "NA" with "missing" when using CSV.read in Julia?

August 28, 2022 csv, julia, missing-data, na

Issue

I have a csv file with a few NAs sprinkled in. Due to their presence, the columns containing the NAs are classified as strings rather than floats.

I just want to read the csv file with NAs in a way that Julia recognizes "NA" as a missing value rather than a string "NA." I tried the solution in this post; however, I get the following error:

ERROR: MethodError: no method matching CSV.File(::string; null="NA")

Any ideas on how to remedy this problem? Thank you.

Solution

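Use the missingstring="NA" keyword argument as described in the CSV.jl documentation. The MethodError in the question indicates that null is not a keyword that CSV.File accepts in the installed version.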

Answered By - Bogumił Kamiński

Answer Checked By - Terry (PHPFixing Volunteer)

[FIXED] How do I make my custom function return an error message if one of the vector elements has NA or is not an integer in R?

April 28, 2022 function, integer, na, r, warnings

Issue

What I want to do is a function where x is a vector, and y any integer. If y is inside the vector x, then it should return "TRUE". Also, if the vector contains NAs or decimals, then it should print an error message.

So far I have created this, but if I input search(c(9,8,3,NA),3) it gives me this message:

Warning message:
In if (x%%1 != 0 | anyNA(x) == TRUE) { :
  the condition has length > 1 and only the first element will be used

If I input a vector with a decimal in it, like search(c(8,9,7.01,12),12), it won't give an ERROR message.

This is my code so far:

search <- function(x,y){
  if (x%%1!=0 | anyNA(x)==TRUE){
    print("ERROR")
  }else{
    if(y %in% x){
      print(TRUE)
    }
    else
      print(FALSE)
  }
}

Solution

If you want your function to produce an error, use stop, not print. Any program that relies on the output of the function will otherwise keep running, without noticing anything is wrong. This could make things very hard to debug later. stop throws an error, which can then be handled appropriately. Also, because the function will exit if the condition is met, you don't need an else afterwards: that code will only ever run if the condition isn't met, so the else is redundant.

You can also simplify some of the logic. You don't need if(condition == TRUE), since if(condition) does the same thing. Finally, the construction if(condition){ print(TRUE) } else { print(FALSE) } is logically identical to print(condition).

search <- function(x, y){
  if (any(x %% 1 != 0) | anyNA(x) | length(y) != 1) stop("Error")
  y %in% x
}
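
The any() call is what addresses the warning from the question: x %% 1 != 0 is evaluated element by element and returns one logical value per element, whereas if() expects a single TRUE or FALSE. A minimal sketch of the difference, using the decimal vector from the question (is_decimal is just an illustrative name):

```r
x <- c(8, 9, 7.01, 12)

# One logical value per element of x
is_decimal <- x %% 1 != 0
is_decimal
#> [1] FALSE FALSE  TRUE FALSE

# Collapsed to the single value that if() needs
any(is_decimal)
#> [1] TRUE
```

Without any(), only the first element (FALSE here) was used, which is why the decimal 7.01 went unnoticed. From R 4.2.0 onward, a condition of length greater than one in if() is an error rather than a warning.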

Now try it on test cases:

search(c(1, 3, 5), 3)
#> [1] TRUE
search(c(1, 3, 5), 2)
#> [1] FALSE
search(c(1, 3, NA), 3)
#> Error in search(c(1, 3, NA), 3): Error
search(c(1, 3, 5.1), 3)
#> Error in search(c(1, 3, 5.1), 3): Error
search(c(1, 3, 5), c(1, 3))
#> Error in search(c(1, 3, 5), c(1, 3)): Error

Created on 2020-05-15 by the reprex package (v0.3.0)
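
As a rough illustration of what "handled appropriately" can look like, calling code can wrap the function in tryCatch(). This is a sketch only: safe_search is a hypothetical wrapper that is not part of the answer, and returning NA on failure is just one possible choice. The definition of search is repeated so the block runs on its own.

```r
# Definition repeated from the answer above
search <- function(x, y) {
  if (any(x %% 1 != 0) | anyNA(x) | length(y) != 1) stop("Error")
  y %in% x
}

# Hypothetical wrapper: report the problem and return NA instead of stopping
safe_search <- function(x, y) {
  tryCatch(
    search(x, y),
    error = function(e) {
      message("search() failed: ", conditionMessage(e))
      NA
    }
  )
}

safe_search(c(1, 3, 5), 3)
#> [1] TRUE

safe_search(c(1, 3, NA), 3)
#> search() failed: Error
#> [1] NA
```

With print("ERROR") there would be nothing for tryCatch() to catch, which is the practical argument for stop() made in the answer.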

Answered By - Allan Cameron

Answer Checked By - Senaida (PHPFixing Volunteer)
