Issue

The basic idea is to add rows containing 0 or NA for the variables that have no value on a given day.

The function should align the two date variables, should also work for the price, and should add NA wherever a variable has no value.

This is my dataset. I want to add 0 or NA, for example, in the periods where a value is missing in the variable futures and in the variable Date.

|   | DATA       | Date       | futures |
|---|------------|------------|---------|
| 1 | 2021-12-23 | 2021-12-23 | 1388.17 |
| 2 | 2021-12-22 | 2021-12-22 | 1432.36 |
| 3 | 2021-12-21 | 2021-12-21 | 1508.98 |
| 4 | 2021-12-20 | 2021-12-20 | 1493.13 |
| 5 | 2021-12-19 | 2021-12-17 | 1379.97 |
| 6 | 2021-12-18 | 2021-12-16 | 1597.91 |

The function should work more or less like this:

I thought about using a for loop, but I was not able to write it.

Thanks a lot for your help.

Solution

I created some arbitrary data and randomly removed rows. This is edited based on the changes you made to your question. You didn't specifically state this, but I assume that if the first date field is NA, you wanted to keep the price.

library(tidyverse)

daten <- seq(as.Date("2020/10/23"), as.Date("2021/10/12"), "days")
price <- round(runif(350, 1000, 1500), digits = 2)

# both date columns are sorted newest first, as in the question
dff <- data.frame(dateOne = sort(sample(daten, size = 325, replace = FALSE), 
                                 decreasing = TRUE),
                  dateTwo = sort(sample(daten, size = 325, replace = FALSE), 
                                 decreasing = TRUE),
                  futures = sample(price, size = 325, replace = TRUE))

This is based on the assumption that the dates are in order.
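
That assumption can be checked before calling the function. A minimal sketch in base R, where `is_ordered` is a hypothetical helper name that is not part of the original answer:

```r
# Hypothetical helper (an assumption, not part of the original answer):
# TRUE when the non-missing dates never change direction,
# i.e. they are entirely non-decreasing or entirely non-increasing.
is_ordered <- function(x) {
  steps <- diff(as.numeric(x[!is.na(x)]))
  all(steps >= 0) || all(steps <= 0)
}

newest_first <- as.Date(c("2021-12-23", "2021-12-22", "2021-12-19"))
shuffled     <- as.Date(c("2021-12-22", "2021-12-23", "2021-12-19"))

is_ordered(newest_first)
is_ordered(shuffled)
```

For the simulated data, `is_ordered(dff$dateOne)` and `is_ordered(dff$dateTwo)` should both hold, because each column was sorted when the data frame was built. The function below relies on that order.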

ordering <- function(d, dT, df1){ # names of the two date fields and the data frame
  
  # get the index of the first date column
  tellMe <- which(colnames(df1) == d)
  
  # create ranking (to return original sorting)
  df1$rank <- 1:nrow(df1)
  
  # separate and sort date columns (local assignment, nothing leaks to the global env)
                # the first date; the rank
  dc  <- df1[, c(tellMe, ncol(df1))] %>% arrange(.data[[d]])

                # everything *except the first date field & rank
  dTc <- df1[, -c(tellMe, ncol(df1)), drop = FALSE] %>% arrange(.data[[dT]])

  # identify the index of the date in dTc
  tellMe2 <- which(colnames(dTc) == dT)

  # find differences
         # missing in the first date field
             # the index of the date is in tellMe2
  dfd2 <- dTc[!dTc[, tellMe2] %in% dc[, 1], tellMe2] 

         # missing in the second date field
  dfd  <- dc[!dc[, 1] %in% dTc[, tellMe2], 1] # since date is in column 1
  
  # find indices of where the NA's need to be placed
  # (findInterval is vectorised; dates are compared as day counts)
  dcInt  <- findInterval(as.numeric(dfd2), as.numeric(dc[, 1]))
  dTcInt <- findInterval(as.numeric(dfd), as.numeric(dTc[, tellMe2]))

  # build up with differences as NA
  # preceding index provided, offset by index number - 1
  # seq_along() skips the loop when nothing is missing
  for(i in seq_along(dcInt)){
    keep <- dcInt[[i]] + i - 1                       # rows before the new NA row
    dc <- rbind(dc[seq_len(keep), ],                 # everything before
                rep(NA, times = ncol(dc)),
                dc[seq_len(nrow(dc) - keep) + keep, ], # everything after (may be empty)
                make.row.names = FALSE)
  }
  
  # preceding index provided, offset by index number - 1
  for(j in seq_along(dTcInt)){
    keep <- dTcInt[[j]] + j - 1                        # rows before the new NA row
    dTc <- rbind(dTc[seq_len(keep), ],                 # everything before
                 rep(NA, times = ncol(dTc)), 
                 dTc[seq_len(nrow(dTc) - keep) + keep, ], # everything after (may be empty)
                 make.row.names = FALSE)
  } 
  
  # reassemble the data, in the original column order (rank is the last column)
  df2 <- cbind(dc, dTc) %>%
    select(all_of(colnames(df1)))
  
  # check row order
  # the rows are now in ascending date order; if the surviving ranks run from
  # high to low, the input was sorted descending and must be reversed
  ranks <- df2$rank[!is.na(df2$rank)]
  if(ranks[1] > ranks[length(ranks)]){
    # add a new ranking variable
    df2$rank2 <- 1:nrow(df2)
    # reverse the new ranking variable and delete both ranking variables
    df2 <- arrange(df2, desc(rank2)) %>% select(-rank, -rank2)
  } else {
    # delete the ranking variable; they are already in the right order
    df2 <- select(df2, -rank)
  }
  return(df2)
}
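
The step that decides where each NA row goes relies on `findInterval`, which reports, for each missing date, how many of the existing sorted dates are less than or equal to it. A small base R illustration on made-up dates (the names `present`, `absent`, and `rows` are assumptions for this sketch only):

```r
# Dates already in one column, sorted ascending (findInterval requires this)
present <- as.Date(c("2021-12-16", "2021-12-17", "2021-12-20", "2021-12-21"))
# Dates that only occur in the other column, also ascending
absent  <- as.Date(c("2021-12-18", "2021-12-19"))

# Number of existing dates at or before each absent date
idx <- findInterval(as.numeric(absent), as.numeric(present))

# Insert one NA row per absent date; every earlier insertion shifts
# the following rows down by one, hence the "+ i - 1" offset
rows <- data.frame(date = present)
for (i in seq_along(idx)) {
  keep <- idx[i] + i - 1
  rows <- rbind(rows[seq_len(keep), , drop = FALSE],
                data.frame(date = as.Date(NA)),
                rows[seq_len(nrow(rows) - keep) + keep, , drop = FALSE],
                make.row.names = FALSE)
}
rows
```

Both absent dates fall after the second existing date, so the two NA rows land next to each other between 2021-12-17 and 2021-12-20.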

Now you can use this function with the data.

tryIt <- ordering("dateOne", "dateTwo", dff)
head(tryIt)

This returns a data frame. Whether the dates were sorted increasing or decreasing when passed to the function, the result comes back in the same order.

tail(tryIt, n = 15)
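
For comparison, the same alignment can be sketched as a full outer join on the date using base R's `merge`. This is not the method used in the function above, and it assumes the dates within each column are unique. The names `align_dates`, `key`, and `toy` are hypothetical and only serve this illustration:

```r
# A sketch under stated assumptions: dates are unique within each column,
# and every column other than `d` belongs with the second date `dT`.
align_dates <- function(df, d, dT) {
  first <- data.frame(key = df[[d]])        # join key for the first date
  first[[d]] <- df[[d]]
  second <- df[, setdiff(names(df), d), drop = FALSE]
  second$key <- second[[dT]]                # join key for the second date
  # all = TRUE keeps dates found in only one column; the other side becomes NA
  out <- merge(first, second, by = "key", all = TRUE)
  out <- out[order(out$key, decreasing = TRUE), ]   # newest first
  out$key <- NULL
  rownames(out) <- NULL
  out[, names(df), drop = FALSE]            # restore the original column order
}

toy <- data.frame(
  dateOne = as.Date(c("2021-12-23", "2021-12-22", "2021-12-19")),
  dateTwo = as.Date(c("2021-12-23", "2021-12-22", "2021-12-17")),
  futures = c(1388.17, 1432.36, 1379.97)
)
aligned <- align_dates(toy, "dateOne", "dateTwo")
aligned
```

In this toy case 2021-12-19 exists only in `dateOne` and 2021-12-17 only in `dateTwo`, so each gets its own row with NA on the other side, and the price stays with its `dateTwo` value.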

Answered By - Kat

Answer Checked By - Timothy Miller (PHPFixing Admin)

Saturday, October 8, 2022

[FIXED] How can I put zero or NA in a dataframe having 2 type of dates using R?
