Friday, October 7, 2022

[FIXED] How to count sequential repeats of a value (allowing for one skip)

Issue

I am trying to produce a variable that counts how many times a "1" has appeared sequentially in the preceding rows of a different variable. However, I need the count to persist even if one row is missing a 1 (e.g., 10111011 should register as 8).

The following code shows how I currently count sequential 1s:

input <- c(1,0,1,1,0,1,1,0,1,0,1)
dfseq <- data.frame(input)
dfseq$seq <- sequence(rle(as.character(dfseq$input))$lengths)

which produces the following dataframe:

data_struc <- structure(
  list(
    input = c(1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1),
    seq = c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L)
  ),
  row.names = c(NA, -11L),
  class = "data.frame"
)
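
To see why this counter restarts at every 0, it can help to look at the intermediate run-length encoding. The following is a minimal base R sketch using the same input vector; the names `runs` and `seq_plain` are illustrative and not part of the original post.

```r
input <- c(1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1)

# rle() collapses the vector into runs of identical values
runs <- rle(input)
runs$values   # 1 0 1 0 1 0 1 0 1
runs$lengths  # 1 1 2 1 2 1 1 1 1

# sequence() expands each run length n into 1..n, so the counter
# restarts whenever the value changes, including at a single 0
seq_plain <- sequence(runs$lengths)
seq_plain     # 1 1 1 2 1 1 2 1 1 1 1
```

Because every 0 in this input starts a new run, the longest count that can appear is 2, which is why a plain `rle()` approach cannot tolerate a single skipped row.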

However, I want the sequence to allow for one row of "failure", so that it keeps counting consecutive 1s even if a single row contains a 0 before the 1s resume. It should only stop counting once two 0s appear consecutively.


Solution

I'd use a lagged variable with an OR condition:

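library(dplyr)
dfseq %>% mutate(
  # count a row if it is a 1, or if it is a single 0 with a 1 on both sides;
  # default = 0 in lag() avoids an NA on the first row when the data starts with 0
  cum_result = cumsum(input == 1 | (lag(input, default = 0) == 1 & lead(input, default = 1) == 1))
)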
#    input seq cum_result
# 1      1   1          1
# 2      0   1          2
# 3      1   1          3
# 4      1   2          4
# 5      0   1          5
# 6      1   1          6
# 7      1   2          7
# 8      0   1          8
# 9      1   1          9
# 10     0   1         10
# 11     1   1         11
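
Note that `cumsum()` only ever increases: after two consecutive 0s the count pauses and then carries on from where it left off. If the count should instead restart after such a break (one possible reading of the question, not something the answer states), a base R sketch could look like the following. The function name `count_with_one_skip` and its helper variables are hypothetical, and the end-of-vector handling mirrors the `lead(..., default = 1)` choice above. It assumes a non-empty vector of 0s and 1s with no missing values.

```r
# Hypothetical helper: counts streaks of 1s, tolerating a single 0
# between 1s, and restarting after two or more consecutive 0s.
count_with_one_skip <- function(x) {
  n <- length(x)
  prev_x <- c(0, x[-n])  # previous value; "before the start" treated as 0
  next_x <- c(x[-1], 1)  # next value; "after the end" treated as 1

  # a row is counted if it is a 1, or a lone 0 with a 1 on both sides
  counted <- x == 1 | (prev_x == 1 & next_x == 1)

  # every uncounted row opens a new group, so the streak restarts after it
  streak_id <- cumsum(!counted)

  # running count within each group; uncounted rows contribute 0
  ave(as.integer(counted), streak_id, FUN = cumsum)
}

count_with_one_skip(c(1, 0, 1, 1, 0, 0, 1, 1, 0, 1))
# 1 2 3 4 0 0 1 2 3 4
```

For the example input in the question, where no two 0s are adjacent, this gives the same 1 to 11 result as the `cumsum()` version.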


Answered By - Gregor Thomas
Answer Checked By - Senaida (PHPFixing Volunteer)
