PHPFixing

Tuesday, November 22, 2022

[FIXED] How can I remove rows on more conditions in R?


Issue

I have session IDs, client IDs, a conversion column, and a date for each session. For every client, I want to delete the rows that come after that client's last purchase. My data looks as follows:

SessionId       ClientId        Conversion         Date
    1               1                0             05-01
    2               1                0             06-01
    3               1                0             07-01
    4               1                1             08-01
    5               1                0             09-01
    6               2                0             05-01 
    7               2                1             06-01
    8               2                0             07-01
    9               2                1             08-01
    10              2                0             09-01

As output I want:

SessionId       ClientId        Conversion         Date
    1               1                0             05-01
    2               1                0             06-01
    3               1                0             07-01
    4               1                1             08-01
    6               2                0             05-01 
    7               2                1             06-01
    8               2                0             07-01
    9               2                1             08-01

It looks quite easy, but there are some conditions. For each client ID, the sessions after that customer's last purchase need to be deleted. I have many observations, and the last purchase falls on a different date for each client, so deleting everything after one fixed date is not possible. The check for when a purchase happened has to be done per client ID.

I have no clue what kind of function I need for this. Maybe some kind of loop?

Hopefully someone can help me with this.


Solution

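If your data is already ordered by Date within each ClientId, you can keep, for each ClientId, every row up to and including the last conversion. Note that for a client with no conversion at all, max(which(...)) returns -Inf with a warning, so all of that client's rows are dropped.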

This can be done in base R:

# Within each ClientId, keep rows whose position is at or before the last Conversion == 1
subset(df, ave(Conversion == 1, ClientId, FUN = function(x) seq_along(x) <= max(which(x))))

Using dplyr:

library(dplyr)
# row_number() counts rows within each ClientId group
df %>% group_by(ClientId) %>% filter(row_number() <= max(which(Conversion == 1)))

Or data.table:

library(data.table)
# setDT() converts df to a data.table by reference; .N is the number of rows in each group
setDT(df)[, .SD[seq_len(.N) <= max(which(Conversion == 1))], ClientId]
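
To try the idea end to end, here is a minimal base R sketch on the data from the question. Several things in it are assumptions rather than part of the original answer: the dates are read as day-month in a made-up year (2022) purely so the rows can be sorted, a hypothetical client 3 with no purchase is added to show the edge case, and the helper name keep_until_last_purchase is invented for illustration. The sketch drops clients who never purchased; if you would rather keep them, return rep(TRUE, length(conv)) in that branch.

```r
# Example data reconstructed from the question.
df <- data.frame(
  SessionId  = 1:10,
  ClientId   = rep(c(1, 2), each = 5),
  Conversion = c(0, 0, 0, 1, 0, 0, 1, 0, 1, 0),
  Date       = rep(c("05-01", "06-01", "07-01", "08-01", "09-01"), times = 2)
)

# Hypothetical extra client with no purchase at all, to exercise the edge case.
df <- rbind(df, data.frame(SessionId = 11:12, ClientId = 3,
                           Conversion = 0, Date = c("05-01", "06-01")))

# Assumption: Date is "dd-mm" within one year; 2022 is an arbitrary placeholder.
df$DateParsed <- as.Date(paste0("2022-", df$Date), format = "%Y-%d-%m")

# Make sure rows are in chronological order within each client.
df <- df[order(df$ClientId, df$DateParsed), ]

# For one client's Conversion values: TRUE up to and including the last
# purchase, and all FALSE (without a warning) when there is no purchase.
keep_until_last_purchase <- function(conv) {
  hits <- which(conv == 1)
  if (length(hits) == 0) return(rep(FALSE, length(conv)))
  seq_along(conv) <= max(hits)
}

# ave() applies the helper per ClientId and returns the flags in row order.
keep <- as.logical(ave(df$Conversion, df$ClientId, FUN = keep_until_last_purchase))
result <- df[keep, c("SessionId", "ClientId", "Conversion", "Date")]
print(result)
```

With this data, result contains sessions 1 to 4 for client 1 and sessions 6 to 9 for client 2, and nothing for the hypothetical client 3.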


Answered By - Ronak Shah
Answer Checked By - Gilberto Lyons (PHPFixing Admin)