PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Friday, May 13, 2022

[FIXED] How to append rows to an R data frame

 May 13, 2022     append, dataframe, merge, r, rows     No comments   

Issue

I have looked around StackOverflow, but I cannot find a solution specific to my problem, which involves appending rows to an R data frame.

I am initializing an empty 2-column data frame, as follows.

df = data.frame(x = numeric(), y = character())

Then, my goal is to iterate through a list of values and, in each iteration, append a value to the end of the list. I started with the following code.

for (i in 1:10) {
    df$x = rbind(df$x, i)
    df$y = rbind(df$y, toString(i))
}

I also attempted the functions c, append, and merge without success. Please let me know if you have any suggestions.

Update from comment: I don't presume to know how R was meant to be used, but I wanted to ignore the additional line of code that would be required to update the indices on every iteration and I cannot easily preallocate the size of the data frame because I don't know how many rows it will ultimately take. Remember that the above is merely a toy example meant to be reproducible. Either way, thanks for your suggestion!


Solution

Update

Not knowing what you are trying to do, I'll share one more suggestion: Preallocate vectors of the type you want for each column, insert values into those vectors, and then, at the end, create your data.frame.

Continuing with Julian's f3 (a preallocated data.frame) as the fastest option so far, defined as:

# pre-allocate space
f3 <- function(n){
  df <- data.frame(x = numeric(n), y = character(n), stringsAsFactors = FALSE)
  for(i in 1:n){
    df$x[i] <- i
    df$y[i] <- toString(i)
  }
  df
}

Here's a similar approach, but one where the data.frame is created as the last step.

# Use preallocated vectors
f4 <- function(n) {
  x <- numeric(n)
  y <- character(n)
  for (i in 1:n) {
    x[i] <- i
    y[i] <- i
  }
  data.frame(x, y, stringsAsFactors=FALSE)
}

microbenchmark from the "microbenchmark" package will give us more comprehensive insight than system.time:

library(microbenchmark)
microbenchmark(f1(1000), f3(1000), f4(1000), times = 5)
# Unit: milliseconds
#      expr         min          lq      median         uq         max neval
#  f1(1000) 1024.539618 1029.693877 1045.972666 1055.25931 1112.769176     5
#  f3(1000)  149.417636  150.529011  150.827393  151.02230  160.637845     5
#  f4(1000)    7.872647    7.892395    7.901151    7.95077    8.049581     5

f1() (the approach below) is incredibly inefficient because of how often it calls data.frame and because growing objects that way is generally slow in R. f3() is much improved due to preallocation, but the data.frame structure itself might be part of the bottleneck here. f4() tries to bypass that bottleneck without compromising the approach you want to take.


Original answer

This is really not a good idea, but if you wanted to do it this way, I guess you can try:

for (i in 1:10) {
  df <- rbind(df, data.frame(x = i, y = toString(i)))
}

Note that in your code, there is one other problem:

  • You should use stringsAsFactors if you want the characters to not get converted to factors. Use: df = data.frame(x = numeric(), y = character(), stringsAsFactors = FALSE)


Answered By - A5C1D2H2I1M1N2O1R2T1
Answer Checked By - Marilyn (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing