
Friday, May 13, 2022

[FIXED] How to append a column to data.frames in a list where the column shall contain computationally read out structural information of those data.frames?

Tags: append, dataframe, lapply, r

Issue

I have some data that I read in and modify in R. For a minimal, reproducible example (reprex), I show the data the way R prints it, which also conveys its structure.

The code to read in the data:

paths <- sprintf("filenames%02d.out", 1:26)
interim <- lapply(paths, read.table, header=FALSE, sep="\t", dec=".", na.strings="NA")
new_col_name <- c("Pos", "LRTD")
out <- lapply(interim, setNames, nm = new_col_name)

Now, lapply(out, head) prints the first rows of each data.frame in the list:

[[1]]
     Pos LRTD
1      0    0
2  70557    0
3 104076    0
4 163349    0
5 258229    0
6 356613    0

[[2]]
     Pos LRTD
1      0    0
2 171603    0
3 268756    0
4 456513    0
5 594904    0
6 663581    0

[[3]]
     Pos  LRTD
1      0 0.000
2 171960 0.370
3 217096 0.358
4 254484 0.338
5 320866 0.366
6 432642 0.382

{...}

[[26]]
     Pos LRTD
1      0    0
2 185161    0
3 234971    0
4 273218    0
5 319689    0
6 379800    0

So it is a list of 26 data.frames. I will call the numbers shown above in double square brackets, that is [[1]], [[2]], [[3]] and so on up to [[26]], the "element descriptors".

What I would like to do is append a third column to every data.frame in the list, where the column holds structural information read from the list itself. Specifically, each data.frame should get its own element descriptor as a constant column. With that in mind, the result should look like this:

[[1]]
     Pos LRTD   Chr
1      0    0   1
2  70557    0   1

[[2]]
     Pos LRTD   Chr
1      0    0   2
2 171603    0   2

[[3]]
     Pos  LRTD   Chr
1      0 0.000   3
2 171960 0.370   3

{...}

[[26]]
     Pos LRTD   Chr
1      0    0   26
2 185161    0   26

Building on a related question, my current attempt is still pseudocode:

lapply(out, function(x) { x$Chr <- rep("element descriptor", "length of data.frame"); return(x) })

I know that I can get the length of the respective data.frame with rapply(out, length), but so far I have not managed to get rapply to work inside the lapply call above.

Also, how can I reference the element descriptor in code?


Solution

Map works well for this.

Map(function(x, ind) transform(x, Chr = ind), out, seq_along(out))
# [[1]]
#      Pos LRTD Chr
# 1      0    0   1
# 2  70557    0   1
# 3 104076    0   1
# 4 163349    0   1
# 5 258229    0   1
# 6 356613    0   1
# [[2]]
#      Pos LRTD Chr
# 1      0    0   2
# 2 171603    0   2
# 3 268756    0   2
# 4 456513    0   2
# 5 594904    0   2
# 6 663581    0   2
# [[3]]
#      Pos  LRTD Chr
# 1      0 0.000   3
# 2 171960 0.370   3
# 3 217096 0.358   3
# 4 254484 0.338   3
# 5 320866 0.366   3
# 6 432642 0.382   3
# [[4]]
#      Pos LRTD Chr
# 1      0    0   4
# 2 185161    0   4
# 3 234971    0   4
# 4 273218    0   4
# 5 319689    0   4
# 6 379800    0   4

If your "element descriptors" are really names, then replace seq_along(out) with names(out):

Map(function(x, ind) transform(x, Chr = ind), out, names(out))

and it will do effectively the same thing.
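
As a minimal sketch of the names-based variant: the sample data further down is an unnamed list, so the names "chrA" and "chrB" below are hypothetical and only there for illustration. The two small data.frames reuse values from the question.

```r
# Small stand-in for the question's list (two elements, two rows each).
out <- list(
  data.frame(Pos = c(0L, 70557L), LRTD = c(0, 0)),
  data.frame(Pos = c(0L, 171603L), LRTD = c(0, 0))
)

# Hypothetical element names; the list in the question is unnamed.
names(out) <- c("chrA", "chrB")

# Each data.frame receives its own list name as a constant column.
res <- Map(function(x, ind) transform(x, Chr = ind), out, names(out))

names(res)     # Map keeps the names of its first list argument
res$chrB$Chr   # the name "chrB", repeated once per row
```

On an unnamed list names(out) is NULL, so this variant only makes sense once the list actually carries names.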

If you're comfortable with lapply and want to know how this compares, the closest lapply counterpart of the names-based Map would be the following (unlike Map, it returns an unnamed list):

lapply(names(out), function(nm) transform(out[[nm]], Chr = nm))
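
For an unnamed list like the sample data, the same idea can be sketched with lapply over positions rather than names. This is an illustration, not part of the original answer; it also shows that the rep() from the question's pseudocode is unnecessary, because a length-one value is recycled to every row on assignment.

```r
# Small stand-in for the question's list (values taken from the sample data).
out <- list(
  data.frame(Pos = c(0L, 70557L), LRTD = c(0, 0)),
  data.frame(Pos = c(0L, 171960L), LRTD = c(0, 0.37))
)

# Iterate over positions so each data.frame can see its own index.
res <- lapply(seq_along(out), function(i) {
  x <- out[[i]]
  x$Chr <- i   # a single value is recycled to all rows, no rep() needed
  x
})

res[[2]]$Chr   # 2 2
```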

You can even code-golf it a bit, with

Map(transform, out, Chr = seq_along(out))
Map(transform, out, Chr = names(out))

(both give output identical to the corresponding calls above). This works because Map iterates over all of its ... arguments in parallel and passes each one on to the f= (function) argument, transform in this case; named arguments such as Chr= are passed on under that name.
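
The argument pass-through can be seen in isolation with a toy function. The snippet below is only a sketch on made-up inputs; it then compares the short and the long form on a two-element list.

```r
# Unnamed arguments are matched to the function by position, named ones by
# name; Map walks through all of them in parallel.
diffs <- Map(function(x, y) x - y, 10:12, y = 1:3)
unlist(diffs)   # 9 9 9

# Short and long form of the call on a tiny made-up list.
out <- list(data.frame(Pos = 0L), data.frame(Pos = 5L))
short <- Map(transform, out, Chr = seq_along(out))
long  <- Map(function(x, ind) transform(x, Chr = ind), out, seq_along(out))
all.equal(short, long)
```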


Data:

out <- list(structure(list(Pos = c(0L, 70557L, 104076L, 163349L, 258229L, 356613L), LRTD = c(0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")), structure(list(Pos = c(0L, 171603L, 268756L, 456513L, 594904L, 663581L), LRTD = c(0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")), structure(list(Pos = c(0L, 171960L, 217096L, 254484L, 320866L, 432642L), LRTD = c(0, 0.37, 0.358, 0.338, 0.366, 0.382)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")), structure(list(Pos = c(0L, 185161L, 234971L, 273218L, 319689L, 379800L), LRTD = c(0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")))


Answered By - r2evans
Answer Checked By - Timothy Miller (PHPFixing Admin)