Issue
I have some data that I read in and modify in R. For a minimal reproducible example (reprex), I want to provide the data as an "in R" representation so that the data structure is communicated as well.
The code to read in the data:
paths <- sprintf("filenames%02d.out", 1:26)
interim <- lapply(paths, read.table, header=FALSE, sep="\t", dec=".", na.strings="NA")
new_col_name <- c("Pos", "LRTD")
out <- lapply(interim, setNames, nm = new_col_name)
Now, lapply(out, head) shows the first rows of each data frame as R prints them:
[[1]]
Pos LRTD
1 0 0
2 70557 0
3 104076 0
4 163349 0
5 258229 0
6 356613 0
[[2]]
Pos LRTD
1 0 0
2 171603 0
3 268756 0
4 456513 0
5 594904 0
6 663581 0
[[3]]
Pos LRTD
1 0 0.000
2 171960 0.370
3 217096 0.358
4 254484 0.338
5 320866 0.366
6 432642 0.382
{...}
[[26]]
Pos LRTD
1 0 0
2 185161 0
3 234971 0
4 273218 0
5 319689 0
6 379800 0
So it is a list of 26 data.frames. Here, I want to call the numbers shown above in double square brackets, i.e. [[1]], [[2]], [[3]] and so forth up to [[26]], "element descriptors".
Now what I would like to do is append a third column to every data.frame in the list, where the column contains structural information read out programmatically from the list.
In detail, I would like to add each data.frame's element descriptor to that data.frame. With that in mind, the result should look like this:
[[1]]
Pos LRTD Chr
1 0 0 1
2 70557 0 1
[[2]]
Pos LRTD Chr
1 0 0 2
2 171603 0 2
[[3]]
Pos LRTD Chr
1 0 0.000 3
2 171960 0.370 3
{...}
[[26]]
Pos LRTD Chr
1 0 0 26
2 185161 0 26
I am aware of this question, so my current attempt is the following pseudocode:
lapply(out, function(x) { x$Chr <- rep("element descriptor", "length of list"); return(x) })
I know that I can get the length of the respective data.frame with rapply(out, length), but so far I have not been able to get rapply to work within the lapply command above.
Also, how do I reference the element descriptor in code?
Solution
Map works well for this.
Map(function(x, ind) transform(x, Chr = ind), out, seq_along(out))
# [[1]]
# Pos LRTD Chr
# 1 0 0 1
# 2 70557 0 1
# 3 104076 0 1
# 4 163349 0 1
# 5 258229 0 1
# 6 356613 0 1
# [[2]]
# Pos LRTD Chr
# 1 0 0 2
# 2 171603 0 2
# 3 268756 0 2
# 4 456513 0 2
# 5 594904 0 2
# 6 663581 0 2
# [[3]]
# Pos LRTD Chr
# 1 0 0.000 3
# 2 171960 0.370 3
# 3 217096 0.358 3
# 4 254484 0.338 3
# 5 320866 0.366 3
# 6 432642 0.382 3
# [[4]]
# Pos LRTD Chr
# 1 0 0 4
# 2 185161 0 4
# 3 234971 0 4
# 4 273218 0 4
# 5 319689 0 4
# 6 379800 0 4
If your "element descriptors" are really names, then replace that with
Map(function(x, ind) transform(x, Chr = ind), out, names(out))
and it will do effectively the same thing.
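As a minimal sketch of the named case, assume the list elements carry names. The names chr1 and chr2 and the two small data frames below are made up for illustration and are not part of the original data:

```r
# Hypothetical named list of data frames (names and values are illustrative only)
out_named <- list(
  chr1 = data.frame(Pos = c(0L, 70557L), LRTD = c(0, 0)),
  chr2 = data.frame(Pos = c(0L, 171603L), LRTD = c(0, 0))
)

# Pair each data frame with its own name and add that name as a column
res_named <- Map(function(x, nm) transform(x, Chr = nm),
                 out_named, names(out_named))

res_named$chr2$Chr  # the name is recycled to every row of that data frame
names(res_named)    # Map() keeps the names of the first list
```

Note that Chr then holds text rather than integers (in R versions before 4.0.0 it would likely be a factor, since data.frame() converted strings by default).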
If you're comfortable with lapply and want to know how this compares, the equivalent lapply for that Map would be:
lapply(names(out), function(nm) transform(out[[nm]], Chr = nm))
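For an unnamed list, names(out) is NULL, so lapply(names(out), ...) would return an empty list. A sketch of an index-based variant that loops over seq_along(out) instead; the two small data frames are stand-ins for the real data, and rep(i, nrow(x)) fills in the "length" part of the pseudocode in the question:

```r
# Small unnamed stand-in for the list of data frames
out <- list(
  data.frame(Pos = c(0L, 70557L), LRTD = c(0, 0)),
  data.frame(Pos = c(0L, 171603L), LRTD = c(0, 0))
)

# Iterate over positions; i is the "element descriptor" of out[[i]]
res_lapply <- lapply(seq_along(out), function(i) {
  x <- out[[i]]
  x$Chr <- rep(i, nrow(x))  # one copy of the index per row; x$Chr <- i would recycle as well
  x
})

res_lapply[[2]]
```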
You can even code-golf it a bit, with
Map(transform, out, Chr = seq_along(out))
Map(transform, out, Chr = names(out))
(each gives the same output as its longer form above). This works because named arguments given to Map are passed through to the function supplied as f=, transform in this case.
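To illustrate the pass-through, here is a small sketch with a toy function (scale_by and its argument times are made up for this example). Map pairs up the elements of every vector it is given, named or not, and hands the named ones to the function under that name:

```r
# Toy function, hypothetical, only to show how Map() forwards named arguments
scale_by <- function(x, times) x * times

# 'times' is iterated over just like the unnamed argument:
# call 1 gets times = 10, call 2 gets times = 100, call 3 gets times = 1000
scaled <- Map(scale_by, 1:3, times = c(10, 100, 1000))
unlist(scaled)  # 10 200 3000

# The same mechanism with transform(): each data frame gets its own Chr value
out <- list(
  data.frame(Pos = c(0L, 70557L), LRTD = c(0, 0)),
  data.frame(Pos = c(0L, 171603L), LRTD = c(0, 0))
)
res <- Map(transform, out, Chr = seq_along(out))
res[[2]]$Chr
```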
Data:
out <- list(structure(list(Pos = c(0L, 70557L, 104076L, 163349L, 258229L, 356613L), LRTD = c(0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")), structure(list(Pos = c(0L, 171603L, 268756L, 456513L, 594904L, 663581L), LRTD = c(0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")), structure(list(Pos = c(0L, 171960L, 217096L, 254484L, 320866L, 432642L), LRTD = c(0, 0.37, 0.358, 0.338, 0.366, 0.382)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")), structure(list(Pos = c(0L, 185161L, 234971L, 273218L, 319689L, 379800L), LRTD = c(0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6")))
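The structure(list(...), class = "data.frame", ...) form above is the kind of output dput() produces for a data frame, which is one way to get the "in R" representation asked about in the question. A minimal sketch, using a small made-up data frame:

```r
# Small made-up data frame standing in for one list element
df <- data.frame(Pos = c(0L, 70557L), LRTD = c(0, 0.37))

# dput() prints R code that recreates the object; that text can be pasted into a question
dput(df)

# Round trip: capture the printed code as text, then parse and evaluate it
txt <- paste(capture.output(dput(df)), collapse = "\n")
df2 <- eval(parse(text = txt))
all.equal(df, df2)
```

For the list in the question, dput(lapply(out, head)) would print the first rows of every element in the same way.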
Answered By - r2evans Answer Checked By - Timothy Miller (PHPFixing Admin)