Friday, August 26, 2022

[FIXED] how to parse a text file that contains the column names at the beginning of the file?

Tags: csv, r

Issue

My text file looks like the following:

"
file1
cols=
col1
col2
# this is a comment
col3

data
a,b,c
d,e,f
"

As you can see, the data only starts after the data tag, and the rows before it list the column names. The header may also contain comments, so the number of rows before the data tag varies.

How can I parse this in R, ideally with tidyverse tools? The expected output is:

# A tibble: 2 x 3
  col1  col2  col3 
  <chr> <chr> <chr>
1 a     b     c    
2 d     e     f  

Thanks!


Solution

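Here is a base R approach using scan(). With sep = "\n", each line is read as one element. strip.white = TRUE trims leading and trailing whitespace, and comment.char = "#" discards everything from a # to the end of the line. Lines left empty are then dropped, because blank.lines.skip defaults to TRUE.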

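# Read every line as one string; comments and blank lines are dropped.
text <- scan("test.txt", "", sep = "\n", strip.white = TRUE, comment.char = "#")
text
# [1] "file1" "cols=" "col1"  "col2"  "col3"  "data"  "a,b,c" "d,e,f"

# Column names sit between the "cols=" line and the "data" line.
ind1 <- which(text == "cols=")
ind2 <- which(text == "data")
# Everything after "data" is the comma-separated body.
df <- read.table(text = paste(text[-seq(ind2)], collapse = "\n"),
                 sep = ",", col.names = text[(ind1 + 1):(ind2 - 1)])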

df
#   col1 col2 col3
# 1    a    b    c
# 2    d    e    f
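
The same idea can be wrapped in a reusable function. What follows is only a sketch and not part of the original answer. The name parse_tagged_file is hypothetical, and it assumes that the file contains exactly one "cols=" line and one "data" line, that comments occupy whole lines, and that every column should be read as character (matching the <chr> columns in the expected output). It uses readLines() instead of scan() so that the filtering steps are explicit.

```r
# Hypothetical helper (not from the original answer): parse a file whose
# header lists the column names between a "cols=" line and a "data" line.
parse_tagged_file <- function(path) {
  lines <- trimws(readLines(path))
  # Drop blank lines and whole-line comments starting with "#".
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]

  cols_at <- match("cols=", lines)
  data_at <- match("data", lines)
  # Assumption: both tags exist and at least one column name sits between them.
  stopifnot(!is.na(cols_at), !is.na(data_at), data_at > cols_at + 1)

  col_names <- lines[(cols_at + 1):(data_at - 1)]
  body <- lines[-seq_len(data_at)]

  # colClasses = "character" keeps values such as "T" or "1" as strings.
  read.table(text = body, sep = ",", col.names = col_names,
             colClasses = "character", strip.white = TRUE)
}

# Demo on a temporary copy of the sample file from the question.
path <- tempfile(fileext = ".txt")
writeLines(c("file1", "cols=", "col1", "col2", "# this is a comment",
             "col3", "", "data", "a,b,c", "d,e,f"), path)
df <- parse_tagged_file(path)
df
```

If the tibble package is installed, tibble::as_tibble(df) converts the result into the tibble form shown in the question.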


Answered By - Darren Tsai
Answer Checked By - David Goodson (PHPFixing Volunteer)