Issue
I have a .txt file that uses commas to separate fields and pipes ("|") as text delimiters (quote characters). I would like to read this file in R (though I could use another program if this is impossible in R) so that all values end up in the right columns.
A sample of data:
15,|0370A01D-DC1E-4534-8176-A08A1E2F82E4|,|EDU|,|Education|,|Appropriations and authorization regarding higher education issues.|,|2008|
16,|03A8F7BB-9716-4494-BF41-013C27B5ECA6|,|GOV|,|Government Issues|,|issues affecting local government including appropriations|,|2003|
17,|04696109-082B-4EF6-9AA8-A6DB1013D15D|,|TEC|,|Telecommunications|,|RUS Broadband Applikcation|,|2008|
18,|04FA0BA7-E9D2-4F1E-8193-45F023065C89|,|DOC|,|District of Columbia|,|HUD Appropriations FY2009, CDBG
Financial Services Appropriations FY2009, District of Columbia
Commerce, Justice, Science Appropriations, Juvenile Justice, Byrne Grant|,|2008|
19,|04FA0BA7-E9D2-4F1E-8193-45F023065C89|,|HOU|,|Housing|,|HUD Appropriations FY2009, CDBG
Financial Services Appropriations FY2009, District of Columbia
Commerce, Justice, Science Appropriations, Juvenile Justice, Byrne Grant|,|2008|
So each row contains a row number (15, 16, ..., 19), a |uniqueID|, an |IssueID| of three letters, a longer version of |Issue|, a |SpecificIssue|, and a |Year|.
The closest I have come to reading this file is the following code (I know that treating the pipe as the separator is incorrect, but it gives the best result so far):
lob_issues2 <- fread("file.txt", sep = "|", fill = TRUE)
This results in the following table.
As you can see, the SpecificIssue column in rows 18 and 19 is causing trouble. Perhaps these values are too long, or something similar, and this makes R put parts of them into new columns. I would like R to keep these values in the SpecificIssue column. Any suggestions on what code to use to achieve that?
Thanks in advance. Also, if you think another program is better for this, please let me know.
Solution
Use the quote= argument to let it know that | is being used as the quote character:
lob_issues2 <- read.table("file.txt", quote = "|", sep = ",")
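As a rough, self-contained check of the line above, the sketch below writes two of the sample records from the question (row 15 and the multi-line row 18) to a temporary file and reads them back. The temporary file and the col.names values are assumptions made for illustration; the names simply follow the column description given in the question.

```r
# Two records copied from the question; record 18 spans three physical lines
# because its SpecificIssue value contains line breaks inside the pipes.
sample_lines <- c(
  "15,|0370A01D-DC1E-4534-8176-A08A1E2F82E4|,|EDU|,|Education|,|Appropriations and authorization regarding higher education issues.|,|2008|",
  "18,|04FA0BA7-E9D2-4F1E-8193-45F023065C89|,|DOC|,|District of Columbia|,|HUD Appropriations FY2009, CDBG",
  "Financial Services Appropriations FY2009, District of Columbia",
  "Commerce, Justice, Science Appropriations, Juvenile Justice, Byrne Grant|,|2008|"
)

# Hypothetical stand-in for "file.txt".
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

lob_issues2 <- read.table(
  tmp,
  sep = ",",      # commas separate the fields
  quote = "|",    # pipes wrap text values, so commas and newlines inside them are kept
  col.names = c("Row", "uniqueID", "IssueID", "Issue", "SpecificIssue", "Year"),  # assumed names
  stringsAsFactors = FALSE
)

# Inspect the structure: each record should occupy one row with six columns.
str(lob_issues2)
```

Note that read.table also treats # as a comment character by default, so if the free-text column could contain #, passing comment.char = "" as well is probably worth trying.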
Answered By - G. Grothendieck Answer Checked By - Cary Denson (PHPFixing Admin)