Issue
I have two dataframes data1 and data2 which have information like below:
dput(data1)
structure(list(ProfName = c("Hua (Christine) Xin", "Dereck Barr-Pulliam",
"Lisa M. Blum", "Russell Williamson", "William D. Stout", "Michael F. Wade",
"Sheila A. Johnston", "Julie Huang", "Alan Attaway", "Alan Levitan",
"Benjamin P. Foster", "Carolyn M. Callahan"), Title = c(" PhD",
" PhD", " LLM", " PhD", " PhD", " CPA", " MS", " PhD", " PhD",
" PhD", " PhD", " PhD"), Profession = c("Assistant Professor",
"Assistant Professor", "Instructor", "Assistant Professor", "Associate Professor and Director",
"Instructor", "Instructor", "Associate Professor", "Professor",
"Professor", "Professor", "Brown-Forman Professor of Accountancy"
)), row.names = c(8L, 18L, 25L, 36L, 49L, 50L, 56L, 69L, 71L,
82L, 88L, 89L), class = "data.frame")
It looks like below:
dput(data2)
structure(list(ProfName = c("Blandford, K ", "Okafor, A ",
"Johnston, S ", "Rolen, R ", "Attaway, A ", "Xin, H ",
"Huang, Y ", "Stout, W ", "Williamson, R ", "Callahan, C ",
"Foster, B ", "Blum, L ", "Levitan, A ", "Barr-Pulliam, D ",
"Wade, M ")), row.names = c(NA, -15L), class = "data.frame")
data2 looks like below:
I wanted to merge two dataframes, but the names look different. Only a specific string is matching between two dataframes with column ProfName. The data should be merged and if the names don't have any information it should be empty. If they don't have any information in the columns Title and Profession, both ProfName and New columns should have the same name.
I tried using merge, but it didn't give the desired output.
merge(data1, data2, by="ProfName", all.x=TRUE, all.y = TRUE)
The output should look like below:
Solution
Here's a simple solution:
library(stringr)
library(dplyr)
library(tidyr)
library(magrittr)
data1 %<>% mutate(lname = str_extract(ProfName, "[A-Za-z\\-]+$"))
data2 %<>% mutate(lname = str_extract(ProfName, "^[A-Za-z\\-]+"))
df <- merge(data1, data2, all.y = TRUE, by = "lname")
head(df)
# lname ProfName.x Title Profession # ProfName.y
# 1 Attaway Alan Attaway PhD Professor Attaway, A
# 2 Barr-Pulliam Dereck Barr-Pulliam PhD Assistant Professor Barr-Pulliam, D
# 3 Blandford <NA> <NA> <NA> Blandford, K
# 4 Blum Lisa M. Blum LLM Instructor Blum, L
# 5 Callahan Carolyn M. Callahan PhD Brown-Forman Professor of Accountancy Callahan, C
# 6 Foster Benjamin P. Foster PhD Professor Foster, B
Answered By - Dunois Answer Checked By - Pedro (PHPFixing Volunteer)



0 Comments:
Post a Comment
Note: Only a member of this blog may post a comment.