PHPFixing

Wednesday, August 3, 2022

[FIXED] How to scrape a table created using datawrapper using rvest?

Tags: html-table, r, web-scraping

Issue

I am trying to scrape Table 1 from the following website using rvest:

https://www.kff.org/coronavirus-covid-19/issue-brief/u-s-international-covid-19-vaccine-donations-tracker/

This is the code I have written:

link <- "https://www.kff.org/coronavirus-covid-19/issue-brief/u-s-international-covid-19-vaccine-donations-tracker/"

page <- read_html(link)

page %>%   html_nodes("iframe") %>% html_attr("src") %>% .[11] %>% read_html() %>% 
  html_nodes("table.medium datawrapper-g2oKP-6idse1 svelte-1vspmnh resortable")

But I get {xml_nodeset (0)} as the result. I am struggling to work out the correct selector to pass to html_nodes() on the Datawrapper page to extract Table 1.

I would be really grateful if someone could point out the mistake I am making, or suggest a way to scrape this table.

Many thanks.


Solution

The data is present in the iframe, but it needs a little manipulation. It is easier, for me at least, to construct the CSV download URL from the iframe page and then request that CSV:

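library(rvest)
library(magrittr)
library(vroom)
library(stringr)

page <- read_html('https://www.kff.org/coronavirus-covid-19/issue-brief/u-s-international-covid-19-vaccine-donations-tracker/')

# src attribute of the iframe whose title starts with "Table 1"
iframe <- page %>% html_element('iframe[title^="Table 1"]') %>% html_attr('src')

# First run of digits between slashes in the content attribute of the first <meta> tag
id <- read_html(iframe) %>% html_element('meta') %>% html_attr('content') %>% str_match('/(\\d+)/') %>% .[, 2]

# Join the iframe URL, that number, and the file name into the CSV download URL
csv_url <- paste(iframe, id, 'dataset.csv', sep = '/')

# Download the CSV and read it into a tibble
data <- vroom(csv_url, show_col_types = FALSE)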
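
To see what the URL construction does without touching the network, here is a minimal sketch that runs the same str_match() and paste() steps on fixed strings. Both input strings are hypothetical stand-ins (the chart id is borrowed from the class name in the question, the version number 6 is invented), so treat the shapes as assumptions about what the live page returns. The sub() call is an extra precaution of mine, not part of the answer above: it strips a trailing slash so that paste() does not produce a double slash.

```r
library(stringr)

# Hypothetical values standing in for what the live pages return
iframe <- "https://datawrapper.dwcdn.net/g2oKP/"
meta_content <- "0; url=https://datawrapper.dwcdn.net/g2oKP/6/"

# Column 2 of the match matrix holds the first capture group:
# the first run of digits that sits between two slashes
id <- str_match(meta_content, "/(\\d+)/")[, 2]

# Drop any trailing slash so the joined URL has single separators
base <- sub("/+$", "", iframe)
csv_url <- paste(base, id, "dataset.csv", sep = "/")

print(id)       # "6"
print(csv_url)  # "https://datawrapper.dwcdn.net/g2oKP/6/dataset.csv"
```

If the real src attribute has no trailing slash, the sub() line is a no-op and the result matches the paste() call in the answer.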

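On the mistake in the original attempt: one likely factor, whatever else the Datawrapper page does, is the selector itself. In CSS a space is a descendant combinator, so "table.medium datawrapper-g2oKP-6idse1 ..." asks for an element named datawrapper-g2oKP-6idse1 nested inside table.medium, not for one table carrying all of those classes. The sketch below shows the difference on a small hypothetical HTML snippet built with rvest's minimal_html(); the class names are shortened for illustration.

```r
library(rvest)

# Hypothetical stand-in for a table that carries two classes
doc <- minimal_html('<table class="medium resortable"><tr><td>1</td></tr></table>')

# Space-separated: looks for a <resortable> element inside table.medium
n_spaces <- length(html_elements(doc, "table.medium resortable"))

# Dot-chained: matches a single element that has both classes
n_dots <- length(html_elements(doc, "table.medium.resortable"))

print(n_spaces)  # 0
print(n_dots)    # 1
```

Even with a well-formed selector, the table may not be present in the HTML that read_html() receives, which is why the answer goes through the CSV instead.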

Answered By - QHarr
Answer Checked By - Senaida (PHPFixing Volunteer)