PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Sunday, July 24, 2022

[FIXED] How to convert a JSON file to an SQLite database

 July 24, 2022     json, sqlite, sqlite-json1     No comments   

Issue

If I have some sample data, how do I put it into SQLite (preferably fully automated)?

{"uri":"/","user_agent":"example1"}
{"uri":"/foobar","user_agent":"example1"}
{"uri":"/","user_agent":"example2"}
{"uri":"/foobar","user_agent":"example3"}

Solution

I found the easiest way to do this is by using jq and CSV as an intermediary format.

Edit: As pointed out (thanks @Leo), the original question did show newline delimited JSON objects, which each on their own conform to rfc4627, but not all together in that format. jq can handle a single JSON array of objects much the same way though by preprocessing the file using jq '.[]' <input.json >preprocessed.json. If you happen to be dealing with JSON text sequences (rfc7464) luckily jq has got your back too with the --seq parameter.

Edit 2: Both the newline separated JSON and the JSON text sequences have one important advantage; they reduce memory requirements down to O(1), meaning your total memory requirement is only dependent on your longest line of input, whereas putting the entire input in a single array requires that either your parser can handle late errors (i.e. after the first 100k elements there's a syntax error), which generally isn't the case to my knowledge, or it will have to parse the entire file twice (first validating syntax, then parsing, in the process discarding previous elements, as is the case with jq --stream) which also happens rarely to my knowledge, or it will try to parse the whole input at once and return the result in one step (think of receiving a Python dict which contains the entirety of your say 50G input data plus overhead) which is usually memory backed, hence raising your memory footprint by just about your total data size.

Edit 3: If you hit any obstacles, try using keys_unsorted instead of keys. I haven't tested that myself (I kind of assume my columns were already sorted), however @Kyle Barron reports that this was needed.

Getting the CSV

First write your data to a file. I will assume data.json here.

Then construct the header using jq:

% head -1 data.json | jq -r 'keys | @csv'
"uri","user_agent"

The head -1 is because we only want one line. jq's -r makes the output a plain string instead of a JSON-String wrapping the CSV. We then call the internal function keys to get the keys of the input as an array. This we send to the @csv formatter which outputs us a single string with the headers in quoted CSV format.

We then need to construct the data.

% jq -r '[.[]] | @csv' < data.json
"/","example1"
"/foobar","example1"
"/","example2"
"/foobar","example3"

We now take the whole input and deconstruct the associative array (map) using .[] and then put it back into a simple array […]. This basically converts our dictionary to an array of keys. Sent to the @csv formatter, we again get some CSV.

Putting it all together we get a single one-liner in the form of:

% (head -1 data.json | jq -r 'keys | @csv' && jq -r '[.[]] | @csv' < data.json) > data.csv

If you need to convert the data on the fly, i.e. without a file, try this:

% cat data.json | (read -r first && jq -r '(keys | @csv),( [.[]] | @csv)' <<<"${first}" && jq -r '[.[]] | @csv')

Loading it into SQLite

Open an SQLite database:

sqlite3 somedb.sqlite

Now in the interactive shell do the following (assuming you wrote the CSV to data.csv and want it in a table called my_table):

.mode csv
.import data.csv my_table

Now close the shell and open it again for a clean environment. You can now easily SELECT from the database and do whatever you want to.

Putting it all together

Have an asciinema recording right there:

asciicast



Answered By - benaryorg
Answer Checked By - Marie Seifert (PHPFixing Admin)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing