jq Joy: Converting JSON to CSV
As mentioned in the previous blog posts from MrHaki jq
is a powerful tool to work with JSON from the command-line.
The power of jq
can also be leveraged to extract data from a JSON structure and export it to CSV.
To illustrate this, we use the public Open Library Api found in this extensive overview of Public APIs.
First we’ll download an overview of some books with Java as subject and store this the file java-books.json
curl -o java-books.json https://openlibrary.org/subjects/java.json?details=true
The file contains a lot of information about the books as illustrated by the redacted snippet below.
...
{
...
"works": [
{
"title": "Java software solutions",
"edition_count": 26,
"authors": [
{
"key": "/authors/OL32216A",
"name": "Lewis, John"
},
{
"key": "/authors/OL281404A",
"name": "John E. Lewis Ph. D."
},
{
"key": "/authors/OL2670541A",
"name": "William Loftus"
}
],
"first_publish_year": 1997,
"availability": {
"available_to_browse": true,
"isbn": "0321322037"
}
}
]
...
}
...
We’d like to extract the title, ISBN, edition count, publish year and authors data and export it as CSV.
To do this, we will flatten the structure of the JSON file by selecting the list of books and then specifying the fields from the selected items.
cat java-books.json | jq '.works[] | [ .title, .availability.isbn, .edition_count, .first_publish_year, .authors[0].name, .authors[1].name, .authors[2].name]'
The final step is to tell jq
to export the results as CSV using the @csv
filter. We also have to tell jq
that we want the raw results by passing the -r
flag and direct the output to result.csv
.
This results in the following command:
cat java-books.json | jq -r '.works[] | [ .title, .availability.isbn, .edition_count, .first_publish_year, .authors[0].name, .authors[1].name, .authors[2].name] | @csv' > result.csv
If you’d like to play around with jq
and its filters online or share examples, have a look at jq play.
A version related to this blog post can be found here.
Written with jq 1.7.