Command line JSON with jq

Riley Tomasek

jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text.

I regret not learning how to use jq earlier. If you frequently deal with JSON, it will save you a lot of time.

This post is both a practical introduction for you, and a reminder for future me. The jq syntax can be terse.

Loading Data

  • From a file: jq '.' data.json
  • From a string: echo '{"person": { "name": "Riley" }}' | jq '.'
  • From curl: curl -s https://api.github.com/repos/rileytomasek/zodix/commits | jq '.'

Pretty Printing JSON

The jq '.' syntax will pretty print the JSON.

echo '{"person": { "name": "Riley" }}' | jq '.'
❯ {
❯   "person": {
❯     "name": "Riley"
❯   }
❯ } 

This is handy for saving API responses to a file in a readable format:

echo '{"person": { "name": "Riley" }}' | jq '.' > data.json

Accessing Properties

Getting property values is straightforward:

echo '{"person": { "name": "Riley" }}' | jq '.person'
❯ {
❯   "name": "Riley"
❯ }

echo '{"person": { "name": "Riley" }}' | jq '.person.name'
❯ "Riley"
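When the value is going into a shell variable or another command, the JSON quotes usually get in the way. Here is a small sketch using jq's -r (raw output) flag, plus what happens when a key is missing. The variable names and the .person.age key are made up for illustration.

```shell
# -r (--raw-output) prints strings without the surrounding JSON quotes
name=$(echo '{"person": { "name": "Riley" }}' | jq -r '.person.name')
echo "$name"
# prints: Riley

# Asking for a key that does not exist yields null rather than an error
missing=$(echo '{"person": { "name": "Riley" }}' | jq '.person.age')
echo "$missing"
# prints: null
```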

Arrays

This is the part I struggled with the most, and it is likely the most important for real-world use cases.

echo '[1,2,3]' | jq '.[]'
❯ 1
❯ 2
❯ 3

echo '{"people": ["Bill", "Bob"]}' | jq '.people[]'
❯ "Bill"
❯ "Bob"

I like to think of the .[] notation as turning the array into a list: jq emits each element as a separate value, and any filter that follows runs once per element. Once you have a list, you can select values, filter, and map.
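A quick way to see the difference is to run a filter with and without .[] in front of it. This is only an illustrative sketch; the variable names are mine.

```shell
# Without .[], the filter sees the whole array once
whole=$(echo '[1,2,3]' | jq 'length')
echo "$whole"
# prints: 3

# With .[], the filter after the pipe runs once per element
doubled=$(echo '[1,2,3]' | jq '.[] | . * 2')
echo "$doubled"
# prints 2, 4 and 6, one per line
```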

Assume we're working with this JSON:

{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]  
}

Select the name of each city:

jq '.locations[].city' data.json
❯ "New York"
❯ "Miami"
❯ "Vancouver"

Select the name of each city in the USA:

jq '.locations[] | select(.country=="USA").city' data.json
❯ "New York"
❯ "Miami"
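The condition inside select can be any boolean expression, so you can combine tests with and / or, or negate them with !=. A minimal sketch, with the sample JSON inlined in a shell variable (my own naming) so it runs without data.json:

```shell
locations='{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]
}'

# Both conditions must hold for a location to pass through select
florida=$(printf '%s' "$locations" |
  jq -r '.locations[] | select(.country == "USA" and .state == "Florida") | .city')
echo "$florida"
# prints: Miami

# != keeps everything that does not match
not_usa=$(printf '%s' "$locations" |
  jq -r '.locations[] | select(.country != "USA") | .city')
echo "$not_usa"
# prints: Vancouver
```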

Select a city by index:

jq '.locations[0]' data.json
❯ {
❯   "city": "New York",
❯   "state": "New York",
❯   "country": "USA"
❯ }

Select a slice of cities:

jq '.locations[0:2]' data.json
❯ [
❯   {
❯     "city": "New York",
❯     "state": "New York",
❯     "country": "USA"
❯   },
❯   {
❯     "city": "Miami",
❯     "state": "Florida",
❯     "country": "USA"
❯   }
❯ ]

It's important to note that the slice above returned an array, not a list, so you can't do this:

jq '.locations[0:2].city' data.json
❯ jq: error (at data.json:7): Cannot index array with string "city"

But it's easy to fix. Just use .[] to turn the array into a list:

jq '.locations[0:2] | .[].city' data.json
❯ "New York"
❯ "Miami"

To collect the list back into a single JSON array, wrap .[].city in square brackets, like [.[].city].

jq '.locations[0:2] | [.[].city]' data.json
❯ [
❯   "New York",
❯   "Miami"
❯ ]
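jq also has a map builtin that does the unwrap-and-rewrap in one step: map(.city) is equivalent to [.[].city]. A small sketch that checks the two forms against each other, assuming the same sample JSON (inlined here in a variable I named locations):

```shell
locations='{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]
}'

# map(f) applies f to every element and returns a new array;
# -c prints the result compactly on one line
cities=$(printf '%s' "$locations" | jq -c '.locations[0:2] | map(.city)')
echo "$cities"
# prints: ["New York","Miami"]

# Compare the two spellings inside jq itself
same=$(printf '%s' "$locations" |
  jq '(.locations[0:2] | map(.city)) == (.locations[0:2] | [.[].city])')
echo "$same"
# prints: true
```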

At this point, you could save the data you selected as a valid JSON file.

jq '.locations[0:2] | [.[].city]' data.json > cities.json

Finding Unique Values

To get a list of the unique countries:

  • Get a list of the country names .locations[].country
  • Turn the list into an array [.locations[].country]
  • Call | unique on the array

jq '[.locations[].country] | unique' data.json
❯ [
❯   "Canada",
❯   "USA"
❯ ]

You could take it one step further and get the number of unique countries like this:

jq '[.locations[].country] | unique | length' data.json
❯ 2

Transforming Values

This is the real magic of jq for me. Let's take the location JSON and apply a few realistic transformations.

{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]  
}

Use string concatenation to add a full name for each city:

jq '[.locations[]] | map(. + {
  name: (.city + ", " + .state + ", " + .country)
})' data.json
❯ [
❯   {
❯     "name": "New York, New York, USA",
❯     "country": "USA",
❯     "city": "New York",
❯     "state": "New York"
❯   },
❯   {
❯     "name": "Miami, Florida, USA",
❯     "country": "USA",
❯     "city": "Miami",
❯     "state": "Florida"
❯   },
❯   {
❯     "name": "Vancouver, BC, Canada",
❯     "country": "Canada",
❯     "city": "Vancouver",
❯     "state": "BC"
❯   }
❯ ]
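As an alternative to chaining + for concatenation, jq supports string interpolation with \(...) inside a string literal. The sketch below builds the same name field that way; the sample JSON is inlined in a variable of my own naming so the snippet is self-contained.

```shell
locations='{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]
}'

# \(expr) inside a jq string is replaced with the value of expr
named=$(printf '%s' "$locations" |
  jq '.locations | map(. + { name: "\(.city), \(.state), \(.country)" })')

# Pull out the first generated name to check the result
first=$(printf '%s' "$named" | jq -r '.[0].name')
echo "$first"
# prints: New York, New York, USA
```

Interpolation also converts non-string values such as numbers to text, whereas + between a string and a number is an error.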

Basic Analysis

Using jq can be the quickest way to run basic analysis on JSON:

jq '.locations | {
  count: length,
  numCountries: ([.[].country] | unique | length),
  countries: ([.[].country] | unique)
}' data.json
❯ {
❯   "count": 3,
❯   "numCountries": 2,
❯   "countries": [
❯     "Canada",
❯     "USA"
❯   ]
❯ }
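A natural next step is counting how many locations fall in each country. One way to sketch that is with the group_by builtin, which sorts the array by a key and splits it into one sub-array per distinct value. The output shape (country and count fields) is my own choice, not something from the data.

```shell
locations='{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]
}'

# group_by(.country) yields one sub-array per country;
# inside map, . is a single group, so length is that group's size
per_country=$(printf '%s' "$locations" |
  jq -c '.locations | group_by(.country) | map({ country: .[0].country, count: length })')
echo "$per_country"
# prints: [{"country":"Canada","count":1},{"country":"USA","count":2}]
```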

Working with JSONL

jq can read and write JSON Lines (.jsonl), which is commonly used in machine learning.

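jq reads JSONL natively, treating each line as a separate input. To collect all the rows into a single array, use the -s (slurp) flag. The examples use printf because not every shell's echo expands \n:

printf '{"count": 1}\n{"count": 2}\n' | jq -s '.'
❯ [
❯   {
❯     "count": 1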
❯   },
❯   {
❯     "count": 2
❯   }
❯ ]

To output JSONL, use the -c flag, which prints each result compactly on a single line:

printf '{"count": 1}\n{"count": 2}\n' | \
  jq -c '{ "count": (.count * 2) }'
❯ {"count":2}
❯ {"count":4}
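The same two flags cover converting between a regular JSON array and JSONL. A rough sketch, using the locations sample inlined in a variable (names are mine): -c with .[] turns an array into one object per line, filters run per line without slurping, and -s goes back to an array.

```shell
locations='{
  "locations": [
    { "city": "New York", "state": "New York", "country": "USA" },
    { "city": "Miami", "state": "Florida", "country": "USA" },
    { "city": "Vancouver", "state": "BC", "country": "Canada" }
  ]
}'

# Array -> JSONL: emit each element on its own line
jsonl=$(printf '%s' "$locations" | jq -c '.locations[]')
printf '%s\n' "$jsonl"

# Filters run once per JSONL row, so no -s is needed to select rows
canadian=$(printf '%s\n' "$jsonl" | jq -r 'select(.country == "Canada") | .city')
echo "$canadian"
# prints: Vancouver

# JSONL -> array: slurp the rows back together and count them
count=$(printf '%s\n' "$jsonl" | jq -s 'length')
echo "$count"
# prints: 3
```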

Conclusion

While somewhat confusing to learn and remember, jq is a powerful tool for quick JSON analysis and transformations from the command line.