paulgalow.com

How to work with JSON API data in macOS shell scripts without external dependencies

November 24, 2021

tl;dr

You might want to try out this shell snippet to retrieve a value from HTTP API JSON responses.

⚠️ Please note the use of JSON.parse() to avoid potential remote code execution:

#!/bin/sh

getJsonValue() {
  # $1: JSON string to parse, $2: JSON key to look up
  osascript -l 'JavaScript' -e "JSON.parse(\`$1\`).$2;"
}

data=$(curl -sS '<http-api-url>')

myValue=$(getJsonValue "$data" "<json-key>")

echo "$myValue"

As we are progressing into a world of Cloud-based services connected by APIs, I’m finding more and more situations where I need to work with JSON data. One use case is using data from HTTP APIs for managing macOS clients. However, working with JSON data in shell scripts is usually cumbersome without using a third-party dependency like jq.

Don’t get me wrong, jq is a great tool but it is an external dependency that needs to be deployed and managed if I am going to use it in a script for a fleet of Macs. And since the pre-installed scripting languages such as Python and Ruby seem to be on their way out of the default macOS installation, I’m not inclined to rely on those tools being present moving forward.

Our goal: A shell snippet to easily and safely retrieve JSON values

And this snippet should not rely on external dependencies. This is where Apple’s JavaScript for Automation (JXA) comes into play. I recommend checking out Matthew Warren’s blog post on JSON parsing without relying on third-party dependencies as well as Armin Briegel’s The unexpected return of JavaScript for Automation for more information about JXA.

At the most basic level, all we are going to do is parse JSON data and return the key’s value we are interested in. Since a shell script is our main execution context, we are handing off as much control as possible to our shell environment by passing and expanding shell variables into a JavaScript expression. We can then evaluate this expression using osascript -l 'JavaScript'.

So a minimal implementation would look something like this:

# ⚠️ Warning: This code snippet is not safe to use!
osascript -l 'JavaScript' -e "($1.$2);"

Here, all we do is implicitly return a key’s value (defined in $2) of whatever variable has been passed into the JavaScript execution context by our shell runtime ($1). We don’t even need to use the return keyword or JXA’s special run() function. JavaScript is flexible! Great. But there is a catch …

👾 Remote code execution vulnerability

JavaScript comes with a global built-in object called JSON which has a method called JSON.parse() that is designed to parse JSON strings.

So why would we need to use that if we can pass in JSON strings directly into our JavaScript execution context? One answer to that is security.

Turns out by not using JSON.parse() we are enabling a potential backdoor for a remote code execution attack, because without it, any JavaScript code injected will be executed, not just JSON. Let me illustrate with a simple example:

Suppose we are calling an external API to get a user’s full name:

# ⚠️ Warning: This code snippet is not safe to use!

getJsonValue() {
  # $1: JSON string to process, $2: Desired JSON key
  osascript -l 'JavaScript' -e "($1.$2);"
}

data=$(curl -sS https://jsonplaceholder.typicode.com/users/1)
fullName=$(getJsonValue "$data" name)

echo "$fullName" # Returns: 'Leanne Graham'

⚔️ The attack

So far, so good. Now, let’s simulate a scenario in which an attacker has managed to intercept and replace our JSON response data with some carefully crafted JavaScript code:

# ⚠️ Warning: This code snippet is not safe to use!

getJsonValue() {
  # $1: JSON string to process, $2: Desired JSON key
  osascript -l 'JavaScript' -e "($1.$2);"
}

json=$(curl -sS https://jsonplaceholder.typicode.com/users/1)

# String sent by a rogue HTTP endpoint
json='{
 get name() {
   const app = Application.currentApplication();
   app.includeStandardAdditions = true;
   app.displayAlert("👾🔥🙈");
 }
}'

fullName=$(getJsonValue "$json" name)
echo "$fullName" # Custom JavaScript code is being executed instead 👾

In this scenario, our attacker would have the opportunity to execute any arbitrary JavaScript code supported by JXA. Here, all they are doing is displaying an alert, but you can easily imagine more nefarious purposes.

I have recorded a short video for illustration purposes.

🛡️ Remediation: Validating our JSON input

So how can we protect against this threat? That’s where JSON.parse() comes into play. Using it, we make sure to parse valid JSON only. If the incoming string is not JSON, this method call will throw an error.

But there is more: Since the value of our first expanded shell variable $1 will likely be a multi-line string (which is very common when working with JSON), we need to make sure we can handle those. JavaScript provides a special syntax using backticks (`) to create template literals. Using those, we can expand multi-line strings into our JXA execution context without it throwing an error.

But aren’t backticks special characters in shell scripts? Indeed. So we need to escape them (using backslashes) to ensure our shell runtime does not evaluate them as shell expressions.

Let’s try our attack again, but this time we will be using JSON.parse() to validate our input:

getJsonValue() {
  # $1: JSON string to process, $2: Desired JSON key
  osascript -l 'JavaScript' -e "($1.$2);"
  osascript -l 'JavaScript' -e "JSON.parse(\`$1\`).$2;"
}

# json=$(curl -sS https://jsonplaceholder.typicode.com/users/1)

# String sent by a rogue API endpoint
json='{
  get name() {
    const app = Application.currentApplication();
    app.includeStandardAdditions = true;
    app.displayAlert("👾🔥🙈");
  }
}'

fullName=$(getJsonValue "$json" name)
echo "$fullName" # Returns a JSON Parse error

We have successfully thwarted the attack. Here is the error returned:

execution error: Error: SyntaxError: JSON Parse error: Expected '}' (-2700)

Working with real-world HTTP JSON responses

Our example earlier was using a simple JSON object to illustrate the basic principle. However, in the real world, we typically encounter JSON responses that are more nested and sometimes convoluted. Let’s use Apple’s iTunes Search API as an example.

Suppose we are looking for the release date of Steely Dan’s excellent album Pretzel Logic. Using Apple’s API, we might use the following HTTP call:

curl 'https://itunes.apple.com/search?term=steely+dan+pretzel+logic&entity=album'

And the API would respond with something like this (if nicely formatted):

{
  "resultCount": 1,
  "results": [
    {
      "wrapperType": "collection",
      "collectionType": "Album",
      "artistId": 59606,
      "collectionId": 1440489152,
      "amgArtistId": 5525,
      "artistName": "Steely Dan",
      "collectionName": "Pretzel Logic",
      "collectionCensoredName": "Pretzel Logic",
      "artistViewUrl": "https://music.apple.com/us/artist/steely-dan/59606?uo=4",
      "collectionViewUrl": "https://music.apple.com/us/album/pretzel-logic/1440489152?uo=4",
      "artworkUrl60": "https://is3-ssl.mzstatic.com/image/thumb/Music115/v4/cf/3f/c7/cf3fc739-3b30-2502-5963-200116007cd8/source/60x60bb.jpg",
      "artworkUrl100": "https://is3-ssl.mzstatic.com/image/thumb/Music115/v4/cf/3f/c7/cf3fc739-3b30-2502-5963-200116007cd8/source/100x100bb.jpg",
      "collectionPrice": 8.99,
      "collectionExplicitness": "notExplicit",
      "trackCount": 11,
      "copyright": "℗ 1974 UMG Recordings, Inc.",
      "country": "USA",
      "currency": "USD",
      "releaseDate": "1974-01-01T08:00:00Z",
      "primaryGenreName": "Rock"
    }
  ]
}

As we can see, the JSON response contains a results key which has an array for its value, that itself contains a single object. That object then has our releaseDate key that leads us to the value we are looking for.

So how can we get the value for releaseDate? Let’s have a look:

getJsonValue() {
  # $1: JSON string to process, $2: Desired JSON key
  osascript -l 'JavaScript' -e "JSON.parse(\`$1\`).$2;"
}

data=$(curl -sS 'https://itunes.apple.com/search?term=steely+dan+pretzel+logic&entity=album')

releaseDate=$(getJsonValue "$data" "results[0].releaseDate")
echo "$releaseDate" # Returns: '1974-01-01T08:00:00Z'

Here, we are targeting the results key’s first and only object entry by using JavaScript’s bracket notation [0]. Using dot notation, we can then drill down into this object to retrieve the value for its releaseDate key.

Bonus: JSON parsing using jsc

Craig Hockenberry recently wrote about a built-in WebKit command-line tool that is part of the native JavaScript framework used by Safari and other WebKit consumers. Let’s see if we can use this tool for our purposes as well. Here’s our shell function implemented using jsc.

getJsonValue() {
  # $1: JSON string to process, $2: Desired JSON key
  # Will return 'undefined' if key cannot be found
  /System/Library/Frameworks/JavaScriptCore.framework/Versions/Current/Helpers/jsc \
  -e "print(JSON.parse(\`$1\`).$2);"
}

data=$(curl -sS 'https://itunes.apple.com/search?term=steely+dan+pretzel+logic&entity=album')

releaseDate=$(getJsonValue "$data" "results[0].releaseDate")
echo "$releaseDate" # Returns: '1974-01-01T08:00:00Z'

Turns out we can! Be aware that jsc will return undefined if a key cannot be found.

Caveat: I would hesitate to rely on this tool in production scripts since we are using a command-line tool somewhat hidden inside an Apple System Framework. Knowing Apple, APIs like that might change unexpectedly. So keep that in mind!

Have I missed anything? If so, please let me know!


Paul Galow

Hi, I'm Paul Galow, and I'm an engineer based in Berlin working in DevOps, macOS administration and video engineering.
GitHubTwitterLinkedIn