Mastering AWS CLI Output Filtering with jq: Advanced Techniques
Use jq with AWS CLI JSON output to filter, reshape, and export cloud data for scripts and reports.
AWS CLI output filtering gets messy when a command returns deeply nested JSON. You may only need one instance ID, one tag value, or a short CSV report, but commands like aws ec2 describe-instances return far more data than that.
The AWS CLI has a built-in --query option that uses JMESPath, and it is often the right tool for simple lookups. jq is useful when you want richer JSON reshaping, conditional logic, CSV output, or filters you can reuse in shell scripts.
Start with JSON Output
jq reads JSON, so make the AWS CLI return JSON explicitly:
aws ec2 describe-instances --output json
You can also set JSON as your configured default with aws configure, but using --output json in examples and scripts makes the dependency obvious.
Install jq with your system package manager if it is not already available:
# Debian or Ubuntu
sudo apt update && sudo apt install jq
# Fedora or RHEL
sudo dnf install jq
# macOS with Homebrew
brew install jq
Read and Inspect AWS JSON
The identity filter, ., returns the input JSON. It is handy when you want pretty-printed output while learning the shape of a response:
aws ec2 describe-instances --output json | jq '.'
To select a top-level key, use dot notation:
aws ec2 describe-instances --output json | jq '.Reservations'
Most AWS commands wrap useful data in arrays. For EC2 instances, the structure is Reservations[] followed by Instances[], so you usually need both levels.
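To get a feel for that nesting without calling AWS, you can run jq against a small hand-written sample. The sketch below assumes a heavily trimmed, hypothetical response (the IDs and the `sample` variable are made up for illustration) and uses jq's `keys` and `length` to show what sits at each level.

```shell
# Hypothetical, heavily trimmed stand-in for `aws ec2 describe-instances --output json`.
sample='{
  "Reservations": [
    {"ReservationId": "r-1", "Instances": [
      {"InstanceId": "i-aaa", "InstanceType": "t3.micro", "State": {"Name": "running"}}
    ]},
    {"ReservationId": "r-2", "Instances": [
      {"InstanceId": "i-bbb", "InstanceType": "t3.small", "State": {"Name": "stopped"}},
      {"InstanceId": "i-ccc", "InstanceType": "t3.micro", "State": {"Name": "running"}}
    ]}
  ]
}'

# Top-level keys of the response.
top_keys=$(printf '%s' "$sample" | jq -c 'keys')

# Number of instances inside each reservation.
per_reservation=$(printf '%s' "$sample" | jq -c '[.Reservations[].Instances | length]')

# Keys available on the first instance of the first reservation.
instance_keys=$(printf '%s' "$sample" | jq -c '.Reservations[0].Instances[0] | keys')

printf '%s\n' "$top_keys" "$per_reservation" "$instance_keys"
```

A real response carries many more keys per instance, but the two-level Reservations and Instances layout is the same.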
Iterate Through Arrays
Use .[] to emit each item from an array. This command prints every EC2 instance ID across all reservations:
aws ec2 describe-instances --output json | jq '.Reservations[].Instances[].InstanceId'
By default, string output includes JSON quotes. Add -r when you need raw text for a shell loop or another command:
aws ec2 describe-instances --output json | jq -r '.Reservations[].Instances[].InstanceId'
You can also build a smaller object with only the fields you care about:
aws ec2 describe-instances --output json |
jq '.Reservations[].Instances[] | {id: .InstanceId, state: .State.Name, type: .InstanceType}'
That gives you compact JSON records instead of the full EC2 response.
Filter with select()
Use select() when you only want records that match a condition. This example lists running instance IDs:
aws ec2 describe-instances --output json |
jq -r '.Reservations[].Instances[] | select(.State.Name == "running") | .InstanceId'
String literals inside a jq filter use double quotes, such as "running". If you wrap the whole jq program in single quotes, your shell will pass those double quotes through safely.
For a more specific check, filter by instance type:
aws ec2 describe-instances --output json |
jq -r '.Reservations[].Instances[] | select(.InstanceType == "t3.micro") | .InstanceId'
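When the value to match comes from a shell variable, jq's `--arg` option is usually safer than splicing the variable into the program string. The following is a minimal sketch: `instances_in_state` is an illustrative helper name, and the sample JSON is a made-up, trimmed response.

```shell
# Hypothetical trimmed response; real output has many more fields.
sample='{"Reservations":[{"Instances":[
  {"InstanceId":"i-aaa","InstanceType":"t3.micro","State":{"Name":"running"}},
  {"InstanceId":"i-bbb","InstanceType":"t3.small","State":{"Name":"stopped"}},
  {"InstanceId":"i-ccc","InstanceType":"t3.micro","State":{"Name":"running"}}
]}]}'

# Illustrative helper (not part of the AWS CLI or jq): reads describe-instances
# JSON on stdin and prints the IDs of instances in the state given as $1.
instances_in_state() {
  # --arg binds the shell value to the jq variable $state as a string.
  jq -r --arg state "$1" \
    '.Reservations[].Instances[] | select(.State.Name == $state) | .InstanceId'
}

running=$(printf '%s' "$sample" | instances_in_state running)
stopped=$(printf '%s' "$sample" | instances_in_state stopped)

printf 'running:\n%s\n' "$running"
printf 'stopped:\n%s\n' "$stopped"
```

Against a live account the same helper would sit at the end of the pipeline, for example aws ec2 describe-instances --output json | instances_in_state running.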
Handle Optional Fields Safely
AWS resources often omit fields. Tags are a common example. If an instance has no Tags key, .Tags is null and .Tags[] stops jq with a "Cannot iterate over null" error. The optional form .Tags[]? suppresses that error and yields nothing instead:
aws ec2 describe-instances --output json |
jq '.Reservations[].Instances[] | {id: .InstanceId, name: (.Tags[]? | select(.Key == "Name") | .Value)}'
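That works for instances that have a Name tag. When the tag is missing, the inner expression produces no value, so jq emits no object at all for that instance and it silently disappears from the output. For scripts, a default is usually easier to consume: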
aws ec2 describe-instances --output json |
jq '.Reservations[].Instances[] | {
id: .InstanceId,
name: ((.Tags[]? | select(.Key == "Name") | .Value) // "unnamed")
}'
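The difference is easiest to see side by side. This sketch runs both filters over a hypothetical three-instance sample: one instance with a Name tag, one with other tags only, and one with no Tags key at all.

```shell
# Hypothetical trimmed response covering the three tag situations.
sample='{"Reservations":[{"Instances":[
  {"InstanceId":"i-aaa","Tags":[{"Key":"Name","Value":"web-1"},{"Key":"env","Value":"prod"}]},
  {"InstanceId":"i-bbb","Tags":[{"Key":"env","Value":"dev"}]},
  {"InstanceId":"i-ccc"}
]}]}'

# Without a default: instances lacking a Name tag produce no object at all.
without_default=$(printf '%s' "$sample" | jq -c \
  '.Reservations[].Instances[] | {id: .InstanceId, name: (.Tags[]? | select(.Key == "Name") | .Value)}')

# With `//`: every instance yields exactly one object.
with_default=$(printf '%s' "$sample" | jq -c \
  '.Reservations[].Instances[] | {id: .InstanceId, name: ((.Tags[]? | select(.Key == "Name") | .Value) // "unnamed")}')

printf 'without default:\n%s\n' "$without_default"
printf 'with default:\n%s\n' "$with_default"
```

The first filter prints one record and the second prints three, which is why the default matters when a later step counts or joins on instance IDs.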
Return an Array Instead of a Stream
Many jq filters emit a stream of values. If another tool expects one valid JSON array, wrap the expression in square brackets:
aws ec2 describe-instances --output json |
jq '[.Reservations[].Instances[].InstanceId]'
This is useful when you are writing a JSON file for another automation step.
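As a quick illustration of the difference, the sketch below produces both forms from a made-up two-instance sample and then feeds the array back into jq, which only works cleanly because the array is a single JSON document.

```shell
# Hypothetical trimmed response.
sample='{"Reservations":[
  {"Instances":[{"InstanceId":"i-aaa"}]},
  {"Instances":[{"InstanceId":"i-bbb"}]}
]}'

# Stream: one JSON string per line, not a single JSON document.
stream=$(printf '%s' "$sample" | jq -c '.Reservations[].Instances[].InstanceId')

# Array: one JSON document that another tool can parse in a single step.
array=$(printf '%s' "$sample" | jq -c '[.Reservations[].Instances[].InstanceId]')

# The array can be piped straight back into jq, here to count its items.
count=$(printf '%s' "$array" | jq 'length')

printf 'stream:\n%s\narray: %s\ncount: %s\n' "$stream" "$array" "$count"
```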
Export CSV or TSV
For spreadsheet-friendly reports, build an array of fields and pass it to @csv or @tsv. Use -r so jq writes raw CSV lines rather than JSON strings:
aws ec2 describe-instances --output json |
jq -r '.Reservations[].Instances[] | [.InstanceId, .InstanceType, .State.Name] | @csv'
To include a header row, emit it before the data rows:
aws ec2 describe-instances --output json |
jq -r '["instance_id","instance_type","state"], (.Reservations[].Instances[] | [.InstanceId, .InstanceType, .State.Name]) | @csv'
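The @tsv form is often the easier one to consume from a shell loop, because `read` can split on tabs. Here is a minimal sketch over a hypothetical sample. The variable names `id`, `type`, and `state` are arbitrary.

```shell
# Hypothetical trimmed response.
sample='{"Reservations":[{"Instances":[
  {"InstanceId":"i-aaa","InstanceType":"t3.micro","State":{"Name":"running"}},
  {"InstanceId":"i-bbb","InstanceType":"t3.small","State":{"Name":"stopped"}}
]}]}'

# One tab-separated line per instance.
rows=$(printf '%s' "$sample" | jq -r \
  '.Reservations[].Instances[] | [.InstanceId, .InstanceType, .State.Name] | @tsv')

# Split each line on tabs and format a short summary.
summary=$(printf '%s\n' "$rows" | while IFS="$(printf '\t')" read -r id type state; do
  printf '%s is %s (%s)\n' "$id" "$state" "$type"
done)

printf '%s\n' "$summary"
```

Because the shell treats a tab as whitespace when splitting, consecutive empty fields collapse. If a column can be empty, give it a default with // in the jq filter before formatting.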
Practical Example: Find Unattached Elastic IPs
Unattached Elastic IP addresses can create avoidable cost. This command lists public IPs where AWS did not return an AssociationId:
aws ec2 describe-addresses --output json |
jq -r '.Addresses[] | select(.AssociationId == null) | .PublicIp'
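If it prints addresses, review them before releasing anything. For VPC addresses, release-address takes an allocation ID rather than the public IP, so look up the AllocationId of each address you decide to release (the ID below is a placeholder):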
aws ec2 release-address --allocation-id eipalloc-0123456789abcdef0
Do not pipe release commands directly from a discovery query until you have checked the results. A short review step is cheaper than recovering from deleting the wrong resource.
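One way to build in that review step is to have the script print the release commands instead of running them. The sketch below assumes a hypothetical, trimmed describe-addresses response with made-up IDs. It pairs each unattached address with its AllocationId and emits a plan that a person can read before executing anything.

```shell
# Hypothetical trimmed stand-in for `aws ec2 describe-addresses --output json`.
sample='{"Addresses":[
  {"PublicIp":"203.0.113.10","AllocationId":"eipalloc-aaa","AssociationId":"eipassoc-111"},
  {"PublicIp":"203.0.113.11","AllocationId":"eipalloc-bbb"},
  {"PublicIp":"203.0.113.12","AllocationId":"eipalloc-ccc"}
]}'

# Unattached addresses as "allocation id <TAB> public IP".
candidates=$(printf '%s' "$sample" | jq -r \
  '.Addresses[] | select(.AssociationId == null) | [.AllocationId, .PublicIp] | @tsv')

# Print the commands rather than running them, so each line can be checked first.
plan=$(printf '%s\n' "$candidates" | while IFS="$(printf '\t')" read -r alloc ip; do
  [ -n "$alloc" ] || continue   # skip the blank line produced when nothing matched
  printf 'aws ec2 release-address --allocation-id %s  # %s\n' "$alloc" "$ip"
done)

printf '%s\n' "$plan"
```

Saving the plan to a file and running it only after a manual check keeps the discovery query and the destructive call as two separate steps.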
When to Use --query Instead
Use the AWS CLI's --query option for simple projections. It is applied to the response before output formatting, needs no extra tool, and keeps the command self-contained:
aws ec2 describe-instances \
--query 'Reservations[].Instances[].InstanceId' \
--output text
Reach for jq when the result needs more transformation, such as fallback values, CSV formatting, combining fields, or longer filters that are easier to read in jq syntax.
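As an example of that kind of transformation, the sketch below combines a fallback value with string interpolation to print one summary line per instance. The `name_tag` function is defined inside the jq program for illustration and is not a jq builtin. The sample data is hypothetical.

```shell
# Hypothetical trimmed response: one named instance, one without tags.
sample='{"Reservations":[{"Instances":[
  {"InstanceId":"i-aaa","InstanceType":"t3.micro","State":{"Name":"running"},
   "Tags":[{"Key":"Name","Value":"web-1"}]},
  {"InstanceId":"i-bbb","InstanceType":"t3.small","State":{"Name":"stopped"}}
]}]}'

report=$(printf '%s' "$sample" | jq -r '
  # Illustrative helper: the Name tag of the current instance, or "unnamed".
  def name_tag: (.Tags[]? | select(.Key == "Name") | .Value) // "unnamed";

  .Reservations[].Instances[]
  | "\(name_tag) (\(.InstanceId)): \(.State.Name), \(.InstanceType)"
')

printf '%s\n' "$report"
```

Once a filter grows to this size, it may be easier to keep it in its own file and load it with jq's -f option.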
Takeaway
Use --output json as the handoff point between AWS CLI and jq. Then combine .[], select(), object construction, // defaults, and -r raw output to turn large AWS responses into the exact data your script or report needs.