What it is
A quick reference for YAML syntax, commonly used for configuration files and data serialization, to help you write and validate YAML documents.
Installation
YAML is a human-readable data serialization standard, not a command-line tool that needs installation. You’ll typically use a text editor to write YAML files. However, you might use tools like yamllint for validation or yq for processing YAML from the command line.
For yamllint (validation):
- Linux (Debian/Ubuntu):
    sudo apt update && sudo apt install yamllint
- Linux (Fedora):
    sudo dnf install yamllint
- macOS:
    brew install yamllint
- Windows (using Chocolatey):
    choco install yamllint
For yq (processing):
- Linux (using curl, amd64 binary):
    curl -sSL -o yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
    chmod +x yq
    sudo mv yq /usr/local/bin/
- macOS (using Homebrew):
    brew install yq
- Windows (using Chocolatey):
    choco install yq
Core Concepts
- Indentation: YAML uses spaces (not tabs) for indentation to define structure. Consistent indentation is crucial. Typically, 2 spaces per level is recommended.
- Key-Value Pairs: The fundamental building block. A key followed by a colon and a space, then the value.
    name: John Doe
    age: 30
- Sequences (Lists/Arrays): Represented by items starting with a hyphen (-) and a space.
    fruits:
      - apple
      - banana
      - cherry
- Mappings (Dictionaries/Objects): Collections of key-value pairs. Can be nested.
    person:
      name: Jane Smith
      address:
        street: 123 Main St
        city: Anytown
- Scalars: Primitive data types like strings, numbers, booleans, and null.
  - Strings: Can be quoted or unquoted. Unquoted strings are preferred when they don’t contain special characters or start with reserved YAML characters.
      message: Hello, world!
      quoted_message: "This is a quoted string."
      multi_line: |
        This is the first line.
        This is the second line.
      folded_line: >
        This is a long line
        that will be folded
        into a single line with spaces.
  - Numbers: Integers and floating-point numbers are represented directly.
      count: 100
      price: 19.99
  - Booleans: true or false. Older YAML 1.1 parsers also accept case variants and words such as yes/no and on/off, but lowercase true/false is the standard form.
      enabled: true
      active: false
  - Null: null or ~ represents a null value.
      optional_field: null
      empty_value: ~
- Comments: Start with a hash symbol (#).
    # This is a comment
    setting: value
- Document Separator: Three hyphens (---) separate multiple YAML documents within a single file.
- Document End Marker: Three dots (...) can be used to explicitly end a YAML document.
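To see how these building blocks fit together, here is a small illustrative file. The key names (service, owners, limits, and so on) are invented for this sketch and do not come from any particular tool.

```yaml
# First document: a mapping that combines the concepts listed above.
---
service: inventory-api        # unquoted string scalar
replicas: 3                   # integer
cpu_limit: 0.5                # floating-point number
debug: false                  # boolean
maintainer: ~                 # null
owners:                       # sequence of strings
  - alice
  - bob
limits:                       # nested mapping
  memory: 512Mi
  requests_per_second: 100
notes: |                      # literal block scalar: newlines are kept
  Deployed weekly.
  Contact owners before changes.
...
# Second document in the same file, introduced by a new separator.
---
service: billing-api
replicas: 1
```

Every nesting level uses two spaces, and no tabs appear anywhere in the file.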
Commands / Usage
Using yamllint (Validation):
- Validate a single file:
    yamllint config.yaml
  Explanation: Checks config.yaml for syntax errors and style issues based on its default rules or a specified configuration.
- Validate multiple files:
    yamllint file1.yaml file2.yaml
  Explanation: Validates both file1.yaml and file2.yaml.
- Specify a configuration file:
    yamllint -c .yamllint.yml config.yaml
  Explanation: Uses the rules defined in .yamllint.yml to validate config.yaml.
- Look up or try out a rule:
    yamllint -d "{extends: default, rules: {indentation: {spaces: 2}}}" config.yaml
  Explanation: Current yamllint releases do not document an --explain option; rule descriptions (for example, for the indentation rule) are in the Rules section of the yamllint documentation. The -d flag passes configuration inline, which is a quick way to test a rule's options without editing a configuration file.
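The -c example above assumes a .yamllint.yml file exists. A minimal sketch of such a file might look like the following; the specific limits chosen here are illustrative assumptions, not recommended values.

```yaml
# .yamllint.yml (hypothetical example configuration)
extends: default            # start from yamllint's built-in default rule set

rules:
  line-length:
    max: 120                # allow longer lines than the default
    level: warning          # report as a warning rather than an error
  indentation:
    spaces: 2               # require two spaces per nesting level
  truthy:
    allowed-values: ["true", "false"]   # flag yes/no/on/off style booleans
```

Rules not listed keep the settings inherited from the default set.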
Using yq (Processing - mikefarah/yq version):
- Read a value from a YAML file:
    yq '.users[0].name' users.yaml
  Explanation: Extracts the value of the name key from the first object in the users sequence in users.yaml.
- Read a value with a specific key:
    yq '.database.port' config.yaml
  Explanation: Extracts the value of the port key nested under database in config.yaml.
- Update a value in a YAML file:
    yq -i '.database.host = "localhost"' config.yaml
  Explanation: Updates the host key under database to "localhost" directly within config.yaml (in-place edit).
- Add a new key-value pair:
    yq -i '.server.timeout = 60' config.yaml
  Explanation: Adds a new key timeout with value 60 under the server mapping in config.yaml. If server doesn’t exist, it will be created.
- Append to a sequence:
    yq -i '.modules += "new_module"' config.yaml
  Explanation: Appends the string "new_module" to the sequence named modules in config.yaml.
- Create a new sequence element:
    yq -i '.users += {"name": "Charlie", "id": 3}' users.yaml
  Explanation: Appends a new mapping to the users sequence in users.yaml.
- Delete a key:
    yq -i 'del(.database.password)' config.yaml
  Explanation: Removes the password key from the database mapping in config.yaml.
- Output YAML with indentation:
    yq '.' config.yaml
  Explanation: Pretty-prints the entire config.yaml file.
- Convert YAML to JSON:
    yq -o=json '.' config.yaml
  Explanation: Reads config.yaml and outputs its content as JSON.
- Convert JSON to YAML:
    yq -p=json -o=yaml '.' data.json
  Explanation: Reads data.json and outputs its content as YAML.
- Evaluate expressions:
    yq '.users | length' users.yaml
  Explanation: Outputs the number of items in the users sequence.
- Select multiple elements:
    yq '.users[] | select(.id > 1)' users.yaml
  Explanation: Selects and outputs all user objects from the users sequence where the id is greater than 1.
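The users.yaml commands above are easier to follow against a concrete input. The file below is a hypothetical sample; the names and ids are made up for illustration.

```yaml
# users.yaml (hypothetical sample input for the yq commands above)
users:
  - name: Alice
    id: 1
  - name: Bob
    id: 2
```

With this input, '.users[0].name' addresses Alice's name, '.users | length' counts two entries, and 'select(.id > 1)' keeps only Bob's entry. Appending the Charlie mapping with += would add a third item to the sequence.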
Common Patterns
- Creating a basic YAML file:
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
      labels:
        app: my-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80
  Explanation: A typical Kubernetes Pod definition, demonstrating nested mappings and sequences.
- Using anchors and aliases for DRY YAML:
    default_settings: &defaults
      timeout: 30
      retries: 5
    production_config:
      <<: *defaults
      host: prod.example.com
    staging_config:
      <<: *defaults
      host: staging.example.com
  Explanation: &defaults defines an anchor on the default_settings mapping, and *defaults is an alias that refers back to it. The merge key (<<) copies the aliased mapping's keys into production_config and staging_config. The merge key is a YAML 1.1 extension that most parsers support, but it is not part of the YAML 1.2 core specification.
- Handling multi-line strings:
    description: |
      This is a long description
      that spans multiple lines.
      Whitespace is preserved.
  Explanation: The | literal block scalar preserves newlines.
    long_text: >
      This is another long text.
      The newlines here will be replaced by spaces,
      and the entire block will become a single line.
  Explanation: The > folded block scalar folds newlines into spaces, making it suitable for long paragraphs.
- Validating a Kubernetes manifest:
    kubectl apply --dry-run=client -f deployment.yaml
  Explanation: While not strictly a YAML validator, kubectl with --dry-run=client validates the Kubernetes structure of the manifest without applying it.
    yamllint deployment.yaml
  Explanation: A dedicated tool for checking YAML syntax and style.
- Extracting specific data from a large YAML file using yq:
    yq '.services[] | select(.name == "api-gateway") | .port' microservices.yaml
  Explanation: Finds the service named "api-gateway" within the services list and outputs its port.
- Modifying multiple values in a file using yq:
    yq -i '.users[].active = false' users.yaml
  Explanation: Sets the active status to false for all users in the users list.
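As a follow-up to the anchors and aliases pattern, the sketch below shows what a merge produces when one merged key is overridden. The staging_config_expanded key is a hypothetical name added only to spell out the result; it is not something a parser generates.

```yaml
# Hypothetical variation of the anchors example: staging overrides one merged key.
default_settings: &defaults
  timeout: 30
  retries: 5

staging_config:
  <<: *defaults
  host: staging.example.com
  retries: 1          # an explicit key takes precedence over the merged value

# What a merge-aware parser loads for staging_config, written out in full:
staging_config_expanded:
  timeout: 30
  retries: 1
  host: staging.example.com
```

Keeping shared values in one anchored mapping means a change to timeout only has to be made once.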
Gotchas
- Tabs vs. Spaces: YAML parsers are strict about indentation. Always use spaces, never tabs. Mixing them will lead to errors. Most text editors can be configured to convert tabs to spaces.
- Colon and Space: A colon (:) must be followed by at least one space (or a line break) when separating a key from its value. key:value is read as a single string rather than a key-value pair; key: value is correct.
- Quoting Strings: Strings that look like numbers ("123"), booleans ("true"), or contain special characters (:, {, }, [, ], ,, &, *, #, ?, |, -, <, >, =, !, %, @, `) might require quoting to be interpreted as strings.
    # Incorrect: Might be interpreted as a number
    version: 1.2
    # Correct: Explicitly a string
    version: "1.2"
    # Incorrect: Might be interpreted as a boolean
    enabled: yes
    # Correct: Explicitly a string
    enabled: "yes"
- Hyphens in Keys: Keys starting with a hyphen (-) can be problematic, because a hyphen followed by a space starts a sequence item and the two are easy to confuse. It’s best to quote them.
    # Problematic
    -my-key: value
    # Correct
    "-my-key": value
- Duplicate Keys: In a single mapping, duplicate keys are not allowed. Behavior varies between parsers: some reject the document, while others silently keep the last occurrence, so avoid duplicates rather than relying on either behavior.
- Block Scalar Indentation: When using | or >, the content lines must be indented more deeply than the key that holds the block, and the first content line sets the block's indentation. A line indented less than that ends the block.
    # Correct indentation for literal block scalar
    my_string: |
      This is line one.
      This is line two.

    # Incorrect indentation - "This is line two." is outside the block
    my_string: |
      This is line one.
    This is line two.
- yq Version Differences: Be aware that there are multiple tools named yq. The examples above assume Mike Farah’s yq (version 4 or later), a widely used and actively maintained standalone binary; its releases before version 4 used a different command syntax. kislyuk/yq is a separate Python project that wraps jq, so its expressions and flags differ. Always check which yq you are using (yq --version).
- yq -i (in-place edit) safety: The -i flag modifies files directly. It’s a good practice to back up files or use version control before performing in-place edits.
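The quoting gotcha above is the one that most often changes data silently, so a few more cases may help. This is an illustrative sketch with made-up key names; exact behavior depends on whether the parser follows YAML 1.1 or 1.2 rules.

```yaml
# Values that a parser may not load as the string you typed.
country: no            # YAML 1.1 parsers may load this as boolean false
release: 1.10          # loads as the float 1.1, dropping the trailing zero
zip_code: 01234        # may load as a number; the leading zero can be lost or read as octal

# Quoted versions load as strings regardless of parser version.
country_quoted: "no"
release_quoted: "1.10"
zip_code_quoted: "01234"
```

When in doubt about a value that must stay a string, quoting it is the safer choice.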