YAML Syntax Reference

A cheatsheet for YAML syntax: scalars, lists, maps, multiline strings, anchors (&), aliases (*), and null values, with examples aimed at config files.


What it is

A quick reference for YAML syntax, commonly used for configuration files and data serialization, to help you write and validate YAML documents.

Installation

YAML is a human-readable data serialization standard, not a command-line tool that needs installation. You’ll typically use a text editor to write YAML files. However, you might use tools like yamllint for validation or yq for processing YAML from the command line.

For yamllint (validation):

  • Linux (Debian/Ubuntu):
    sudo apt update && sudo apt install yamllint
    
  • Linux (Fedora):
    sudo dnf install yamllint
    
  • macOS:
    brew install yamllint
    
  • Windows, or any system with Python (using pip):
    pip install --user yamllint
    

For yq (processing):

  • Linux (using curl):
    curl -fsSL -o yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
    chmod +x yq
    sudo mv yq /usr/local/bin/
    
  • macOS (using Homebrew):
    brew install yq
    
  • Windows (using Chocolatey):
    choco install yq
    

Core Concepts

  • Indentation: YAML uses spaces (not tabs) for indentation to define structure. Consistent indentation is crucial. Typically, 2 spaces per level is recommended.
  • Key-Value Pairs: The fundamental building block. A key followed by a colon and a space, then the value.
    name: John Doe
    age: 30
    
  • Sequences (Lists/Arrays): Represented by items starting with a hyphen (-) and a space.
    fruits:
      - apple
      - banana
      - cherry
    
  • Mappings (Dictionaries/Objects): Collections of key-value pairs. Can be nested.
    person:
      name: Jane Smith
      address:
        street: 123 Main St
        city: Anytown
    
  • Scalars: Primitive data types like strings, numbers, booleans, and null.
    • Strings: Can be quoted or unquoted. Unquoted strings are preferred when they don’t contain special characters or start with reserved YAML characters.
      message: Hello, world!
      quoted_message: "This is a quoted string."
      multi_line: |
        This is the first line.
        This is the second line.
      folded_line: >
        This is a long line
        that will be folded into
        a single line with spaces.
      
    • Numbers: Integers and floating-point numbers are represented directly.
      count: 100
      price: 19.99
      
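    • Booleans: true or false. YAML 1.2 also accepts True/TRUE and False/FALSE, and YAML 1.1 parsers additionally treat yes/no and on/off as booleans, so lowercase true/false is the safest spelling.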
      enabled: true
      active: false
      
    • Null: null or ~ represents a null value.
      optional_field: null
      empty_value: ~
      
  • Comments: Start with a hash symbol (#).
    # This is a comment
    setting: value
    
  • Document Separator: --- separates multiple YAML documents within a single file.
  • Document End Marker: ... can be used to explicitly end a YAML document.
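
As a sketch of the two markers above, the following stream holds two documents in one file. The keys and values are invented for illustration.

```yaml
# First document: starts at the first "---"
---
service: web
replicas: 2
# Second document: "---" closes the first and opens the next
---
service: worker
replicas: 1
# "..." ends the document without starting another one
...
```

A parser that supports streams returns these as two separate mappings. Tools that expect a single document may read only the first one or report an error.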

Commands / Usage

Using yamllint (Validation):

  • Validate a single file:

    yamllint config.yaml
    

    Explanation: Checks config.yaml for syntax errors and style issues based on its default rules or a specified configuration.

  • Validate multiple files:

    yamllint file1.yaml file2.yaml
    

    Explanation: Validates both file1.yaml and file2.yaml.

  • Specify a configuration file:

    yamllint -c .yamllint.yml config.yaml
    

    Explanation: Uses the rules defined in .yamllint.yml to validate config.yaml.
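
    The configuration file is itself YAML. A minimal sketch of what .yamllint.yml might contain is shown here; the chosen limits are assumptions for illustration, not recommended values.

    ```yaml
    # .yamllint.yml (hypothetical project configuration)
    extends: default

    rules:
      # Allow longer lines and report violations as warnings, not errors
      line-length:
        max: 120
        level: warning
      # Enforce 2-space indentation
      indentation:
        spaces: 2
      # Accept only true/false as boolean spellings
      truthy:
        allowed-values: ["true", "false"]
      # Do not require a leading "---" in every file
      document-start: disable
    ```

    Rules that are not listed keep the settings inherited from the default preset.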

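  • Use a preset or inline configuration:

    yamllint -d relaxed config.yaml
    

    Example:

    yamllint -d "{extends: default, rules: {line-length: {max: 120}}}" config.yaml
    

    Explanation: The -d option takes either the name of a built-in preset (default or relaxed) or a configuration written inline as YAML. Descriptions of individual rules, such as indentation, are in the yamllint documentation.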

Using yq (Processing - mikefarah/yq version):

  • Read a value from a YAML file:

    yq '.users[0].name' users.yaml
    

    Explanation: Extracts the value of the name key from the first object in the users sequence in users.yaml.

  • Read a nested value:

    yq '.database.port' config.yaml
    

    Explanation: Extracts the value of the port key nested under database in config.yaml.

  • Update a value in a YAML file:

    yq -i '.database.host = "localhost"' config.yaml
    

    Explanation: Updates the host key under database to "localhost" directly within config.yaml (in-place edit).

  • Add a new key-value pair:

    yq -i '.server.timeout = 60' config.yaml
    

    Explanation: Adds a new key timeout with value 60 under the server mapping in config.yaml. If server doesn’t exist, it will be created.

  • Append to a sequence:

    yq -i '.modules += "new_module"' config.yaml
    

    Explanation: Appends the string "new_module" to the sequence named modules in config.yaml.

  • Create a new sequence element:

    yq -i '.users += {"name": "Charlie", "id": 3}' users.yaml
    

    Explanation: Appends a new dictionary object to the users sequence in users.yaml.

  • Delete a key:

    yq -i 'del(.database.password)' config.yaml
    

    Explanation: Removes the password key from the database mapping in config.yaml.

  • Output YAML with indentation:

    yq '.' config.yaml
    

    Explanation: Pretty-prints the entire config.yaml file.

  • Convert YAML to JSON:

    yq -o=json '.' config.yaml
    

    Explanation: Reads config.yaml and outputs its content as JSON.

  • Convert JSON to YAML:

    yq -p=json -o=yaml '.' data.json
    

    Explanation: Reads data.json and outputs its content as YAML.

  • Evaluate expressions:

    yq '.users | length' users.yaml
    

    Explanation: Outputs the number of items in the users sequence.

  • Select multiple elements:

    yq '.users[] | select(.id > 1)' users.yaml
    

    Explanation: Selects and outputs all user objects from the users sequence where the id is greater than 1.
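
The yq examples above assume input files that are not shown. A minimal users.yaml they could run against might look like this; the names, ids, and the active field are hypothetical sample data.

```yaml
# users.yaml (hypothetical sample data)
users:
  - name: Alice
    id: 1
    active: true
  - name: Bob
    id: 2
    active: true
```

With this file, .users[0].name yields Alice, .users | length yields 2, and select(.id > 1) matches only the Bob entry.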

Common Patterns

  • Creating a basic YAML file:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
      labels:
        app: my-app
    spec:
      containers:
      - name: nginx-container
        image: nginx:latest
        ports:
        - containerPort: 80
    

    Explanation: A typical Kubernetes Pod definition, demonstrating nested mappings and sequences.

  • Using anchors and aliases for DRY YAML:

    default_settings: &defaults
      timeout: 30
      retries: 5
    
    production_config:
      <<: *defaults
      host: prod.example.com
    
    staging_config:
      <<: *defaults
      host: staging.example.com
    

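    Explanation: &defaults defines an anchor on the default_settings mapping, and *defaults is an alias that refers back to it. The << merge key copies the aliased mapping’s keys into production_config and staging_config. Merge keys are an extension dating from YAML 1.1 and are not part of the YAML 1.2 specification, though many parsers support them.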
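
    A common follow-up question is what happens when a mapping both merges an anchor and sets one of the same keys. The sketch below assumes a parser that supports the << merge key; canary_config and fallback are hypothetical names added for illustration.

    ```yaml
    defaults: &defaults
      timeout: 30
      retries: 5

    # Keys written locally win over merged ones, so this loads as
    # timeout 30, retries 1, host canary.example.com
    canary_config:
      <<: *defaults
      retries: 1
      host: canary.example.com

    # An alias also works without "<<": here it reuses the whole mapping as a value
    fallback: *defaults
    ```

    Overriding this way keeps shared values in one place while still allowing per-environment exceptions.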

  • Handling multi-line strings:

    description: |
      This is a long description
      that spans multiple lines.
      Whitespace is preserved.
    

    Explanation: The | literal block scalar preserves newlines.

    long_text: >
      This is another long text.
      The newlines here will be
      replaced by spaces, and the
      entire block will become a single line.
    

    Explanation: The > folded block scalar folds newlines into spaces, making it suitable for long paragraphs.
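
    Both block styles also accept a chomping indicator that controls trailing newlines. The following sketch uses invented keys to show the three modes.

    ```yaml
    # "|" (clip, the default): keeps exactly one trailing newline, giving "text\n"
    clip: |
      text

    # "|-" (strip): removes the trailing newline, giving "text"
    strip: |-
      text

    # "|+" (keep): keeps every trailing newline, including blank lines after the text
    keep: |+
      text

    # ">-": folds the lines and strips the trailing newline, giving "one two"
    folded_strip: >-
      one
      two
    ```

    The strip forms are often the right choice for values such as commands or tokens, where a stray trailing newline can cause problems.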

  • Validating a Kubernetes manifest:

    kubectl apply --dry-run=client -f deployment.yaml
    

    Explanation: Although kubectl is not a general-purpose YAML validator, it parses the manifest and checks it on the client without changing the cluster. Use --dry-run=server to have the API server validate the manifest as well.

    yamllint deployment.yaml
    

    Explanation: A dedicated tool for checking YAML syntax and style.

  • Extracting specific data from a large YAML file using yq:

    yq '.services[] | select(.name == "api-gateway") | .port' microservices.yaml
    

    Explanation: Finds the service named "api-gateway" within the services list and outputs its port.

  • Modifying multiple values in a file using yq:

    yq -i '.users[].active = false' users.yaml
    

    Explanation: Sets the active status to false for all users in the users list.

Gotchas

  • Tabs vs. Spaces: YAML parsers are strict about indentation. Always use spaces, never tabs. Mixing them will lead to errors. Most text editors can be configured to convert tabs to spaces.
  • Colon and Space: A colon (:) must be followed by at least one space (or a line break) to separate a key from its value. key:value is not a syntax error, but it is parsed as the single string "key:value" rather than a mapping; key: value is the mapping.
  • Quoting Strings: Quote any value that must stay a string but looks like another type, such as a number ("123"), a boolean ("true"), or null. Also quote strings that begin with an indicator character (such as [, ], {, }, &, *, !, |, >, %, @, `, #, or a leading "- " or "? ") or that contain ": " or " #", since the parser would otherwise treat those as syntax.
    # Parsed as the float 1.2
    version: 1.2
    
    # Parsed as the string "1.2"
    version: "1.2"
    
    # Parsed as boolean true by YAML 1.1 parsers
    enabled: yes
    
    # Always the string "yes"
    enabled: "yes"
    
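  • Hyphens in Keys: A key that starts with a hyphen (-) is valid as long as no space follows the hyphen, but it is easy to misread as a sequence entry (- my-key: value is a list item). Quoting such keys avoids the ambiguity.
    # Valid, but easy to confuse with a sequence entry
    -my-key: value
    
    # Clearer
    "-my-key": value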
    
  • Duplicate Keys: The YAML specification requires keys within a mapping to be unique. Many parsers nonetheless accept duplicates silently and keep the last occurrence, while others raise an error, so do not rely on either behavior.
  • Block Scalar Indentation: The content of a | or > block must be indented more than the key that introduces it, and the first non-empty content line sets the indentation for the whole block. A line indented less than that ends the block.
    # Correct indentation for literal block scalar
    my_string: |
      This is line one.
      This is line two.
    
    # Incorrect indentation: the unindented line ends the block and then fails to parse
    my_string: |
      This is line one.
    This is line two.
    
  • yq Version Differences: Several unrelated tools share the name yq. The examples above assume Mike Farah’s Go-based yq, version 4 or later; its version 3 used a different command syntax. kislyuk/yq is a separate Python wrapper around jq that uses jq’s syntax and options. Check which one is installed with yq --version.
  • yq -i (in-place edit) safety: The -i flag modifies files directly. It’s a good practice to back up files or use version control before performing in-place edits.
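
The quoting gotcha above can be sketched with a few values that commonly change type. How each unquoted value resolves depends on whether the parser follows YAML 1.1 or 1.2, so the comments describe typical behavior rather than a guarantee. The keys are invented for illustration.

```yaml
# Unquoted: the parser picks the type
version: 1.10        # float 1.1, the trailing zero is lost
country: NO          # boolean false in YAML 1.1 parsers, string in YAML 1.2
zip_code: 01234      # may be read as an octal or decimal integer, dropping the leading zero
nothing: null        # null, not the string "null"

# Quoted: always strings
version_str: "1.10"
country_str: "NO"
zip_code_str: "01234"
nothing_str: "null"
```

When a value's exact text matters, as with version numbers, country codes, and postal codes, quoting it removes the dependence on the parser's YAML version.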