tar Archive Tool

tar cheatsheet — create and extract archives. tar -czf archive.tar.gz dir/, tar -xvf archive.tar.gz, tar -tzf to list. gzip, bzip2, xz compression options covered.

What it is

tar is a command-line utility for creating, listing, and extracting archive files, often referred to as "tarballs." It is widely used for bundling multiple files and directories into a single file, usually in combination with a compressor such as gzip, bzip2, or xz.

Installation

Linux

tar is usually pre-installed on most Linux distributions. If not:

sudo apt update && sudo apt install tar  # For Debian/Ubuntu
sudo yum install tar                   # For CentOS/RHEL
sudo dnf install tar                   # For Fedora

macOS

tar is pre-installed on macOS.

Windows

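tar ships as a built-in command (tar.exe, based on bsdtar) in Windows 10 build 17063 and later and in Windows 11. GNU tar is also available inside the Windows Subsystem for Linux (WSL).

# Open PowerShell or Command Prompt
# Check for the built-in command first
tar --version
# To install via winget (if available); the package ID may differ, confirm it with: winget search tar
winget install gnu.tar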

Alternatively, you can download a pre-compiled binary from sources like GnuWin32 or use WSL.

Core Concepts

  • Archive: A single file that contains one or more other files and directories, preserving their structure and metadata.
  • Compression: tar itself does not compress files. It’s common to pipe the output of tar to a compression utility (like gzip, bzip2, or xz) or use tar’s built-in support for these compression algorithms via specific flags.
  • Modes: tar operates in different modes:
    • Create (-c): Make a new archive.
    • Extract (-x): Unpack an archive.
    • List (-t): Show the contents of an archive.
    • Update (-u): Append files to an archive if they are newer than the copy already stored in it.
    • Compare (-d): Find differences between an archive and files.
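
To make the split between archiving and compression concrete, here is a minimal sketch that builds the same archive two ways: once by piping tar's output through gzip, and once with tar's own -z flag. The scratch directory and the file names (demo/, a.txt, b.txt) are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/demo"
echo "alpha" > "$work/demo/a.txt"
echo "beta"  > "$work/demo/b.txt"

# 1. tar only bundles; gzip compresses the stream in a separate process.
tar -cf - -C "$work" demo | gzip > "$work/piped.tar.gz"

# 2. The -z flag asks tar to run the gzip step itself.
tar -czf "$work/flag.tar.gz" -C "$work" demo

# List mode (-t) on both archives; the member names should be identical.
piped_list=$(tar -tzf "$work/piped.tar.gz")
flag_list=$(tar -tzf "$work/flag.tar.gz")
printf '%s\n' "$piped_list"
```

Both archives hold the same members, which is why the two forms are interchangeable in practice.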

Commands / Usage

Creating Archives

  • Create a simple archive:

    tar -cvf backup.tar /home/user/documents
    

    Create (-c) a verbose (-v) archive file (-f) named backup.tar containing the /home/user/documents directory.

  • Create a gzipped archive:

    tar -czvf backup.tar.gz /home/user/documents
    

    Create (-c) a verbose (-v) gzip-compressed (-z) archive file (-f) named backup.tar.gz.

  • Create a bzip2 archive:

    tar -cjvf backup.tar.bz2 /home/user/documents
    

    Create (-c) a verbose (-v) bzip2-compressed (-j) archive file (-f) named backup.tar.bz2.

  • Create an xz archive:

    tar -cJvf backup.tar.xz /home/user/documents
    

    Create (-c) a verbose (-v) xz-compressed (-J) archive file (-f) named backup.tar.xz.

  • Create an archive from a list of files:

    tar -cvf files.tar file1.txt file2.jpg dir1/
    

    Create (-c) a verbose (-v) archive file (-f) named files.tar containing file1.txt, file2.jpg, and the dir1/ directory.

  • Create an archive and exclude specific files/directories:

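    tar -czvf project.tar.gz --exclude='*.log' --exclude='tmp' /path/to/project
    

    Create (-c) a verbose (-v) gzip-compressed (-z) archive (-f) named project.tar.gz from /path/to/project, excluding all .log files and any directory named tmp. Write the directory pattern without a trailing slash; a pattern such as 'tmp/' may fail to match in some tar versions.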

  • Write an archive to standard output:

    tar -czf - /path/to/data | ssh user@remote 'cat > /backup/data.tar.gz'
    

    Create (-c) a gzip-compressed (-z) archive (-f) to standard output (-) from /path/to/data, then pipe it to a remote server to save as data.tar.gz.
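
The exclude patterns above can be checked safely on a throwaway tree before they are used on real data. The sketch below assumes a hypothetical project layout (src/main.c, app.log, tmp/cache.bin) and lists the resulting archive to confirm what was left out.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/project/src" "$work/project/tmp"
echo 'int main(void) { return 0; }' > "$work/project/src/main.c"
echo "debug output" > "$work/project/app.log"
echo "scratch"      > "$work/project/tmp/cache.bin"

# The --exclude options come before the path they apply to.
# -C makes tar store names relative to $work (project/...).
tar -czf "$work/project.tar.gz" \
    --exclude='*.log' --exclude='tmp' \
    -C "$work" project

# List the archive to see which members were actually stored.
members=$(tar -tzf "$work/project.tar.gz")
printf '%s\n' "$members"
```

Listing the archive right after creating it is a cheap way to catch a pattern that matched more, or less, than intended.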

Extracting Archives

  • Extract a simple archive:

    tar -xvf backup.tar
    

    Extract (-x) verbosely (-v) from the archive file (-f) named backup.tar. Files are extracted into the current directory.

  • Extract a gzipped archive:

    tar -xzvf backup.tar.gz
    

    Extract (-x) verbosely (-v) from the gzip-compressed (-z) archive file (-f) named backup.tar.gz.

  • Extract a bzip2 archive:

    tar -xjvf backup.tar.bz2
    

    Extract (-x) verbosely (-v) from the bzip2-compressed (-j) archive file (-f) named backup.tar.bz2.

  • Extract an xz archive:

    tar -xJvf backup.tar.xz
    

    Extract (-x) verbosely (-v) from the xz-compressed (-J) archive file (-f) named backup.tar.xz.

  • Extract to a specific directory:

    tar -xzvf backup.tar.gz -C /path/to/extract/to
    

    Extract (-x) verbosely (-v) from backup.tar.gz into the directory specified by -C.

  • Extract specific files from an archive:

    tar -xvf backup.tar home/user/documents/report.docx
    

    Extract (-x) verbosely (-v) from backup.tar, but only the file home/user/documents/report.docx (path must match archive’s internal path).

  • Extract from standard input:

    ssh user@remote 'tar -czf - /data/backup' | tar -xzvf - -C /local/restore/path
    

    Receive a gzip-compressed tar stream (-z) from standard input (-) via SSH, and extract (-x) verbosely (-v) into /local/restore/path.
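
Extracting a single member only works when the path matches the name stored in the archive exactly. A minimal sketch of that workflow, using made-up file names (docs/reports/q1.txt, q2.txt) and a restore/ target directory:

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/docs/reports" "$work/restore"
echo "Q1 numbers" > "$work/docs/reports/q1.txt"
echo "Q2 numbers" > "$work/docs/reports/q2.txt"
tar -czf "$work/docs.tar.gz" -C "$work" docs

# Step 1: list the archive to find the exact internal path.
tar -tzf "$work/docs.tar.gz"

# Step 2: extract only that member into the directory given by -C.
tar -xzf "$work/docs.tar.gz" -C "$work/restore" docs/reports/q1.txt

ls -R "$work/restore"
```

The directory passed to -C must already exist; tar creates the directories stored inside the archive, not the target itself.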

Listing Archive Contents

  • List contents of a simple archive:

    tar -tvf backup.tar
    

    List (-t) verbosely (-v) the contents of the archive file (-f) named backup.tar.

  • List contents of a gzipped archive:

    tar -tzvf backup.tar.gz
    

    List (-t) verbosely (-v) the contents of the gzip-compressed (-z) archive file (-f) named backup.tar.gz.

  • List contents of a bzip2 archive:

    tar -tjvf backup.tar.bz2
    

    List (-t) verbosely (-v) the contents of the bzip2-compressed (-j) archive file (-f) named backup.tar.bz2.

  • List contents of an xz archive:

    tar -tJvf backup.tar.xz
    

    List (-t) verbosely (-v) the contents of the xz-compressed (-J) archive file (-f) named backup.tar.xz.
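
List mode is also handy in scripts, because tar has to read through the archive to produce the listing. The sketch below wraps that in a small helper; check_archive is a hypothetical function name, and the file names are invented for the example.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/site"
echo "<h1>hi</h1>" > "$work/site/index.html"
tar -czf "$work/site.tar.gz" -C "$work" site
echo "not an archive" > "$work/bogus.tar.gz"   # deliberately invalid file

# check_archive ARCHIVE MEMBER (hypothetical helper)
# Succeeds only if the listing contains MEMBER as an exact line.
check_archive() {
    tar -tzf "$1" 2>/dev/null | grep -qxF "$2"
}

if check_archive "$work/site.tar.gz" "site/index.html"; then good=yes; else good=no; fi
if check_archive "$work/bogus.tar.gz" "site/index.html"; then bad=yes; else bad=no; fi
echo "site.tar.gz: $good, bogus.tar.gz: $bad"
```

This is a basic sanity check rather than full verification: it confirms that the file is readable as a gzip-compressed archive and names the expected member.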

Updating Archives

  • Add a new file to an existing archive:

    tar -uvf backup.tar new_file.txt
    

    Update (-u) verbosely (-v) the archive file (-f) backup.tar by adding new_file.txt. If new_file.txt already exists in backup.tar, it is appended again only if the copy on disk is newer. The older copy stays in the archive; on extraction, the later entry overwrites the earlier one.

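  • Add a new directory to an existing archive:

    tar -uvf backup.tar added_directory/
    

    Update (-u) verbosely (-v) the archive file (-f) backup.tar by adding added_directory/. Updating works only on uncompressed archives; tar cannot update backup.tar.gz in place. Decompress it first (gunzip backup.tar.gz), update backup.tar, then compress it again (gzip backup.tar).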
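
A short experiment shows what update mode does to an archive that already contains the file. This is a sketch assuming GNU tar or bsdtar and an uncompressed archive; notes.txt and notes.tar are placeholder names.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
cd "$work"

echo "first draft" > notes.txt
touch -t 202001010000 notes.txt   # backdate so the later edit is clearly newer
tar -cf notes.tar notes.txt

echo "second draft" > notes.txt   # modification time is now the current time
tar -uf notes.tar notes.txt       # appended because the disk copy is newer

# Count how many entries named notes.txt the archive holds.
copies=$(tar -tf notes.tar | grep -cxF 'notes.txt')
echo "copies of notes.txt in archive: $copies"

# On extraction, entries are processed in order, so the last one wins.
mkdir out
tar -xf notes.tar -C out
restored=$(cat out/notes.txt)
echo "restored content: $restored"
```

Because every update appends rather than rewrites, an archive that is updated often keeps growing; recreating it from scratch is the usual way to drop stale copies.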

Other Operations

  • Compare archive contents with files on disk:

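    tar -df backup.tar -C /
    

    Compare (-d) the archive file (-f) backup.tar with the files on disk. The -C option sets the directory that member names are resolved against. Because backup.tar was created from /home/user/documents with the leading slash stripped, its members are named like home/user/documents/report.docx, so the matching directory is /. Differences are printed. Compare mode is a GNU tar feature and may be missing from other implementations.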

  • Update archive members (only if newer):

    tar -uvf myarchive.tar file_to_update.txt another_file.log
    

    Update (-u) verbosely (-v) the archive file (-f) myarchive.tar. Each specified file (file_to_update.txt, another_file.log) is appended to the archive if it is not there yet or if the file on disk is newer than the copy in the archive.
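
Compare mode reports through its exit status as well as its output, which makes it usable in scripts. The following sketch assumes GNU tar (other implementations may not offer -d) and uses a hypothetical conf/app.ini file.

```shell
#!/usr/bin/env bash
set -u

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/conf"
echo "port=8080" > "$work/conf/app.ini"
tar -cf "$work/conf.tar" -C "$work" conf

# Nothing has changed yet, so the comparison should succeed quietly.
if tar -df "$work/conf.tar" -C "$work"; then before=same; else before=differs; fi

# Change the file on disk; its size no longer matches the archived copy.
echo "port=9090 # edited" > "$work/conf/app.ini"

# GNU tar prints what differs and exits with a non-zero status.
if tar -df "$work/conf.tar" -C "$work"; then after=same; else after=differs; fi

echo "before edit: $before, after edit: $after"
```

The directory given to -C has to be the one the archive's member names are relative to, here the parent of conf/.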

Common Patterns

  • Backup current directory to a remote server:

    tar -czvf - . | ssh user@remote_host 'cat > /backups/my_project_$(date +%Y%m%d).tar.gz'
    

    Create a compressed tarball (-czvf -) of the current directory (.) and pipe it (|) to ssh. On the remote host, cat receives the stream and redirects it (>) into a timestamped file.

  • Restore a backup from a remote server:

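    ssh user@remote_host 'cat /path/to/remote/backup.tar.gz' | tar -xzvf - -C /local/restore/path
    

    On the remote host, cat writes the existing backup file to standard output, and ssh carries that stream to the local machine. There, tar reads the stream from standard input (-) and extracts it (-xzvf -) into /local/restore/path. Running tar -c on the remote backup file instead would wrap the tarball in a second archive, and extraction would yield only backup.tar.gz itself.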

  • Create a tarball without compression:

    tar -cvf archive.tar file1.txt dir1/
    

    Useful when you just need to bundle files, not compress them, perhaps for later processing or when compression is handled by another tool.

  • Extracting multiple tarballs sequentially:

    for f in *.tar.gz; do tar -xzvf "$f" -C extracted_data/; done
    

    Iterate through all .tar.gz files in the current directory and extract each one into the extracted_data/ subdirectory. The target directory must already exist; create it first with mkdir -p extracted_data.

  • Handling very large files (avoiding memory issues): tar streams data from input to output and does not load the whole archive into memory, so memory use stays small regardless of archive size. The main constraints for large files are disk space and transfer time, which compression (-z, -j, -J) helps reduce.

  • Creating a tarball from a list of files in a file:

    tar -cvf archive.tar --files-from=list_of_files.txt
    

    Create an archive (-cvf) using the file paths listed in list_of_files.txt. Each line in list_of_files.txt should be a path to a file or directory.

  • Extracting a tarball with absolute paths (use with caution): By default, tar strips leading slashes from absolute paths in archives for security. To preserve them (potentially dangerous if the archive is untrusted), add -P (--absolute-names in GNU tar). Keep -f directly in front of the archive name, because the argument after -f is always taken as the archive:

    tar -xvPf archive.tar
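
The --files-from pattern from this list pairs naturally with find, which can generate the list. A minimal sketch with invented file names; it assumes paths that contain no newline characters, since the list holds one path per line.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p data/raw
echo "keep" > data/report.txt
echo "keep" > data/raw/readings.txt
echo "skip" > data/raw/image.bin

# Build the list: one path per line, only regular files ending in .txt.
find data -type f -name '*.txt' > list_of_files.txt

# Archive exactly the paths named in the list.
tar -cf texts.tar --files-from=list_of_files.txt

listing=$(tar -tf texts.tar)
printf '%s\n' "$listing"
```

Keeping the list in a file also leaves a record of what was meant to go into the archive, which a shell wildcard does not.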
    

Gotchas

  • Path Handling: tar by default strips leading / from filenames when creating archives. This prevents accidentally overwriting system files during extraction. If you need to preserve absolute paths (e.g., for system backups where you intend to restore to the exact same location), use the --absolute-names or -P flag, but be extremely cautious.
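  • Compression Flags: tar itself does not compress; it delegates to gzip, bzip2, or xz. When creating an archive you must pass the flag (-z for gzip, -j for bzip2, -J for xz), otherwise the result is an uncompressed archive regardless of its file name. When extracting or listing from a file, modern GNU tar and bsdtar detect the compression automatically, so tar -xvf backup.tar.gz works. Older implementations may fail without the flag, and GNU tar still needs it when reading from a pipe, so specifying it explicitly remains good practice.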
  • --files-from vs. -T: The --files-from option reads filenames from a file. The -T option is a synonym for --files-from.
  • Order of Flags: In a bundled group of short options, -f must come last, because whatever follows it is taken as the archive name. For example, tar -czvf backup.tar.gz dir/ works, while tar -cfzv backup.tar.gz dir/ tries to write an archive named zv. The order of the other letters in the group (-c, -x, -t, -v, -z) does not matter, although putting the mode first is the convention.
  • Extracting to Existing Directories: When extracting, tar creates directories as needed. GNU tar and bsdtar overwrite existing files of the same name by default; pass --keep-old-files (-k) to leave existing files untouched. Be mindful of where you are extracting.
  • Permissions: tar records file permissions, ownership, and timestamps in the archive. When extracting as a regular user, the files end up owned by that user, and the stored permissions are filtered through the umask unless -p is given. Restoring the original ownership requires running tar as root. The user running tar also needs permission to create files and directories in the target location.
  • Wildcards: Wildcards (*, ?) used on the command line to specify files for creation are expanded by the shell before tar sees them. This means tar -cvf archive.tar *.txt will add only the .txt files that exist at the time the command is run. Using --files-from is often more robust for managing lists of files. Wildcards used with --exclude are interpreted by tar.
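
The overwrite gotcha is easy to reproduce on a scratch directory. This sketch assumes GNU tar or bsdtar, where extraction replaces existing files by default and --keep-old-files prevents that; cfg/settings.txt is a made-up file name.

```shell
#!/usr/bin/env bash
set -u

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT
mkdir "$work/cfg"
echo "archived" > "$work/cfg/settings.txt"
tar -cf "$work/cfg.tar" -C "$work" cfg

# Simulate a local change made after the archive was created.
echo "local edit" > "$work/cfg/settings.txt"

# With --keep-old-files the existing file is left alone. GNU tar reports
# this as an error, so messages and the exit status are ignored here.
tar -xf "$work/cfg.tar" -C "$work" --keep-old-files 2>/dev/null || true
kept=$(cat "$work/cfg/settings.txt")

# Default extraction replaces the file with the archived version.
tar -xf "$work/cfg.tar" -C "$work"
overwritten=$(cat "$work/cfg/settings.txt")

echo "with --keep-old-files: $kept"
echo "default extraction:    $overwritten"
```

When in doubt, extracting into an empty directory with -C and copying files over by hand avoids the question entirely.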