Cassandra CQL Shell

cqlsh cheatsheet — connect to Cassandra, create keyspaces, tables, insert and query data. CREATE KEYSPACE, SELECT, INSERT with TTL. Full CQL reference with examples.

9 min read

What it is

The cqlsh tool is the command-line interface for interacting with Apache Cassandra, allowing you to execute CQL (Cassandra Query Language) commands, manage schemas, and perform administrative tasks.

Installation

cqlsh is typically installed as part of the Cassandra distribution. Ensure you have Java installed and that your Cassandra installation directory is in your PATH.

Linux/macOS: The cqlsh script is usually located in the bin directory of your Cassandra installation.

# Example: If Cassandra is installed in /usr/local/cassandra
/usr/local/cassandra/bin/cqlsh

Windows: The cqlsh.bat script is usually located in the bin directory of your Cassandra installation.

# Example: If Cassandra is installed in C:\cassandra
C:\cassandra\bin\cqlsh.bat

You can also connect to a remote Cassandra cluster:

cqlsh <remote_host_ip> <port>
# Example:
cqlsh 192.168.1.100 9042

Core Concepts

Keyspace: A namespace that contains tables and other keyspaces. Similar to a database in relational systems.
Table: A collection of rows, organized by columns.
Column: A named data type within a table.
Primary Key: Uniquely identifies a row. It can be a simple primary key (one column) or a composite primary key (multiple columns). The first part of the primary key is the partition key, which determines which node(s) the data is stored on.
Clustering Columns: Optional columns that determine the on-disk sort order of rows within a partition.
Secondary Index: An index on a column that is not part of the primary key, used to query data based on values in that column.
Materialized View: A table that automatically maintains data derived from a base table, allowing for different query patterns.
User-Defined Type (UDT): A custom data type that can be used as the type for a column.
CQL (Cassandra Query Language): The SQL-like language used to interact with Cassandra.

Commands / Usage

Connecting to Cassandra

cqlsh
# Connects to localhost on default port 9042

cqlsh 192.168.1.100
# Connects to a specific host on default port 9042

cqlsh 192.168.1.100 9142
# Connects to a specific host and port

cqlsh -u cassandra -p mypassword
# Connects with username and password authentication

cqlsh --username cassandra --password mypassword
# Long form for username and password

cqlsh -k /path/to/keystore.jks -p keystorepassword
# Connects using SSL/TLS with a keystore

cqlsh --keystore /path/to/keystore.jks --keystore-password keystorepassword
# Long form for SSL/TLS keystore

cqlsh -e "SELECT release_version FROM system.local;"
# Execute a single command and exit

Schema Management

Keyspaces

-- Create a new keyspace
CREATE KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
# Creates a keyspace named 'mykeyspace' with simple replication for a single node.

CREATE KEYSPACE mykeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3, 'datacenter2': 3};
# Creates a keyspace with network topology replication across two datacenters.

-- Use an existing keyspace
USE mykeyspace;
# Sets 'mykeyspace' as the default keyspace for subsequent commands.

-- Describe keyspace properties
DESCRIBE KEYSPACE mykeyspace;
# Shows the replication strategy and factor for 'mykeyspace'.

-- Alter keyspace replication
ALTER KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
# Changes the replication factor of 'mykeyspace'.

-- Drop a keyspace
DROP KEYSPACE mykeyspace;
# Deletes the keyspace 'mykeyspace' and all its contents.

Tables

-- Create a table within the current keyspace
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    first_name text,
    last_name text,
    email text
);
# Creates a 'users' table with a UUID primary key.

CREATE TABLE products (
    product_id int,
    category text,
    name text,
    price decimal,
    PRIMARY KEY (category, product_id)
);
# Creates a 'products' table with a composite primary key (partition key: category, clustering key: product_id).

CREATE TABLE sensor_data (
    sensor_id uuid,
    timestamp timestamp,
    value double,
    PRIMARY KEY (sensor_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
# Creates a 'sensor_data' table with clustering order by timestamp in descending order.

-- Describe a table
DESCRIBE TABLE users;
# Shows the schema definition for the 'users' table.

-- Alter a table (add a column)
ALTER TABLE users ADD age int;
# Adds a new 'age' column of type integer to the 'users' table.

-- Alter a table (drop a column)
ALTER TABLE users DROP email;
# Removes the 'email' column from the 'users' table.

-- Drop a table
DROP TABLE users;
# Deletes the 'users' table.

Materialized Views

CREATE MATERIALIZED VIEW active_users AS
    SELECT user_id, first_name, last_name
    FROM users
    WHERE is_active = true AND user_id IS NOT NULL AND first_name IS NOT NULL AND last_name IS NOT NULL
    PRIMARY KEY (is_active, user_id);
# Creates a materialized view 'active_users' for querying active users.

User-Defined Types (UDTs)

-- Create a UDT
CREATE TYPE address (
    street text,
    city text,
    zip_code text
);
# Defines a new UDT named 'address'.

-- Use a UDT in a table
CREATE TABLE customers (
    customer_id uuid PRIMARY KEY,
    name text,
    shipping_address address
);
# Creates a 'customers' table with a column of the UDT 'address'.

-- Update a UDT in a table
UPDATE customers SET shipping_address = {street: '123 Main St', city: 'Anytown', zip_code: '12345'} WHERE customer_id = 123e4567-e89b-12d3-a456-426614174000;
# Sets the shipping_address for a specific customer.

Data Manipulation Language (DML)

Inserting Data

-- Insert a row
INSERT INTO users (user_id, first_name, last_name, email) VALUES (uuid(), 'John', 'Doe', 'john.doe@example.com');
# Inserts a new user with a generated UUID.

INSERT INTO products (product_id, category, name, price) VALUES (101, 'Electronics', 'Laptop', 1200.50);
# Inserts a new product.

Selecting Data

-- Select all columns for all rows
SELECT * FROM users;
# Retrieves all data from the 'users' table.

-- Select specific columns
SELECT user_id, first_name FROM users;
# Retrieves only the user_id and first_name for all users.

-- Select with a WHERE clause (must use primary key components)
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
# Retrieves a user by their specific user_id.

SELECT * FROM products WHERE category = 'Electronics';
# Retrieves all products in the 'Electronics' category.

SELECT * FROM products WHERE category = 'Electronics' AND product_id = 101;
# Retrieves a specific product within a category.

-- Select with LIMIT
SELECT * FROM users LIMIT 10;
# Retrieves the first 10 users.

-- Select with ORDER BY (requires clustering columns)
SELECT * FROM products WHERE category = 'Electronics' ORDER BY price DESC LIMIT 5;
# Retrieves the 5 most expensive electronics products.

Updating Data

-- Update a column for a specific row
UPDATE users SET email = 'john.doe.updated@example.com' WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
# Updates the email address for a specific user.

-- Update multiple columns
UPDATE users SET first_name = 'Jonathan', last_name = 'Smith' WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
# Updates both first and last names for a user.

Deleting Data

-- Delete a specific row
DELETE FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
# Deletes a user by their user_id.

-- Delete specific columns for a row
DELETE FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000 AND first_name;
# Deletes only the first_name column for a specific user.

-- Delete all rows in a table (use with caution!)
TRUNCATE TABLE users;
# Removes all data from the 'users' table.

Batch Operations

BEGIN BATCH
    INSERT INTO users (user_id, first_name, last_name) VALUES (uuid(), 'Alice', 'Smith');
    INSERT INTO users (user_id, first_name, last_name) VALUES (uuid(), 'Bob', 'Johnson');
APPLY BATCH;
# Executes multiple DML statements as a single atomic operation.

BEGIN UNLOGGED BATCH
    INSERT INTO products (product_id, category, name, price) VALUES (102, 'Electronics', 'Keyboard', 75.00);
    INSERT INTO products (product_id, category, name, price) VALUES (103, 'Electronics', 'Mouse', 25.00);
APPLY BATCH;
# Executes multiple DML statements as an unlogged batch (not atomic, but faster).

Shell Commands

cqlsh> help
# Displays a list of available cqlsh commands.

cqlsh> help <command>
# Displays help for a specific cqlsh command (e.g., help DESCRIBE).

cqlsh> DESCRIBE SCHEMA;
# Shows all keyspaces and their tables.

cqlsh> DESCRIBE CLUSTERING KEY SCHEMA;
# Shows the schema for tables with clustering keys.

cqlsh> DESCRIBE KEYSPACES;
# Lists all available keyspaces.

cqlsh> DESCRIBE TABLES;
# Lists all tables in the current keyspace.

cqlsh> DESCRIBE CLUSTER NAME;
# Shows the name of the Cassandra cluster.

cqlsh> DESCRIBE RING;
# Shows the status of nodes in the cluster.

cqlsh> CONSISTENCY LEVEL;
# Shows the current consistency level.

cqlsh> CONSISTENCY LEVEL LOCAL_QUORUM;
# Sets the consistency level to LOCAL_QUORUM for subsequent queries.

cqlsh> TIMEOUT;
# Shows the current query timeout.

cqlsh> TIMEOUT 10000;
# Sets the query timeout to 10 seconds.

cqlsh> EXIT;
# Exits the cqlsh shell.

Common Patterns

Creating a Keyspace and Table for Development

cqlsh -e "CREATE KEYSPACE IF NOT EXISTS myapp WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
cqlsh -e "USE myapp;"
cqlsh -e "CREATE TABLE IF NOT EXISTS users (user_id uuid PRIMARY KEY, username text, email text);"

Inserting and Querying Data in One Go

cqlsh -e "USE mykeyspace; INSERT INTO users (user_id, first_name, last_name) VALUES (uuid(), 'Jane', 'Smith');"
cqlsh -e "USE mykeyspace; SELECT * FROM users WHERE user_id = $(cqlsh -e 'SELECT user_id FROM users LIMIT 1;')" # This is illustrative, actual UUID retrieval needs more complex scripting.

A more practical approach for scripting:

# Get a UUID and store it
USER_ID=$(uuidgen)
cqlsh -e "USE mykeyspace; INSERT INTO users (user_id, first_name, last_name) VALUES ($USER_ID, 'Jane', 'Smith');"
cqlsh -e "USE mykeyspace; SELECT * FROM users WHERE user_id = $USER_ID;"

Exporting Schema

cqlsh -e "DESCRIBE SCHEMA;" > schema.cql

Importing Schema

cqlsh -f schema.cql

Connecting to a Remote Cluster and Running a Query

cqlsh 192.168.1.100 -u admin -p password -e "USE mykeyspace; SELECT COUNT(*) FROM mytable;"

Changing Consistency Level for a Specific Query

cqlsh> CONSISTENCY LEVEL EACH_QUORUM;
cqlsh> SELECT * FROM mytable WHERE id = 1;
cqlsh> CONSISTENCY LEVEL LOCAL_QUORUM; # Reset to previous level

Gotchas

Primary Key Restrictions: You can only query directly on partition key components. For querying on other columns, you generally need to create secondary indexes or materialized views.
WHERE Clause Limitations: When querying a table, WHERE clauses must include at least the partition key. If the table has clustering columns, you can filter on them sequentially after the partition key.
Tombstones: Deleting data in Cassandra marks it for deletion (tombstones). These tombstones are not immediately removed and can impact read performance until compaction.
UPDATE vs. INSERT: In Cassandra, UPDATE and INSERT operations are often the same. If a row with the specified primary key exists, it’s updated; otherwise, it’s inserted.
Batch Limitations: Unlogged batches are not atomic. If one statement in an unlogged batch fails, others might still succeed. Logged batches are atomic but have performance implications, especially for large batches. Avoid large batches.
Data Types: Be mindful of Cassandra’s data types (e.g., timestamp, timeuuid, uuid, inet, blob, text, varchar). Mismatched types can lead to errors.
Replication Strategy: Choose your replication strategy (SimpleStrategy for single data centers/dev, NetworkTopologyStrategy for multi-data center production) carefully. It cannot be changed for an existing keyspace without recreating it.
ALLOW FILTERING: Using ALLOW FILTERING in a SELECT query allows you to filter on non-indexed columns or columns not part of the primary key. This can be very inefficient and should be used sparingly, preferably only in development or for very small datasets. It’s a sign that your data model might need adjustment.
Case Sensitivity: Table and column names are generally case-insensitive unless quoted. Keyspace names are case-insensitive. CQL keywords are case-insensitive.