Keyspaces, Tables, Columns, Primary Keys, and Data Modeling Basics
Learn how Cassandra organizes data and why primary key design is the center of good modeling.
Inside this chapter
- Keyspaces and Tables
- Creating a Keyspace
- Primary Key Structure Matters Most
- Modeling Starts with Queries
Keyspaces and Tables
A keyspace in Cassandra is the top-level namespace for tables, comparable to a database in a relational system, and it also defines how data is replicated: the replication strategy and the number of copies kept of each row. Tables hold the data. Unlike relational tables, they are usually designed around the queries the application will run rather than around normalized relationships.
Creating a Keyspace
CREATE KEYSPACE ecommerce
WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 1
};
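This basic example is fine for local learning on a single node: SimpleStrategy ignores data center layout, and a replication factor of 1 keeps only one copy of each row, so there is no redundancy. Production systems usually use the data-center-aware NetworkTopologyStrategy instead, with a replication factor set per data center.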
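For comparison, a data-center-aware keyspace might look like the sketch below. The keyspace name ecommerce_prod and the data center names dc1 and dc2 are assumptions for illustration; the names must match the data centers your cluster actually reports, and the replication factors are only an example.

```cql
-- Hypothetical keyspace and data center names: replace 'dc1' and 'dc2'
-- with the data center names configured in your own cluster.
CREATE KEYSPACE IF NOT EXISTS ecommerce_prod
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,  -- keep three copies of each row in dc1
  'dc2': 3   -- and three more in dc2
};
```

Each entry after 'class' sets the replication factor for one data center, so the number of copies can differ from one data center to another.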
Primary Key Structure Matters Most
CREATE TABLE ecommerce.user_activity (
  user_id UUID,             -- partition key
  activity_time TIMESTAMP,  -- clustering column
  activity_type TEXT,
  details TEXT,
  PRIMARY KEY ((user_id), activity_time)
) WITH CLUSTERING ORDER BY (activity_time DESC);
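In Cassandra, the primary key has two parts: the partition key and the clustering columns. The partition key determines which nodes store a row, so it controls how data is distributed across the cluster. Clustering columns determine the sort order of rows within a partition. In the table above, user_id is the partition key, so all activity for one user is stored in the same partition, and activity_time is the clustering column, so those rows are kept sorted newest first.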
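A few statements against this table can make the two roles concrete. This is a minimal sketch that assumes the user_activity table lives in the ecommerce keyspace created earlier; the UUID, timestamps, and values are made-up sample data.

```cql
-- Two rows for the same (made-up) user land in the same partition,
-- because they share the partition key user_id.
INSERT INTO ecommerce.user_activity (user_id, activity_time, activity_type, details)
VALUES (123e4567-e89b-12d3-a456-426614174000, '2024-05-01 10:15:00+0000', 'login', 'web');

INSERT INTO ecommerce.user_activity (user_id, activity_time, activity_type, details)
VALUES (123e4567-e89b-12d3-a456-426614174000, '2024-05-01 10:20:00+0000', 'view_item', 'sku-42');

-- Single-partition read: the partition key is fully specified, and rows
-- come back newest first because of CLUSTERING ORDER BY (activity_time DESC).
SELECT activity_time, activity_type, details
FROM ecommerce.user_activity
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- The clustering column supports range filters inside that partition.
SELECT activity_time, activity_type
FROM ecommerce.user_activity
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND activity_time >= '2024-05-01 10:18:00+0000';
```

Both SELECT statements name the partition key, so Cassandra can go straight to the replicas that own that user's partition instead of scanning the cluster.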
Modeling Starts with Queries
Instead of starting with entities and normalizing everything, Cassandra modeling usually starts by asking which exact queries the application must support. If a service needs to fetch recent activity by user, the table should be shaped directly for that access pattern: partition by user so the read touches a single partition, and cluster by time in descending order so the newest rows come first.
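As a rough sketch of this query-first approach, the statements below start from two queries and show the table that serves each. The first uses the user_activity table from this chapter, assumed to live in the ecommerce keyspace. The second table, its name activity_by_type_and_day, and its activity_day column are hypothetical and exist only to illustrate a different access pattern.

```cql
-- Query 1: "the 10 most recent activities for one user".
-- user_activity is already shaped for this: one partition, newest rows first.
SELECT activity_time, activity_type, details
FROM ecommerce.user_activity
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000  -- made-up sample UUID
LIMIT 10;

-- Query 2: "all activity of a given type on a given day".
-- user_activity does not serve this. The statement below is rejected unless
-- ALLOW FILTERING is added, because it does not name a partition key:
--   SELECT * FROM ecommerce.user_activity WHERE activity_type = 'login';

-- A hypothetical second table shaped for Query 2 (the same data is stored again).
CREATE TABLE IF NOT EXISTS ecommerce.activity_by_type_and_day (
  activity_type TEXT,
  activity_day DATE,
  activity_time TIMESTAMP,
  user_id UUID,
  details TEXT,
  PRIMARY KEY ((activity_type, activity_day), activity_time, user_id)
) WITH CLUSTERING ORDER BY (activity_time DESC, user_id ASC);

SELECT user_id, activity_time, details
FROM ecommerce.activity_by_type_and_day
WHERE activity_type = 'login' AND activity_day = '2024-05-01'
LIMIT 100;
```

Storing the same events in two tables is a deliberate trade: the application writes each event twice so that each query reads from a single partition.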