Understanding and Implementing Declarative Table Partitioning in PostgreSQL 14+

Explore PostgreSQL's native declarative partitioning feature in versions 14+. This guide details range, list, and hash partitioning types, offering practical SQL examples for creating and managing partitioned tables. Learn how to optimize query performance and simplify data management for very large datasets by leveraging partition pruning and efficient maintenance strategies.

Understanding and Implementing Declarative Table Partitioning in PostgreSQL 14+

PostgreSQL partitioning is worth considering when one table has become awkward to query, vacuum, archive, or delete from. The usual example is an events table that receives millions of rows per day and is almost always queried by time range. Without partitioning, even good indexes can leave you with a table that is expensive to maintain and painful to age out.

Declarative partitioning lets one logical table route rows into smaller physical tables called partitions. PostgreSQL 10 introduced the native syntax, and later versions improved planning, pruning, indexing, and maintenance behavior. PostgreSQL 14+ is mature enough that many teams can use partitioning without trigger-based inheritance schemes, but it still rewards careful design. A bad partition key can make the system more complicated without making it faster.

What is Declarative Table Partitioning?

Declarative partitioning is a database feature that enables you to split a single logical table (the parent or partitioned table) into multiple physical tables (child or partition tables) based on a defined set of rules. Each partition holds a subset of the data from the parent table. The partitioning key determines which partition a row belongs to.

Key benefits of declarative partitioning include:

  • Improved Query Performance: Queries that filter on the partitioning key can be faster because PostgreSQL can prune partitions that cannot contain matching rows.
  • Easier Data Management: Operations like deleting old data or archiving can be performed much more efficiently by detaching or dropping individual partitions rather than performing massive DELETE operations on a single large table.
  • Simplified Maintenance: Indexing and vacuuming can be managed on a per-partition basis, reducing the impact on the entire table.
  • Smaller Maintenance Units: Partition-level indexes, detach operations, and targeted vacuum work can reduce the blast radius of routine maintenance.

Types of Declarative Partitioning

PostgreSQL supports several methods for declarative partitioning, each suited to different data distribution patterns:

1. Range Partitioning

Range partitioning divides data based on a continuous range of values in a specific column (e.g., dates, numbers).

Use Case: Ideal for time-series data, such as logs, event data, or sales records, where you frequently query data within specific date or numerical ranges.

Example: Partitioning a sales table by the sale_date column.

Creating a Range Partitioned Table

First, create the parent table, specifying the partitioning method and key:

CREATE TABLE sales (
    sale_id SERIAL,
    product_name VARCHAR(100),
    sale_amount NUMERIC(10, 2),
    sale_date DATE NOT NULL
)
PARTITION BY RANGE (sale_date);

Next, create the individual partitions. Each partition is defined with a FOR VALUES clause specifying the range it will contain.

-- Partition for sales in January 2023.
-- The upper bound is exclusive, so this includes January 31.
CREATE TABLE sales_2023_01
    PARTITION OF sales ()
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');

-- Partition for sales in February 2023
CREATE TABLE sales_2023_02
    PARTITION OF sales ()
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');

-- Partition for sales in March 2023
CREATE TABLE sales_2023_03
    PARTITION OF sales ()
    FOR VALUES FROM ('2023-03-01') TO ('2023-04-01');

Tip: When defining ranges, ensure they are contiguous and cover all possible values. Avoid overlapping ranges. The TO value is exclusive.

2. List Partitioning

List partitioning divides data based on a discrete list of values in a column.

Use Case: Suitable for columns with a fixed, known set of values, such as geographical regions, status codes, or product categories.

Example: Partitioning a orders table by the region column.

Creating a List Partitioned Table

Define the parent table with PARTITION BY LIST:

CREATE TABLE orders (
    order_id SERIAL,
    customer_name VARCHAR(100),
    order_total NUMERIC(10, 2),
    region VARCHAR(50) NOT NULL
)
PARTITION BY LIST (region);

Create partitions for specific regions:

-- Partition for orders in 'North America'
CREATE TABLE orders_north_america
    PARTITION OF orders ()
    FOR VALUES IN ('North America');

-- Partition for orders in 'Europe'
CREATE TABLE orders_europe
    PARTITION OF orders ()
    FOR VALUES IN ('Europe');

-- Partition for orders in 'Asia'
CREATE TABLE orders_asia
    PARTITION OF orders ()
    FOR VALUES IN ('Asia');

Important: If you insert a value for region that doesn't match any existing partition's IN list and there's no DEFAULT partition, the insert will fail. You can create a DEFAULT partition to catch all other values.

Creating a Default Partition

-- Default partition for any region not explicitly listed
CREATE TABLE orders_other
    PARTITION OF orders ()
    DEFAULT;

3. Hash Partitioning

Hash partitioning distributes data across a number of partitions based on a hash value of the partitioning key.

Use Case: Useful when you have a large volume of data and want to distribute it evenly across partitions without a clear range or list-based distribution. It's good for load balancing.

Example: Partitioning a users table by user_id.

Creating a Hash Partitioned Table

Define the parent table with PARTITION BY HASH and specify the number of partitions:

CREATE TABLE users (
    user_id BIGSERIAL,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100)
)
PARTITION BY HASH (user_id);

PostgreSQL will automatically create partitions for you if you don't specify them, but it's generally recommended to create them explicitly, especially when you want control over the number and naming of partitions.

Creating Explicit Hash Partitions

-- Create 4 hash partitions
CREATE TABLE users_p0
    PARTITION OF users
    FOR VALUES WITH (modulus 4, remainder 0);

CREATE TABLE users_p1
    PARTITION OF users
    FOR VALUES WITH (modulus 4, remainder 1);

CREATE TABLE users_p2
    PARTITION OF users
    FOR VALUES WITH (modulus 4, remainder 2);

CREATE TABLE users_p3
    PARTITION OF users
    FOR VALUES WITH (modulus 4, remainder 3);

Note: When using hash partitioning, you need to specify the modulus (the total number of partitions) and the remainder (which partition this one is).

Implementing Declarative Partitioning: Best Practices

  • Choose the Right Partitioning Key: The partitioning key should align with your most frequent query filters and data management operations. A good key significantly improves performance.
  • Consider the Number of Partitions: Too few partitions might not provide enough benefit, while too many can increase overhead. Aim for a number that balances manageability and performance. For range partitioning, consider your data growth rate and retention policies.
  • Use pg_partman for Automation: For range partitioning, especially with time-series data, consider using extensions like pg_partman. It automates the creation of new partitions and the archival/dropping of old ones, significantly reducing manual effort.
  • Index Strategically: Indexes are physically stored per partition. Creating an index on the parent creates matching partition indexes, but you should still check whether each partition needs the same index pattern.
  • Partition Pruning: Ensure your queries are written to leverage partition pruning by including the partitioning key in WHERE clauses. The EXPLAIN command can show if pruning is occurring.
  • DEFAULT Partitions: For list partitioning, a DEFAULT partition is crucial to avoid insertion errors if new values appear unexpectedly.
  • Unique Constraints: A unique constraint or primary key on a partitioned table generally must include all partition key columns. This catches many first-time designs by surprise.
  • Data Types: Ensure the data type of the partitioning key is appropriate and consistent across the parent and child tables.

Managing Partitions

Attaching and Detaching Partitions

While partitions are created directly via CREATE TABLE ... PARTITION OF ..., you can also detach and attach existing tables as partitions. This is useful for migrating data or managing large datasets.

Detaching a Partition: Detaching turns the partition into a regular table while keeping its data.

-- Detach the sales_2023_01 partition
ALTER TABLE sales DETACH PARTITION sales_2023_01;

Attaching a Table as a Partition: You can attach a regular table that conforms to the parent's schema and has data that fits the partition bounds.

-- Assume sales_2022_12 is a regular table with the same columns as sales
-- and only rows from December 2022.
ALTER TABLE sales ATTACH PARTITION sales_2022_12
    FOR VALUES FROM ('2022-12-01') TO ('2023-01-01');

Before attaching a large table, add a matching CHECK constraint first. PostgreSQL can use that constraint to avoid scanning the whole table to prove the rows fit the partition bounds.

Dropping Partitions

Dropping a partition is a fast operation as it only removes the partition table, not the data within it (unless explicitly specified). This is much faster than DELETE.

-- To drop a partition, you can simply drop the child table
DROP TABLE sales_2023_01;

Example: Improving Query Performance with Partition Pruning

Consider the sales table partitioned by sale_date as shown earlier.

Query without partition pruning (hypothetical on a non-partitioned table):

SELECT SUM(sale_amount)
FROM sales
WHERE sale_date >= '2023-01-15' AND sale_date < '2023-01-20';

If sales were a massive, unpartitioned table, this query would scan the entire table. However, with declarative partitioning:

-- This query will only scan the sales_2023_01 partition
SELECT SUM(sale_amount)
FROM sales
WHERE sale_date >= '2023-01-15' AND sale_date < '2023-01-20';

PostgreSQL's query planner recognizes that sale_date is the partitioning key and that the specified range falls entirely within the sales_2023_01 partition. It will therefore only scan that partition, drastically reducing I/O and improving performance.

To verify this, use EXPLAIN:

EXPLAIN SELECT SUM(sale_amount) FROM sales WHERE sale_date >= '2023-01-15' AND sale_date < '2023-01-20';

The output should show only the relevant partition, or it may show removed subplans depending on the PostgreSQL version and plan shape. The important sign is that unrelated partitions are not scanned.

A Practical Design Checklist

Partition only when you can name the operational win. "The table is large" is not enough by itself. A large table with well-indexed point lookups may be fine. Partitioning makes more sense when most queries include the partition key, when old data is regularly archived or dropped, or when maintenance on one huge table is already causing pain.

For time-series tables, choose partition sizes that match your query and retention patterns. Daily partitions are useful for very high ingest and short retention. Monthly partitions are often easier to manage for moderate event volume. Too many tiny partitions can slow planning and make maintenance noisy; too few giant partitions may not solve the original problem.

Plan inserts before you ship. If rows can arrive late, keep older partitions available long enough to receive them. If the partition key can contain unexpected values, create a DEFAULT partition and monitor it. A default partition should be a safety net, not a place where forgotten data quietly accumulates for months.

Finally, test with real query shapes. Partition pruning works best when the WHERE clause exposes the partition key plainly, such as sale_date >= '2023-01-01' AND sale_date < '2023-02-01'. Wrapping the key in functions can make pruning harder:

-- Less friendly to pruning
WHERE date_trunc('month', sale_date) = DATE '2023-01-01';

-- Easier for the planner
WHERE sale_date >= DATE '2023-01-01'
  AND sale_date <  DATE '2023-02-01';

Declarative partitioning is a maintenance tool as much as a query tool. Used well, it makes old data cheap to remove and hot data easier to scan. Used casually, it adds more tables, more indexes, and more edge cases. Start with the access pattern, choose the partition key from that pattern, and verify the plan before calling the design finished.

For an existing large table, do not plan a risky one-shot conversion during peak traffic. A common migration path is to create a new partitioned table, copy data in chunks, keep new writes flowing through application logic or a carefully tested trigger, then swap names during a short maintenance window. The exact approach depends on write volume and downtime tolerance, but the principle is the same: prove the copy, prove the constraints, and rehearse the cutover before touching production.