Powering PostgreSQL 15 With Columnar Tables

Prologue

This blog is the result of my investigation into column-wise tables. Hopefully, you will find it useful in your deliberations.

When I started writing it was meant to be a “Good News” blurb. I was pretty optimistic that I’d truly have impressive numbers to share but as it turns out, while there is a potential for significant advantage, a detailed testing regime of the CITUS columnar store extension should be carried out before implementing them into your own environment.

The missing sizzle

Sizzle: A phrase used to show affirmation or approval. It references the sound that one hears when cooking certain delicious foods i.e. bacon or fajitas and therefore transfers those feelings to non-edible objects or ideas.

There’s a lot to take in and I’m often impressed with every new version of PostgreSQL that is released. Nevertheless, there’s been a long-standing feature that, to be honest, I’ve always felt was missing. This weakness is being addressed by the industry. Ironically, some of these implementations are in fact Postgres derivatives.

What is this sizzling feature that I speak of you ask?

Column-wise tables!

When to consider column-wise tables

Consider using the CITUS Columnar Extension under one or more of the following conditions:

In regards to table attributes:
- they are wide with “many” columns
- size should at a minimum be in the tens of GB
- the typical row is byte “heavy”
System space is at a premium i.e. it’s worth your while to manage space utilization as much as possible.
OLAP is a major component of overall activity i.e. lots of different kinds of SELECT statements.
INSERT performance is not a priority.
Indexes are not feasible
Creating EXPRESSION indexes on columnar tables is faster by orders of magnitude than on heap tables.

CAVEAT: You cannot perform UPDATE, DELETE operations on a columnar table.

OLTP vs OLAP

Let’s get back to basics. In the database world there are essentially two types of database operations:

1. Online Transaction Processing, OLTP: Online transaction processing applications have high throughput and are insert- or update-intensive in database management. These applications are used concurrently by hundreds of users. The key goals of OLTP applications are availability, speed, concurrency, and recoverability.
2. Online analytical processing, OLAP: Online analytical processing applications enable users to analyze multidimensional data interactively from multiple perspectives which consist of three basic analytical operations:
  1. consolidation (roll-up)
  2. drill-down
  3. slicing and dicing

About columnar tables

As an RDBMS, PostgreSQL is geared more toward OLTP than OLAP operations. Manipulating data by inserting, updating, deleting, etc., it does well. But while eminently capable of performing OLAP, it’s not quite as efficient. The primary reason is actually a pretty common characteristic among most RDBMS i.e. it stores and processes collections of data as individual records, otherwise known as row-wise processing.

Suppose you are performing a deep-dive analysis involving a table of records containing 50 columns. And further, suppose that your query only needs a single column. It still needs to read all of the data from those 50 columns per row in order to access that single column. And if the query processes 100 million rows, that’s definitely not trivial!

Now let’s consider a reorientation of the table i.e. column-wise. In this case, the query only reads one column and not 50. The result is that the query is now much lighter, requiring less IO and processing power yet achieving the same result in significantly less time.

Introducing the CITUS columnar extension

As a General Purpose Database Management System, basic behaviors can be reconfigured in Postgres for different purposes. And as such, it is possible to enhance PostgreSQL with columnar tables by using the CITUS columnar table extension.

The CITUS columnar extension is just one part of a larger set of capabilities of this extension that when fully implemented creates a fully scalable distributed Postgres database system.

The CITUS columnar extension feature set includes:

Highly compressed tables:
- Reduces storage requirements
- Reduces the IO needed to scan the table
Projection Pushdown: Only those column(s) within the SELECT are returned further reducing IO.
Chunk Group Filtering: Allows queries to skip over whole groups of unrelated data without processing them.

The complete CITUS feature set which, except for the Columnar storage component, is not covered in this blog, includes:

Distributed tables
References tables
Distributed query engine routes and parallelizes SELECT, DML, and operations
Columnar storage
- compresses data
- speeds up scans,
- supports fast projections
Query from any node

Getting it

This is a bit of a bad-news, good-news, excellent-news situation.

BAD-NEWS: At the time of writing this blog, the only publicly available packages on the Postgres repositories that I could find were the cstore foreign data wrappers which cover only version 12 and older versions of Postgres.

GOOD-NEWS: The CITUS extension is available on the Citus Data download page.

EXCELLENT-NEWS: As part of the process of writing this blog custom, DEB and RPM packages were created for PostgreSQL version 15 for CENTOS 7, 8, and Ubuntu 18.04, 20.04 and which are available for download from github HERE.

First-time installation

Assuming you opted for the easy way, installing the packages made for this blog:

Go to the PostgreSQL download page and configure your packaging system for your Linux distribution.
Download the appropriate custom-made columnar package for your version of Linux:

# Ubuntu 18.04
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus-columnar_11.1-1-UBUNTU1804_amd64.deb # Ubuntu 20.04
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus-columnar_11.1-1-UBUNTU2004_amd64.deb # Centos 7
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus_columnar-11.1-1-CENTOS7.x86_64.rpm # Centos 8
wget https://github.com/rbernierZulu/columnar-tables-extension/blob/main/postgresql-15-citus_columnar-11.1-1-CENTOS8.x86_64.rpm

Install the package: note that all the dependent packages, including the Postgres server, will be automatically installed onto your host. Pay attention to CENTOS 7 as it requires the epel-release repository too.

# Ubuntu 18.04
apt update
apt-get install ./postgresql-15-citus-columnar_11.1-1-UBUNTU1804_amd64.deb # Ubuntu 20.04
apt update
apt-get install ./postgresql-15-citus-columnar_11.1-1-UBUNTU2004_amd64.deb # Centos 7
# ATTENTION: the epel package must be installed beforehand!
yum install epel-release
yum install ./postgresql-15-citus_columnar-11.1-1-CENTOS7.x86_64.rpm # Centos 8
dnf -qy module disable postgresql
dnf install ./postgresql-15-citus_columnar-11.1-1-CENTOS7.x86_64.rpm
dnf check-update

Initialize the postgres data cluster and configure appropriately for your distribution of Linux in order to login to postgres.

A working example of columnar tables

What follows here are the results of my analysis. Of course, there’s always more that can be said. Hopefully, this will give you enough of an overview of the possibilities of this extension.

Installing the extension into your database is pretty much standard fare:

-- create your extension in the database:
create database db01;
\c db01
create extension citus_columnar;

Here are two tables, of type HEAP and COLUMNAR, that will be used for the initial investigations. You’ll notice that it took less time to create the regular HEAP accessed table than the columnar table:

drop table if exists t1,t2; \timing -- Time: 7628.612 ms (00:07.629)
create table if not exists t1(id,qty) using heap as select (random()*10)::int, (random()*1000)::int from generate_series(1,10e6); -- Time:15635.768 ms (00:15.636)
create table if not exists t2(id,qty) using columnar as select (random()*10)::int, (random()*1000)::int from generate_series(1,10e6);

Table "public.t1"
Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+---------
id | integer | | | qty | integer | | | Access method: heap Table "public.t2"
Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+---------
id | integer | | | qty | integer |
Access method: columnar

The number of records is 10 million:

with a(row_wise) as (select count(*) from t1), b(column_wise) as (select count(*) from t2)
select row_wise,column_wise from a,b; row_wise | column_wise
----------+------------- 9900001 | 9900001

This is very cool, the columnar compression does a great job of shrinking the table:

\dt+ t[12] List of relations Schema | Name | Type | Owner | Persistence | Access method | Size |
--------+------+-------+----------+-------------+---------------+--------+ public | t1 | table | postgres | permanent | heap | 346 MB | public | t2 | table | postgres | permanent | columnar | 27 MB |

SQL statements, preliminary

Let’s begin by comparing basic administration and SELECT statements of a HEAP vs COLUMNAR table.

Examining the tabulated results you’ll see that much of the performance times indicate that columnar tables either perform, at best, similarly to that of a HEAP table but most of the time they take more time executing the same operations.

Using a psql session, the following SQL statements are executed and examined for performance differences:

SQL	Timings
— HEAP TABLE create table if not exists t1(id,qty) using heap as select (random()10)::int, (random()1000)::int from generate_series(1,10e6);	7.6s 15.6s
— COLUMNAR TABLE create table if not exists t2(id,qty) using columnar as select (random()10)::int, (random()1000)::int from generate_series(1,10e6);	7.6s 15.6s
— HEAP TABLE, adding 5 million records do $$ begin for i in 5.1e6..10e6 loop insert into t1 values((random()10)::int,(random()1000)::int); end loop; end $$;	13.7s 18.5s
— COLUMNAR TABLE, adding 5 million records do $$ begin for i in 5.1e6..10e6 loop insert into t2 values((random()10)::int,(random()1000)::int); end loop; end $$;	13.7s 18.5s
— HEAP TABLE create index on t1(id);	4.9s 7.8s
— HEAP TABLE create index on t2(id);	4.9s 7.8s

SQL statement query plans

Part one (10 million records, two column tables)

Using the aforementioned table definitions, the following metrics were generated with the runtime parameter max_parallel_workers_per_gather = 4.

It’s quite evident that, at least for these two tables, there’s no performance benefit of a columnar accessed table over a regular heap accessed one:

SQL	Timings
— HEAP TABLE explain analyze select id,qty from t1;	742.411 ms 914.096 ms
— COLUMNAR TABLE explain analyze select id,qty from t2;	742.411 ms 914.096 ms
— HEAP TABLE explain analyze select id,qty from t1 order by random();	6441.425 ms 5871.620 ms
— COLUMNAR TABLE explain analyze select id,qty from t2 order by random();	6441.425 ms 5871.620 ms
— HEAP TABLE explain analyze select sum(qty) from t1;	329.562 ms 902.614 ms
— COLUMNAR TABLE explain analyze select sum(qty) from t2;	329.562 ms 902.614 ms
— HEAP TABLE explain analyze select id,sum(qty) from t1 group by id;	531.525 ms 1602.756 ms
— COLUMNAR TABLE explain analyze select id,sum(qty) from t2 group by id;	531.525 ms 1602.756 ms

Part two (five million records, 100 column tables)

In order to get a better idea of performance differences, a second set of tables at a greater scale were created. However, this time, while the number of records was halved, the number of columns was increased from two to one hundred.

Even if most of the columns are simply copies of one another, the columnar table’s resultant compression is remarkable as the default size is reduced by a factor of 752X.

/* TABLES Table "public.t[34]" Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+---------------------------------------------------------------------------- c1 | bigint | | not null | nextval('t1_c1_seq'::regclass) c2 | bigint | | | c3 | text | | | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text c4 | text | | | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
.
.
. c98 | text | | | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text c99 | text | | | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text c100 | text | | | 'aowjfa fawjfawofjawofjawoifawevvaerarpfjkaofvaweawe[OJARGOIAJOAFWF'::text
Indexes: "t1_pkey" PRIMARY KEY, btree (c1) "t1_c2_idx" btree (c2) List of relations Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+------+-------+----------+-------------+---------------+-------+------------- public | t3 | table | postgres | permanent | heap | 67 GB | public | t4 | table | postgres | permanent | columnar | 89 MB |
*/

Examining the indexes one sees them to be the same size.

List of relations
Schema | Name | Type | Owner | Table | Persistence | Access method | Size |
-------+---------------+-------+----------+-------+-------------+---------------+--------+
public | t3_c2_idx | index | postgres | t3 | permanent | btree | 105 MB | public | t3_c2_idx1 | index | postgres | t3 | permanent | btree | 105 MB | public | t3_length_idx | index | postgres | t3 | permanent | btree | 33 MB | public | t3_pkey | index | postgres | t3 | permanent | btree | 107 MB | public | t4_c2_idx | index | postgres | t4 | permanent | btree | 105 MB | public | t4_length_idx | index | postgres | t4 | permanent | btree | 33 MB | public | t4_pkey | index | postgres | t4 | permanent | btree | 107 MB |

Unlike the first set of query plans, these ones clearly demonstrate a significant performance improvement.

Curious to see what would change in the way of performance, the varying the max_parallel_workers_per_gather doesn’t appear to have changed much.

SQL	Timings max parallel workers 4 1
— HEAP TABLE without index explain analyze select sum(c2) from t3;	9.6s 590.176ms	8.7s 596.459ms
— COLUMNAR TABLE without index explain analyze select sum(c2) from t4;
— HEAP TABLE explain analyze select count(c3) from t3;	10.4s 509.209 ms	8.8s 541.452ms
— COLUMNAR TABLE explain analyze select count(c3) from t4;
— HEAP TABLE explain analyze select max(length(c25)) from t3;	1m34s 1.1s	1m17s 1.2s
— COLUMNAR TABLE explain analyze select max(length(c25)) from t4;
— HEAP TABLE explain analyze select sum(length(c50)) from t3;	1m33s 1.2s	1m18s 1.2s
— COLUMNAR TABLE explain analyze select sum(length(c50)) from t4;

Working with indexes

General observations: Btree indexes are similar in size between HEAP and COLUMNAR tables. Overall their performance also appears similar although the columnar table’s index is somewhat slower, no doubt due to the extra processing required to uncompress the table’s values.

Regarding Expression Indexes: Creating an EXPRESSION index on COLUMNAR table is significantly faster than that of HEAP:

-- 1m17s
create index on t3(length(c90));
-- 14s
create index on t4(length(c90));

Regarding Runtime Parameter max parallel workers: Index performance varies considerably on HEAP tables depending upon the value set to max parallel workers.

The following results highlight that, depending upon the type of table used, it can become important when hardware resource and server costs are a consideration:

SQL	Timings max parallel workers 4 1
— HEAP TABLE using BTREE index explain analyze select sum(c2) from t3;	467.789ms 561.522 ms	748.939ms 599.629ms
— COLUMNAR TABLE using BTREE index explain analyze select sum(c2) from t4;
— HEAP TABLE using EXPRESSION index explain analyze select max(length(c90)) from t3;	1.614ms 31.980ms	2.346ms 38.766ms
— COLUMNAR TABLE using EXPRESSION index explain analyze select max(length(c90)) from t4;

About table constraints and access methods

Overall, indexes, constraints, and access methods are still evolving with many of the features still to be implemented.

Let’s start with a big issue i.e. DELETE and UPDATE:

-- fails
delete from t2 where id=5; ERROR: UPDATE and CTID scans not supported for ColumnarScan -- fails
update t2 set id=5; ERROR: UPDATE and CTID scans not supported for ColumnarScan

Creating indexes on a columnar table is restricted to btree indexes:

-- works
create index on t2 using btree (id); -- fails
create index on t2 using columnar (id);
ERROR: unsupported access method for the index on columnar table t2

Creating foreign key constraints aren’t implemented:

select generate_series as id into t3 from generate_series(0,15);
alter table t3 add primary key(id); -- works for our standard table t1
alter table t1 add foreign key(id) references t3(id); -- fails with the columnar table t2
alter table t2 add foreign key(id) references t3(generate_series);
ERROR: Foreign keys and AFTER ROW triggers are not supported for columnar tables
HINT: Consider an AFTER STATEMENT trigger instead. --works after converting table t1 from COLUMNAR to HEAP
alter table t2 set access method heap;
alter table t2 add foreign key(id) references t3(generate_series);
alter table t2 set access method columnar;

Partitioning

Columnar tables can be used as partitions; a partitioned table can be made up of any combination of row and columnar partitions.

An excellent use case is INSERT once and READ only table partitions where one can leverage both its compression and better performing OLAP type queries for very large tables.

Caveat

Columnar Extension Limitations, as of version 11.1:

It takes more time to create the table than standard heap access-based tables
The query performance is equal or slower with smallish tables compared to heap-based tables.
There is no update/delete possible in a columnar table.
The indexes are limited to btree, as of version 10.0.
There is no logical replication.

There’s actually more documented. Refer here for more information.

AUTHOR’S NOTE: In regard to the custom packages created for this blog. The entire citus suite is designed to enhance Postgres as a distributed database solution. Only the columnar table component was covered in this blog. The complete feature set is part of this package install and should work but has not been tested.

Conclusion

Despite its current limitations, there are use cases where this extension can definitely make a difference. And it speaks well of its future as the team continues development by constantly improving and adding capabilities. Watch for updates on its GitHub source repository.

Categories