PostgreSQL 18.3 (released Feb 26, 2026) keeps the same multi-process, shared-memory architecture it has carried for decades. PG 18 adds two meaningful pieces on top of that foundation: I/O Workers (async read pool) and AIO Submission Rings in shared memory. Everything else is the same model, just better tuned.
Here’s a map of all the moving parts.
Architecture at a glance
Components marked PG 18 are new in PostgreSQL 18.
Query lifecycle
How a query travels through the system, end to end:
- Client connects over TCP :5432 or a unix socket, speaking the PostgreSQL wire protocol (via libpq or a native driver)
- Postmaster accepts the connection and fork()s a dedicated backend process, which then authenticates the client against pg_hba.conf
- Backend runs the query pipeline: Parser → Analyzer/Rewriter → Planner/Optimizer → Executor
- Executor hits Shared Buffers; on a cache miss the page is read from disk, asynchronously via I/O Workers on the read paths that support it (PG 18)
- WAL record written first: every modification is logged in WAL Buffers, and that WAL must reach disk before the modified page can
- WAL is flushed to pg_wal/ at commit (by the committing backend) and on an interval (by the WAL Writer), guaranteeing durability
- Checkpointer + Background Writer flush dirty pages to base/, at checkpoints and in between
Clients
Any driver speaking PostgreSQL’s wire protocol. Common ones:
| Driver | Language |
|---|---|
| psql / libpq | C / shell |
| pgjdbc | Java |
| psycopg3 | Python |
| asyncpg | Python (async) |
| pgx | Go |
| node-postgres | Node.js |
| SQLx | Rust |
All connect over TCP, port 5432 by default, or a unix socket. (Port 5433 usually means a second cluster on the same host.)
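As a small illustration of the two transport options, the sketch below builds libpq-style connection URIs for a TCP host and for a unix socket directory. The helper name `pg_uri`, the user `alice`, the database `app`, and the socket path are all hypothetical; only the URI shapes follow libpq's documented format.

```python
from urllib.parse import quote


def pg_uri(dbname: str, user: str, host: str = "localhost", port: int = 5432) -> str:
    """Build a libpq-style connection URI (illustrative helper, not a driver API)."""
    if host.startswith("/"):
        # A host that starts with "/" is a unix socket directory; libpq accepts
        # it as a query parameter when the authority part is left empty.
        return (
            f"postgresql:///{quote(dbname)}"
            f"?host={quote(host)}&port={port}&user={quote(user)}"
        )
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(dbname)}"


tcp_uri = pg_uri("app", "alice")
socket_uri = pg_uri("app", "alice", host="/var/run/postgresql")
print(tcp_uri)
print(socket_uri)
```

Most of the drivers in the table accept a URI of this form, though each has its own preferred way of passing connection settings.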
Postmaster — the supervisor
The original postgres process. It:
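- Loads pg_hba.conf at startup and on reload; the authentication exchange itself runs in the newly forked backend
- fork()s a dedicated backend process per connection
- Spawns and restarts every auxiliary background process
If a backend exits cleanly, only that connection goes away. If a backend or another process attached to shared memory crashes, the postmaster can no longer trust shared memory: it terminates the remaining backends, reinitializes shared memory, and runs crash recovery before accepting connections again. Processes that are not attached to shared memory, such as the logger, are simply restarted.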
Backend — the query pipeline
One process per client connection. SQL flows through four stages:
- Parser: tokenizes SQL text and produces a parse tree
- Analyzer / Rewriter: resolves object names against the catalog, expands views and rules
- Planner / Optimizer: cost-based plan selection using pg_statistic, with GEQO for large join sets
- Executor: walks the plan tree and returns tuples; may spawn parallel workers up to max_parallel_workers_per_gather
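For ordinary table and index access, the backend never reads or writes data files behind the cache's back: every page goes through Shared Buffers. The exceptions are temporary tables, which use backend-local buffers, and sorts or hashes that exceed work_mem and spill to temporary files.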
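To make "cost-based plan selection" concrete, here is a minimal sketch of how a sequential scan might be costed, assuming the stock `seq_page_cost = 1.0` and `cpu_tuple_cost = 0.01` settings and a scan with no filter. The function names and the competing index-scan figure are illustrative assumptions; the real planner weighs many more terms.

```python
# Toy model of the planner's sequential-scan estimate:
#   cost = pages * seq_page_cost + rows * cpu_tuple_cost
SEQ_PAGE_COST = 1.0    # assumed default planner setting
CPU_TUPLE_COST = 0.01  # assumed default planner setting


def seq_scan_cost(pages: int, rows: int,
                  seq_page_cost: float = SEQ_PAGE_COST,
                  cpu_tuple_cost: float = CPU_TUPLE_COST) -> float:
    """Estimated total cost of scanning every page and every row once."""
    return pages * seq_page_cost + rows * cpu_tuple_cost


def cheapest(plans: dict[str, float]) -> str:
    """Return the name of the candidate plan with the lowest estimated cost."""
    return min(plans, key=plans.get)


# Hypothetical table: 358 pages holding 10,000 rows.
candidates = {
    "seq_scan": seq_scan_cost(pages=358, rows=10_000),
    "index_scan": 1200.0,  # made-up figure for a competing plan
}
print(cheapest(candidates), candidates)
```

The numbers are unitless: they only matter relative to each other, which is why the planner can compare very different plan shapes with one scale.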
Shared Memory
Allocated by the Postmaster at startup. Every backend attaches to the same region.
Shared Buffers
PostgreSQL’s page cache — heap pages and index pages all live here.
shared_buffers = 128MB # default; rule of thumb is ~25% of system RAM
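The 25% rule of thumb is simple arithmetic, sketched below. The helper name and the choice to never suggest less than the 128MB default are assumptions made for illustration; treat the result as a starting point for tuning, not a recommendation.

```python
DEFAULT_SHARED_BUFFERS_MB = 128


def suggest_shared_buffers_mb(total_ram_mb: int, fraction: float = 0.25) -> int:
    """Apply the ~25%-of-RAM rule of thumb, never going below the default."""
    return max(DEFAULT_SHARED_BUFFERS_MB, int(total_ram_mb * fraction))


def as_conf_line(megabytes: int) -> str:
    """Render the value as a postgresql.conf line."""
    return f"shared_buffers = {megabytes}MB"


print(as_conf_line(suggest_shared_buffers_mb(16 * 1024)))  # a 16 GB host
```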
WAL Buffers
An in-memory ring of WAL records waiting to be flushed.
wal_buffers = -1 # default; auto-sized to 1/32 of shared_buffers, between 64kB and one WAL segment (16MB)
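The auto-tuning is easy to reproduce by hand. The sketch below assumes the documented behaviour of the automatic setting: 1/32 of shared_buffers, clamped between 64kB and one 16MB WAL segment. The function name is illustrative.

```python
KB = 1024
MB = 1024 * KB


def auto_wal_buffers(shared_buffers_bytes: int,
                     wal_segment_bytes: int = 16 * MB) -> int:
    """Size chosen when wal_buffers is left on automatic (simplified model)."""
    size = shared_buffers_bytes // 32
    # Clamp: at least 64kB, at most one WAL segment.
    return max(64 * KB, min(size, wal_segment_bytes))


for shared in (128 * MB, 1024 * MB, 8 * 1024 * MB):
    print(f"shared_buffers={shared // MB}MB -> wal_buffers={auto_wal_buffers(shared) // KB}kB")
```

With the default 128MB of shared buffers this works out to 4MB; the 16MB ceiling is only reached once shared_buffers is 512MB or more.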
CLOG · pg_xact cache
Commit/abort status for every transaction ID. The on-disk version lives in pg_xact/.
ProcArray
Tracks all active transactions. Each snapshot is built by scanning ProcArray for the transaction IDs still in progress; MVCC visibility checks then compare a row version's xmin and xmax against that snapshot and the commit status in CLOG.
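A heavily simplified sketch of that visibility decision follows. It assumes a snapshot made of `xmin`, `xmax`, and the set of in-progress transaction IDs, and it uses a plain Python set as a stand-in for the commit log. Transaction ID wraparound, subtransactions, and a transaction seeing its own changes are all ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int                 # every xid below this had finished at snapshot time
    xmax: int                 # first xid not yet assigned at snapshot time
    xip: frozenset = frozenset()  # xids still in progress at snapshot time


def take_snapshot(running_xids, next_xid: int) -> Snapshot:
    """Build a snapshot from the list of running transactions (ProcArray stand-in)."""
    running = frozenset(running_xids)
    xmin = min(running) if running else next_xid
    return Snapshot(xmin=xmin, xmax=next_xid, xip=running)


def xid_visible(xid: int, snap: Snapshot, committed: set) -> bool:
    """Would changes made by transaction `xid` be visible to this snapshot?"""
    if xid >= snap.xmax:
        return False          # started after the snapshot was taken
    if xid in snap.xip:
        return False          # was still running when the snapshot was taken
    return xid in committed   # finished earlier: visible only if it committed


snap = take_snapshot(running_xids=[101, 103], next_xid=105)
committed = {100, 102, 103}   # 103 committed later, after the snapshot
for xid in (99, 100, 102, 103, 105):
    print(xid, xid_visible(xid, snap, committed))
```

Transaction 103 illustrates the point of the snapshot: it has committed by the time the row is examined, yet it stays invisible because it was still running when the snapshot was taken.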
Lock Tables
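Stores heavyweight locks (table-level and advisory locks, plus the transient entries used while a backend waits on a row). Row locks themselves are recorded in the tuple header on the page, not here. The structures are protected by LWLocks, the lightweight shared/exclusive locks PostgreSQL uses internally; unlike spinlocks, an LWLock waiter sleeps rather than busy-loops.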
SLRU Caches
Simple LRU (SLRU) buffers for subtransaction data, multixact, async NOTIFY, and serializable transaction tracking. The CLOG cache above is itself an SLRU.
AIO Submission Rings ★ PG 18
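New in PG 18. The in-memory side of the new async I/O subsystem: shared-memory structures through which backends queue async read requests for the I/O Workers pool. They are allocated once at postmaster startup, so their size is fixed for the life of the server. With io_method = io_uring, each backend instead submits through its own io_uring instance, also set up at startup.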
Cumulative Stats
Since PG 15, stats live directly in shared memory — no separate stats collector process, no UDP socket. The “stats collector” box that appeared on every legacy architecture diagram is gone.
Background Processes
All spawned and supervised by the Postmaster. They all read from and write to PGDATA.
Checkpointer
Flushes all dirty pages from Shared Buffers to disk at each checkpoint. Checkpoints are triggered by checkpoint_timeout (default 5 min) or when the WAL written since the last checkpoint approaches max_wal_size. This bounds crash recovery time: on restart, PostgreSQL only needs to replay WAL from the last checkpoint forward.
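The two triggers can be sketched as a single check. This is a deliberate simplification: the function name is made up, the defaults are the stock `checkpoint_timeout` (5 min) and `max_wal_size` (1GB), and the real server derives a somewhat lower WAL threshold from max_wal_size so that the checkpoint finishes before the limit is hit.

```python
from typing import Optional

GB = 1024 ** 3


def checkpoint_due(seconds_since_last: float,
                   wal_bytes_since_last: int,
                   checkpoint_timeout_s: int = 300,
                   max_wal_size_bytes: int = 1 * GB) -> Optional[str]:
    """Return why a checkpoint should start now, or None if neither trigger fired."""
    if seconds_since_last >= checkpoint_timeout_s:
        return "timeout"
    if wal_bytes_since_last >= max_wal_size_bytes:
        return "wal"
    return None


print(checkpoint_due(120, 200 * 1024 ** 2))   # quiet system: nothing due
print(checkpoint_due(120, 2 * GB))            # write burst: WAL-driven
print(checkpoint_due(301, 0))                 # idle system: time-driven
```

A server that mostly reports WAL-driven checkpoints is usually a hint that max_wal_size is too small for its write rate.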
Background Writer
Trickles dirty pages to disk continuously between checkpoints, so backends can usually find a clean buffer to evict without having to write one out themselves. This smooths out the I/O curve and prevents the checkpoint from being a cliff.
WAL Writer
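Flushes WAL Buffers to pg_wal/ periodically (every wal_writer_delay, 200 ms by default), which also covers asynchronous commits. On an ordinary synchronous commit, the committing backend flushes its own WAL before reporting success. Either way, this is what makes durability work: once a WAL record hits disk, the transaction is durable even if the data page hasn't been written yet.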
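The durability argument can be demonstrated with a toy model. Everything below is an illustrative assumption (class name, page keys, the single-value "pages"); it only shows the ordering rule: changes are logged, the log is flushed at commit, and recovery rebuilds pages from the flushed log even though no data page was ever written.

```python
class ToyStorage:
    """Toy write-ahead log: not PostgreSQL's format, just the ordering rule."""

    def __init__(self):
        self.wal_buffer = []   # WAL records still only in memory
        self.wal_disk = []     # WAL records flushed to "pg_wal/"
        self.pages_mem = {}    # stand-in for Shared Buffers
        self.pages_disk = {}   # stand-in for data files under base/

    def modify(self, page: str, value: str) -> None:
        # Log the change and update the in-memory page; disk is untouched.
        self.wal_buffer.append((page, value))
        self.pages_mem[page] = value

    def commit(self) -> None:
        # Commit is durable once the buffered WAL has been flushed.
        self.wal_disk.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def crash_and_recover(self) -> dict:
        # A crash loses everything in memory; recovery replays flushed WAL
        # on top of whatever the data files contained.
        self.wal_buffer.clear()
        self.pages_mem = dict(self.pages_disk)
        for page, value in self.wal_disk:
            self.pages_mem[page] = value
        return self.pages_mem


db = ToyStorage()
db.modify("p1", "committed row")
db.commit()
db.modify("p2", "never committed")
print(db.crash_and_recover())
```

Only the change whose WAL reached disk survives the crash, and it survives without any data page having been flushed.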
Autovacuum Launcher
Spawns per-database worker processes to run VACUUM, ANALYZE, and transaction ID freeze. Critical for MVCC health. Bloat and xid wraparound are both real failure modes — autovacuum is not optional in production.
I/O Workers ★ PG 18
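The new async read pool. On the read paths that use PG 18's async I/O (sequential scans, bitmap heap scans, and VACUUM among them), a backend that misses in Shared Buffers no longer blocks on the read itself; it hands the request to an I/O Worker via the AIO Submission Ring.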
io_method = worker # default, cross-platform
# io_method = io_uring # Linux 5.1+, lowest syscall overhead
# io_method = sync # PG 17 behavior, blocks the backend
io_workers = 3 # default
Early benchmarks report 2–3× read throughput improvements on cloud storage with io_uring. The worker method is the safe default: it delivers much of the same benefit without the kernel version requirement.
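The hand-off pattern behind `io_method = worker` can be sketched with a queue and a small pool. This is only an analogy: PostgreSQL uses separate processes and shared memory, while the sketch uses Python threads and an in-process queue, and `run_io_workers` and `read_page` are hypothetical names.

```python
import queue
import threading


def run_io_workers(requests, read_page, io_workers: int = 3) -> dict:
    """Hand block reads to a fixed pool instead of reading in the caller."""
    submissions = queue.Queue()   # stand-in for the submission queue
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            block = submissions.get()
            if block is None:     # sentinel: no more work for this worker
                break
            data = read_page(block)
            with results_lock:
                results[block] = data

    pool = [threading.Thread(target=worker) for _ in range(io_workers)]
    for t in pool:
        t.start()
    for block in requests:        # the "backend" only enqueues and moves on
        submissions.put(block)
    for _ in pool:
        submissions.put(None)
    for t in pool:
        t.join()
    return results


pages = run_io_workers(range(8), read_page=lambda b: f"page-{b}")
print(sorted(pages))
```

The point of the pattern is that the submitter can queue several reads before it needs any of them, so storage latency overlaps instead of adding up.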
Logical Replication Launcher
Starts an apply worker process for each enabled subscription created with CREATE SUBSCRIPTION. Decoding happens on the publisher, inside its WAL Senders; the apply workers receive the already-decoded changes and replay them locally.
Archiver
When archive_mode = on, copies completed WAL segments to wherever archive_command points: S3, NFS, a remote host. Required for PITR (Point-in-Time Recovery); replicas built from a base backup can also read the archive to catch up on WAL the primary has already recycled.
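An archive_command is a shell template in which `%p` stands for the path of the segment to archive, `%f` for its file name, and `%%` for a literal percent sign. The sketch below expands such a template the way the server would before running it; the function name and the `/mnt/archive` destination are assumptions for illustration.

```python
import re


def expand_archive_command(template: str, wal_path: str, wal_file: str) -> str:
    """Substitute %p, %f and %% in an archive_command template (single pass)."""
    mapping = {"%p": wal_path, "%f": wal_file, "%%": "%"}
    return re.sub(r"%[pf%]", lambda m: mapping[m.group(0)], template)


segment = "000000010000000000000001"
command = expand_archive_command(
    "test ! -f /mnt/archive/%f && cp %p /mnt/archive/%f",
    wal_path=f"pg_wal/{segment}",
    wal_file=segment,
)
print(command)
```

The `test ! -f` guard in the template refuses to overwrite a segment that is already in the archive, which is the behaviour an archive command is expected to have.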
WAL Sender(s)
One process per connected replica or logical subscriber. Reads WAL from pg_wal/ and streams it over the replication connection.
WAL Receiver
Runs on the standby. Receives the WAL stream from the primary's WAL Sender and writes it to the standby's pg_wal/, where the Startup Process picks it up for replay.
Startup Process
Replays WAL after a crash (runs once, exits when caught up) or continuously on a standby (keeps replaying as new WAL arrives). This is the same code path for both crash recovery and streaming replication.
Logger (syslogger)
Collects stderr from all processes and writes it to log files when logging_collector = on. Without it, each process writes to stderr directly.
PGDATA — on-disk layout
$PGDATA/
├── base/ # heap and index files — 8 KB pages
├── pg_wal/ # WAL segments — 16 MB each
├── pg_xact/ # commit log on disk (CLOG)
├── pg_tblspc/ # symlinks to tablespace directories
├── pg_logical/ # logical decoding state and snapshots
├── pg_replslot/ # replication slot state
├── pg_multixact/ # multi-transaction data (row-level locking)
├── pg_stat/ # persistent stats written at shutdown
├── postgresql.conf # GUC settings
├── pg_hba.conf # client auth rules
├── pg_ident.conf # ident mapping
└── postmaster.pid # PID file, removed on clean shutdown
Tablespaces let you put table and index data on a different volume: CREATE TABLESPACE adds a symlink under pg_tblspc/ that points at the external directory. Everything else stays under $PGDATA.
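The file names under pg_wal/ follow directly from the 16 MB segment size. As a sketch, assuming the default segment size and timeline 1, the segment that contains a given LSN can be computed like this (the function names are illustrative):

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default segment size in bytes


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' in hex into a 64-bit byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(lsn: int, timeline: int = 1,
                  seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the segment file holding the byte at `lsn`."""
    segno = lsn // seg_size
    segs_per_xlogid = 0x1_0000_0000 // seg_size   # 256 with 16 MB segments
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


for lsn_text in ("0/16B3748", "1/0"):
    print(lsn_text, "->", wal_file_name(parse_lsn(lsn_text)))
```

The name is three 8-digit hex fields: the timeline, then the high and low parts of the segment number.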
Archive (off-cluster)
Completed WAL segments are copied out by the Archiver to whatever archive_command points to, typically S3 or NFS. This off-cluster copy is what enables PITR, and it lets a replica bootstrapped from a base backup fetch WAL that is no longer in the primary's pg_wal/.
What’s new in PG 18
| Feature | Where | Details |
|---|---|---|
| I/O Workers | Background Processes | Async read pool; replaces blocking per-backend reads |
| AIO Submission Rings | Shared Memory | Shared-memory queues through which backends hand reads to I/O Workers |
| io_method = io_uring | GUC | Linux 5.1+; lowest-overhead async I/O path |
The cumulative stats in shared memory (noted above) landed in PG 15 — it’s not new in 18, but it’s still missing from a lot of architecture diagrams floating around.