Production Checklist
Before launching pgStack in production, review the following checklist.
Security
Secrets
-
POSTGRES_PASSWORDis a strong random value generated withopenssl rand -base64 32 -
JWT_SECRETis 32+ characters, generated randomly -
JWT_SECRETdoes not containdev-only— withPGSTACK_ENV=production(set bydocker-compose.prod.yml) the proxy refuses to start if it does; outside production it only warns -
ANON_KEYis a strong random value -
SERVICE_ROLE_KEYis a strong random value — treat it like a database root password -
.envfile is in.gitignoreand never committed to version control -
SITE_URLis set to your production URL (required for OAuth redirects and email links) - No development defaults (
dev-only-*) remain in your production environment
Network
- PostgreSQL port
5432is NOT exposed to the internet (only accessible within Docker network) - pgStack proxy port
8080is behind a reverse proxy (Caddy, nginx, Traefik) - HTTPS is enforced — HTTP redirects to HTTPS
-
ALLOWED_ORIGINSis set to your specific domain(s), not empty -
ALLOW_PERMISSIVE_CORSisfalseor unset (strict CORS by default since 2026-05) - Studio (
/studio/) is restricted to trusted IPs or protected with additional auth - Configure
TLS_MODE=autofor automatic HTTPS, or use a reverse proxy withTLS_MODE=off(default) -
TRUSTED_PROXIESis set to the IP(s) of your reverse proxy — prevents rate limit bypass viaX-Forwarded-Forspoofing - HSTS is automatically added when TLS is active (via
TLS_MODE=autoorTLS_MODE=manual) - WebSocket connections validate Origin header against
ALLOWED_ORIGINS -
REQUIRE_AUTHENTICATED_WS=true— required in production: thePGSTACK_ENV=productionvalidator refuses to start withfalse, anddocker-compose.prod.ymlsets it totrue(rejects anon-key WS subscriptions; requires JWT)
Database
-
POSTGRES_HOST_AUTH_METHODisscram-sha-256(nottrust) -
ENABLE_SQL_ENDPOINTisfalse -
ENABLE_DEMO_ENDPOINTisfalseor unset -
SECRETS_ENCRYPTION_KEYis set to a strong random 32-byte value if any of these are in use: edge function env vars, webhook signing secrets. Without it,CreateWebhookrefuses non-empty secrets and existing encrypted rows are dropped from the delivery cache (fail-closed). - Existing deployments have run
migrations/20260515210000_hash_tokens_encrypt_envvars.sql(refresh + email tokens are stored as SHA-256 hashes; env vars are AES-256-GCM encrypted) andmigrations/20260516120000_encrypt_webhook_secrets.sql(webhook signing secrets moved from plaintext tosecret_encrypted— webhooks must be re-created with new secrets after the migration) - RLS is enabled on all user-facing tables
-
anonandauthenticatedroles have only the minimum necessary permissions - Internal auth tables (
pgstack.refresh_tokens,pgstack.oauth_sessions,pgstack.email_tokens) have RLS enabled
Reliability
Backups
- Automated daily backups configured (pg_dump to S3, Backblaze B2, etc.)
- Backup restoration tested at least once
- Backup retention policy defined (e.g., keep 7 daily, 4 weekly)
Health checks
The production compose file includes health checks. Verify they work:
docker compose -f docker-compose.prod.yml ps
# Both services should show "healthy"
# Proxy health endpoint
curl https://your-app.example.com/health
# {"status":"ok","pg_connected":true}
Container restart policy
The production compose file sets restart: unless-stopped. Verify containers automatically restart after a reboot:
# Reboot test
sudo reboot
# After reboot:
docker compose -f docker-compose.prod.yml ps
Subscription recovery after a PostgreSQL restart
Live-query state lives in shared memory and is wiped by every PostgreSQL restart (crash, docker compose restart, host reboot, image update). The extension keeps a durable catalog — pgr.persisted_subscriptions, a LOGGED table maintained automatically by pgr.subscribe / pgr.unsubscribe — which is replayed into shared memory by pgr.restore_subscriptions().
This is automatic. Both the all-in-one image and the two-container pgstack-postgres image run the restore on every boot (their entrypoint waits for PostgreSQL to accept connections, then calls pgr.restore_subscriptions() for each database with the extension installed, before the stack starts serving). You do not need a manual restore step.
If you run PostgreSQL with a custom entrypoint that does not include this step, replay it yourself after every restart:
docker compose -f docker-compose.prod.yml exec db \
psql -U postgres -d app -c "SELECT pgr.restore_subscriptions();"
The function returns the number of subscriptions restored and has EXECUTE revoked from PUBLIC (superuser or an explicitly granted role only).
Resource limits
- Memory limits are appropriate for your workload (see
deploy.resources.limits) -
max_connectionsin PostgreSQL matches your expected connection count -
pg_reactive.max_subscriptionsis appropriate for your WebSocket client count
Performance
PostgreSQL configuration
For a server with 4 GB RAM:
command: >
postgres
-c shared_preload_libraries=pg_reactive
-c shared_buffers=1GB
-c effective_cache_size=3GB
-c work_mem=16MB
-c maintenance_work_mem=256MB
-c max_connections=200
-c pg_reactive.max_subscriptions=2048
-c log_min_messages=warning
-c log_statement=none
-c random_page_cost=1.1
-c checkpoint_completion_target=0.9
-c wal_buffers=64MB
Indexes
Ensure all frequently queried columns are indexed, especially columns used in RLS policies and live query filters:
-- Index columns used in RLS policies
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_todos_user_id ON todos(user_id);
-- Index columns used in live query WHERE clauses
CREATE INDEX idx_orders_status ON orders(status) WHERE status != 'completed';
Connection pooling
For high traffic, add PgBouncer in front of PostgreSQL:
# Add to docker-compose.prod.yml
pgbouncer:
image: bitnami/pgbouncer:latest
environment:
PGBOUNCER_DATABASE: app
POSTGRESQL_HOST: db
POSTGRESQL_USERNAME: postgres
POSTGRESQL_PASSWORD: ${POSTGRES_PASSWORD}
POSTGRESQL_DATABASE: app
PGBOUNCER_POOL_MODE: transaction
PGBOUNCER_MAX_CLIENT_CONN: 1000
PGBOUNCER_DEFAULT_POOL_SIZE: 20
ports:
- "127.0.0.1:6432:6432"
Then point DATABASE_URL at PgBouncer instead of PostgreSQL directly.
Tune the proxy's built-in connection pool before reaching for PgBouncer:
-
PG_POOL_MAX_CONNS(default 50) — raise for high-concurrency workloads -
PG_POOL_MIN_CONNS(default 5) — keeps connections warm to avoid cold-start latency
For advanced pooling (transaction mode), add PgBouncer between the proxy and PostgreSQL as shown above.
Proxy configuration
- Compression — Gzip compression is enabled by default (
GZIP_LEVEL=5). Set to0to disable if your reverse proxy handles compression. - Timeouts — Adjust
READ_TIMEOUT(default 30s) andWRITE_TIMEOUT(default 60s) based on expected query complexity. Long-running queries may need higher write timeouts.
Monitoring
Log aggregation
Forward Docker logs to a log aggregation service:
# In docker-compose.prod.yml, add to each service:
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "3"
Or use the fluentd or loki Docker log drivers to ship logs to your logging stack.
Metrics
pgStack has no built-in /metrics endpoint. /health (status + pg_connected) is the only built-in observability endpoint; scrape it or use Docker/PostgreSQL exporters for metrics.
Alerting
Set up alerts for:
- Container health check failures
- PostgreSQL disk usage > 80%
- Slow queries (> 1 second average)
- Error rate spikes in proxy logs
Maintenance
PostgreSQL VACUUM
PostgreSQL autovacuum handles routine maintenance automatically. For tables with high churn (frequent inserts/updates/deletes), monitor bloat:
SELECT
relname AS table,
n_dead_tup AS dead_rows,
n_live_tup AS live_rows,
round(n_dead_tup::numeric / NULLIF(n_live_tup, 0) * 100, 1) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY dead_pct DESC;
If bloat is significant, run VACUUM manually during low-traffic periods:
VACUUM ANALYZE orders;
pg_reactive snapshot tables
Live query snapshot tables (pgr._snap_*) are UNLOGGED — ordinary on-disk tables that skip WAL. They are created on subscribe and dropped on unsubscribe (or LRU eviction / DDL auto-unsubscribe) and require no maintenance.
Upgrade procedure
Apply schema migrations to the database before the new proxy starts — a new
proxy may depend on schema or roles a migration creates, and an existing data
volume never re-runs docker/init.sql. Bring up the database alone first, then
migrate, then swap the rest:
- Review the changelog for breaking changes
- Test the upgrade in a staging environment
- Take a database backup
- Pull new images:
docker compose -f docker-compose.prod.yml pull - Start only the database on the new image:
docker compose -f docker-compose.prod.yml up -d db - Run new migrations against it:
pgstack migration run - Apply any one-time role/credential steps the changelog calls out (see below)
- Bring up the rest:
docker compose -f docker-compose.prod.yml up -d - Verify the health endpoint and spot-check API responses
One-time: upgrading to the non-superuser proxy connection
Releases that move the proxy off the PostgreSQL superuser onto the authenticator
role (gemini audit P2) need an extra step on an existing volume — init.sql
and docker/20-authenticator-login.sh create and credential that role only at
first init, so they never fire on a populated volume:
# After step 6 (migration 20260613120000 created the authenticator role NOLOGIN
# plus the service_role grants), give it the login password the proxy will use.
# It MUST equal AUTHENTICATOR_PASSWORD in your .env.
docker compose -f docker-compose.prod.yml exec -T db \
psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" \
-v apw="$AUTHENTICATOR_PASSWORD" -c "ALTER ROLE authenticator LOGIN PASSWORD :'apw'"
Only then bring up the proxy (step 8) — its DATABASE_URL now points at
authenticator, which finally exists and can log in. Skipping this leaves the
proxy unable to connect.
Two-login-role split (admin plane)
To also split the admin plane onto its own login role (so a data-plane
compromise cannot reach service_role — gemini audit finding 6), after the
above and after migration 20260613140000 created authenticator_admin:
# 1. Give the admin role a login password (= AUTHENTICATOR_ADMIN_PASSWORD in .env)
docker compose -f docker-compose.prod.yml exec -T db \
psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" \
-v apw="$AUTHENTICATOR_ADMIN_PASSWORD" -c "ALTER ROLE authenticator_admin LOGIN PASSWORD :'apw'"
# 2. Set ADMIN_DATABASE_URL=postgres://authenticator_admin:<pw>@db:5432/<db> in .env,
# then recreate the proxy so its admin pool uses it.
# 3. ONLY THEN drop the data role's escalation path:
docker compose -f docker-compose.prod.yml exec -T db \
psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "REVOKE service_role FROM authenticator"
Doing step 3 before step 2 breaks the admin pool (it would still be assuming
service_role through authenticator).