PostgreSQL and Us

How We Use PostgreSQL, What Works, and What's Next



Christopher R. Bilger

March 11th, 2025

Agenda



  • How We Use PostgreSQL
  • Pros and Cons of Our Current Approach
  • Best Practices
  • Recommended Next Steps
  • Conclusion + Q&A

How We Use PostgreSQL



Note: This presentation focuses on our main product database; however, 90%+ of it applies to all of our PostgreSQL databases.

Overview of Our Database Infrastructure



  • Managed PostgreSQL service via AWS RDS (Relational Database Service)
  • Single primary instance with a failover read replica
    • Subject to change once we have an RDS Proxy in place
  • Database migrations handled via the talkiatry-app Git repository
    • Applied using the knex NPM package

Key Applications and Workloads



  • Source of truth for all pre-patient (a.k.a. intake) data
  • Supports all CRUD operations from the intake assessment and Synapse
  • Interfaces directly with all running cronjobs (Argo Workflows)
  • Used for reporting and analytics via regular syncs to Snowflake
  • Also regularly syncs in new data from external sources
    • Snowflake
    • eCW
    • Sohar
    • etc.

Scale and Performance Considerations

Using PostgreSQL's native functions, we can easily find our database's size on disk. This includes indices and other relations, which take up space beyond that of the data itself.


SELECT pg_size_pretty(pg_database_size('<dbname>'));
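To see which tables contribute most to that total, a companion query can rank them by total on-disk size, indices included (a sketch using the pg_statio_user_tables statistics view; adjust the LIMIT as needed):


-- Rank the ten largest tables by total on-disk size, indices included.
SELECT
  relname,
  pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM
  pg_catalog.pg_statio_user_tables
ORDER BY
  pg_total_relation_size(relid) DESC
LIMIT 10;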

Scale and Performance Considerations

As of today, our size on disk is ~62 GB, including all data, indices, and other relations. By the rough categories below, that makes ours a small database.


Size Category   Range
Very Small      <1 GB
Small           1 GB - 100 GB
Medium          100 GB - 1 TB
Large           1 TB - 10 TB
Very Large      >10 TB

Pros of Our Current Approach



  • Managed service with minimal maintenance
  • Easy to scale up and out
  • Highly reliable and secure
    • Regular patches and updates made available by AWS
    • Automated backups and point-in-time recovery
  • Compatible with a wide range of tools and libraries
  • Excellent support for complex queries and data types
    • JSONB, arrays, and custom types (see the sketch below)
    • We recently added PostGIS support for geospatial data
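As a small illustration of those data types, a query can filter on JSONB and array columns directly in SQL (a hypothetical sketch; the intake_responses table, metadata JSONB column, and symptoms TEXT[] column are stand-ins, not our actual schema):


-- Hypothetical: filter on a JSONB field and an array column in one query.
SELECT
  id,
  metadata ->> 'referral_source' AS referral_source
FROM
  intake_responses
WHERE
  metadata @> '{"completed": true}'
  AND 'anxiety' = ANY (symptoms);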

Cons & Challenges



  • Performance bottlenecks
  • Operational pain points
  • Lessons learned from past experiences

Performance Bottlenecks



  • Slow queries
    • Missing or poorly optimized indices
    • Suboptimal query patterns
  • Long-running transactions (see the query after this list)
    • Lock contention and deadlocks
    • Resource constraints and timeouts
  • Materialized view refreshes and synchronization issues
    • Performance overhead and maintenance costs
    • Consistency and staleness concerns
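A quick way to spot long-running transactions and lock waits is to query pg_stat_activity directly (a sketch; which columns and thresholds matter will vary):


-- List active sessions with their transaction age and any lock waits,
-- oldest transactions first.
SELECT
  pid,
  state,
  wait_event_type,
  wait_event,
  now() - xact_start AS xact_age,
  query
FROM
  pg_stat_activity
WHERE
  state <> 'idle'
ORDER BY
  xact_start ASC;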

Operational Pain Points



  • Database migrations and schema changes
    • Rolling updates and backward compatibility
    • Automated testing and validation
  • Monitoring and alerting practices
    • Performance metrics and resource utilization
    • Query profiling and optimization

Lessons Learned from Past Experiences



  • Reduce long-running transactions
    • These generally occur in cronjobs; minimize both the duration of and the work done within each transaction (see the sketch after this list)
  • Optimize queries and indices
    • Use EXPLAIN ANALYZE to identify bottlenecks
    • Regularly review and update our database schema
    • Add indices on high-usage queries and/or queries that are part of a critical path
      • e.g. WHERE, JOIN, ORDER BY, GROUP BY, etc.
  • Synchronizing to external data stores (e.g. OpenSearch) causes performance issues
    • Consider using a change data capture (CDC) tool
    • Or use a more efficient synchronization mechanism, if the sync is needed at all
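As an illustration of keeping transactions short, a cronjob-style backfill can commit in small batches instead of holding one transaction for the entire run (a hypothetical sketch; the users table and status column are stand-ins):


-- Hypothetical batched backfill: update 1,000 rows per statement and
-- repeat from the application until zero rows are affected, so each
-- transaction stays short and locks are released quickly.
UPDATE users
SET status = 'active'
WHERE id IN (
  SELECT id
  FROM users
  WHERE status IS NULL
  ORDER BY id
  LIMIT 1000
);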

Best Practices We Follow



  • Database schema design and normalization
  • Connection pooling and resource management
  • Fairly standardized schema and query patterns

Example: Non-Optimized Query


SELECT
  *
FROM
  users
WHERE
  created_at >= '2025-01-01'
  AND created_at < '2025-02-01'
  AND status = 'active';

Example: Well-Optimized Query


-- Equality column (status) first, range column (created_at) second,
-- so the index can seek straight to active users in the date range.
CREATE INDEX idx_users_status_created_at
ON users (status, created_at);

SELECT
  *
FROM
  users
WHERE
  created_at >= '2025-01-01'
  AND created_at < '2025-02-01'
  AND status = 'active';

Best Practices We Should Adopt



  • Use EXPLAIN ANALYZE to identify bottlenecks
  • Use tools such as pganalyze
    • Identify slow queries and missing indices
    • Optimize query patterns and database schema
  • Perform as much query filtering, joining, and ordering as possible in the database (see the sketch below)
    • Minimize data transfer between the database and application
    • Minimize computational load on the application servers
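For example, an aggregation can run in the database so that only summary rows cross the wire, rather than every row being shipped to the application and counted there (a sketch reusing the users table from the earlier examples):


-- Aggregate in the database: one row per month instead of every user row.
SELECT
  date_trunc('month', created_at) AS month,
  count(*) AS active_users
FROM
  users
WHERE
  status = 'active'
GROUP BY
  month
ORDER BY
  month;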

What is EXPLAIN ANALYZE?



EXPLAIN is a PostgreSQL command that shows the planner's execution plan for a query; adding the ANALYZE option actually executes the query and records the real run time of each node in the plan.

In short, EXPLAIN shows the estimated plan, while EXPLAIN ANALYZE runs the query and shows actual execution times alongside the estimates.

Example: EXPLAIN ANALYZE


CREATE TABLE employees (
  id SERIAL PRIMARY KEY,
  name VARCHAR(100),
  department VARCHAR(50),
  salary INT
);

SELECT * FROM employees WHERE department = 'Engineering';

Example: EXPLAIN ANALYZE


EXPLAIN ANALYZE SELECT * FROM employees WHERE department = 'Engineering';

Seq Scan on employees  (cost=0.00..15.35 rows=5 width=100) (actual time=0.015..0.018 rows=3 loops=1)
  Filter: ((department)::text = 'Engineering'::text)
  Rows Removed by Filter: 97
Planning Time: 0.125 ms
Execution Time: 0.057 ms

Example: EXPLAIN ANALYZE


Using the information returned by EXPLAIN ANALYZE, we can see that the query performs a sequential scan on the employees table. This is not ideal for performance, as it reads the entire table.


CREATE INDEX idx_department ON employees(department);
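Re-running the same EXPLAIN ANALYZE after creating the index should produce a plan along these lines on a sufficiently large table (illustrative output, not captured from a real run; on a 100-row table the planner may still choose a sequential scan):


EXPLAIN ANALYZE SELECT * FROM employees WHERE department = 'Engineering';

Index Scan using idx_department on employees  (cost=0.14..8.16 rows=5 width=100) (actual time=0.010..0.012 rows=3 loops=1)
  Index Cond: ((department)::text = 'Engineering'::text)
Planning Time: 0.110 ms
Execution Time: 0.031 ms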

Recommended Next Steps



  • Performance metric monitoring and tooling
  • Database indices / optimizations for easy wins
  • Training, monitoring, or tooling recommendations for enhancing our practice
  • Making use of an AWS RDS Proxy

PostgreSQL Extensions to Consider



  • pg_stat_statements for query statistics (see the sample query after this list)
    • Already in use
  • pg_cron for scheduled jobs
    • We currently use Argo Workflows for scheduling
    • pg_cron could handle simpler tasks, but adding a second scheduler would likely negate the simplicity it offers
  • pg_repack for table maintenance
    • Reduces bloat and improves performance
    • May be useful for large tables
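Since pg_stat_statements is already enabled, a query like the following can surface the most expensive statements (a sketch assuming the PostgreSQL 13+ column names; older versions use total_time and mean_time instead):


-- Top ten statements by average execution time.
SELECT
  calls,
  mean_exec_time,
  total_exec_time,
  query
FROM
  pg_stat_statements
ORDER BY
  mean_exec_time DESC
LIMIT 10;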

Conclusion + Q&A



  • How We Use PostgreSQL
  • Pros and Cons of Our Current Approach
  • Best Practices
  • Recommended Next Steps
  • Q&A

Powered By



  1. A single, static webpage.
  2. reveal.js
  3. highlight.js