- Newest
- Most votes
- Most comments
PostgreSQL in local zone provides performance at edge to latency sensitive workloads. This in meant to be used only in situations RDS managed services is not available like a Local Zone.
Identifying the instance type for PostgreSQL
Database, from AWS observations, have specific pattern of resource consumption. We find picking right instance type significantly reduce existence of bottlenecks in workload. AWS has 3 popular lines of EC2 instances; c instance with vCore:GB/of RAM ratio 1:2, m instance with ratio 1:4 and r instance with ratio 1:8. r5d.2xlarge is preferred EC2 instance type for typical customer production deployment. r instance are useful for database workloads because large disk caches saves disk activities because storage pages can be cached on RAM. While c and m instance could be relevant for specific workloads, for database workloads r instance is recommended default.
Operating System Networking
In a high velocity transactions system the networking activity starts to also influence performance. The TCP/IP stack behaviour can influence the database performance as a choke hold. These settings in sysctl makes TCP/IP stack more suitable to high performance database;
// /etc/sysctl.conf
net.ipv4.tcp_sack=1
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_wmem = 4096 65536 4194304
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_tw_recycle = 1
net.core.wmem_max = 16777216
net.core.rmem_max = 16777216
net.core.wmem_default = 16777216
net.core.rmem_default = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
The configuration is pushed into the kernel using sudo sysctl -p
We can also reduce network overheads by removing iptables because Operating System does not need to support firewall given Security Groups provide the requisite service.
# /etc/modprobe.d/blacklist.conf
2 blacklist ip_tables
followed by sudo reboot
Setting PostgreSQL as the highest priority system
The Linux processes have run-time priority with ranges from lowest to highest 19 to -20. In a single concern system, defining priority of main process is a good idea. In this case we are planning to run PostgreSQL as the main concern. This can be done by adding Nice parameter in systemd file of PostgreSQL
# /etc/systemd/system/multi-user.target.wants/postgresql.service
# systemd service for managing all PostgreSQL clusters on the system. This
# service is actually a systemd target, but we are using a service since
# targets cannot be reloaded.
[Unit]
Description=PostgreSQL RDBMS
[Service]
Type=oneshot
Nice=-20
ExecStart=/bin/true
ExecReload=/bin/true
RemainAfterExit=on
Using separate volume for storing pg stats
PostgreSQL stores in statistics in a specific high activity directory. These statistics are telemetry for the PostgreSQL. Storing this activity in separate volume saves valuable IOPS on core workload of database tables. This activity is 2 step process; Firstly add a volume as candidate directory for storing the data. In this case we use RAM storage.
# Add this line to /etc/fstab
tmpfs /var/lib/pg_stats tmpfs size=1G,nosuid,nodev,noatime 0 0
Followed by adding /var/lib/pg stats/14-main.pg stat tmp as parameter for PostgreSQL variable stats temp directory. We will see entire set of changes in PostgreSQL.
PGTune for your Local Zone EC2 Instance
PGTune is a neat little tool for generating configuration changes for operating hyperscale PostgreSQL. We use the parameters;
Parameter | Value |
---|---|
DB Version | 14 |
OS Type | Linux |
DB Type | Online Transaction Processing System |
Total Memory | 30 GB |
CPU | 4 |
Number of Connections | 100 |
Storage | SSD |
PGTune issues this configuration
Add these lines to /etc/postgresql/14/main/postgresql.conf 2
# DB Version: 14
# OS Type: linux
# DB Type: oltp
# Total Memory (RAM): 30 GB
# CPUs num: 4
# Connections num: 100
# Data Storage: ssd
max_connections = 100
shared_buffers = 7680MB
effective_cache_size = 23040MB
maintenance_work_mem = 1920MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 39321kB
min_wal_size = 2GB
max_wal_size = 8GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
max_parallel_maintenance_workers = 2
# For pg_stats we also add
stats_temp_directory = ’/var/lib/pg_stats/14-main.pg_stat_tmp’
Conclusion
These changes produce dramatically enhanced OLTP read-write performance for EC2 in Local Zone PostgreSQL setup. These characteristics place Local Zone EC2 self-managed PostgreSQL as a compelling performance alternative to RDS if customers requirements are such. It still does not out of the box provide resilience characteristics and customers is expected to manage the risks involved with self-managed installations.
Relevant content
- asked 2 years ago
- asked a year ago
- asked 2 months ago
- AWS OFFICIALUpdated a year ago
- AWS OFFICIALUpdated a year ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 6 months ago