- Newest
- Most votes
- Most comments
Han,
I'll try to be brief in my response, but feel free to ask me if you have any questions.
-
Relationship between
temp_bytesPeaks andFreeLocalStorageDrops:- The y-value of
temp_bytespeaks represents the temporary storage used by the queries. The amount of drop inFreeLocalStoragecorrelates with the space consumed by these temporary files. The moretemp_bytesused, the larger the drop inFreeLocalStorage.
- The y-value of
-
Why
temp_bytesPeaks at Query End:temp_bytespeaks at the end of the query because the temporary files are created and possibly expanded throughout the query execution but are only fully accounted for in the metrics when the query completes.
-
Doubling of
temp_bytesandFreeLocalStorageDrops:- Yes, it is likely that the doubling of the
temp_bytespeak and the corresponding drop inFreeLocalStorageis due to the second entity having double the number of rows. More rows mean more temporary storage usage.
- Yes, it is likely that the doubling of the
-
Value of Knowing
temp_bytesandFreeLocalStorage:- Monitoring these metrics helps in understanding the temporary storage requirements of your queries. High
temp_bytesspikes or significant drops inFreeLocalStoragecan indicate that your queries are using more temporary storage than expected. Iftemp_bytesspikes are too high orFreeLocalStoragedrops are too low, it might indicate the need for optimizing queries or increasing instance size.
- Monitoring these metrics helps in understanding the temporary storage requirements of your queries. High
-
Is
FreeLocalStoragethe PostgreSQL Temp Directory on Disk?- Yes,
FreeLocalStoragerepresents the available space in the temporary storage directory on disk. When you encounter "No space left on device" errors, increasing the instance type to get higherFreeLocalStorageis a viable solution, alongside adjustingwork_mem.
- Yes,
-
Predicting
temp_bytesorFreeLocalStoragefrom a Query Plan:- Predicting the exact amount of
temp_bytesorFreeLocalStoragerequired from a query plan can be complex. However, you can estimate based on the number of rows processed and the operations involved (e.g., sorting, hashing). If the correlation between the number of rows andtemp_bytesholds, you can use metrics from similar queries to make predictions.
- Predicting the exact amount of
Bests.
There are also some answers (incomplete though) on this in the thread below https://aws.amazon.com/blogs/database/tune-sorting-operations-in-postgresql-with-work_mem/
My issue is with tuning guide at https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.ManagingTempFiles.html recommending to turn on temp_bytes, but no explanation of what it means or how it helps choose instance size.
Quoting the same point i made in that thread "I'm trying to relate what we see from the query plan to temp_bytes. Is that possible? If no, then it's not a useful indicator, and the linked article in my 1st question recommending to turn temp_bytes on would not help query writing or instance sizing"
I will ignore temp_bytes for now and just observe FreeLocalStorage which relate strongly to number of rows scanned in query plan and the sum(size) column from
select CURRENT_TIMESTAMP, count(*), sum(size) from pg_ls_tmpdir();
The query above will show temp files being created in groups of (parallel workers+ 1 leader) files, growing to a size of 1GB before the next group of temp files are created. Meaning if max_parallel_workers_per_gather = 2, 3 files are created at once, filled to 1GB each at roughly uniform speed, then another set of 3 temp files appear, again filled to 1GB each, and so on
Relevant content
- asked 2 years ago
