- Newest
- Most votes
- Most comments
Han,
I'll try to be brief in my response, but feel free to ask me if you have any questions.
-
Relationship between
temp_bytes
Peaks andFreeLocalStorage
Drops:- The y-value of
temp_bytes
peaks represents the temporary storage used by the queries. The amount of drop inFreeLocalStorage
correlates with the space consumed by these temporary files. The moretemp_bytes
used, the larger the drop inFreeLocalStorage
.
- The y-value of
-
Why
temp_bytes
Peaks at Query End:temp_bytes
peaks at the end of the query because the temporary files are created and possibly expanded throughout the query execution but are only fully accounted for in the metrics when the query completes.
-
Doubling of
temp_bytes
andFreeLocalStorage
Drops:- Yes, it is likely that the doubling of the
temp_bytes
peak and the corresponding drop inFreeLocalStorage
is due to the second entity having double the number of rows. More rows mean more temporary storage usage.
- Yes, it is likely that the doubling of the
-
Value of Knowing
temp_bytes
andFreeLocalStorage
:- Monitoring these metrics helps in understanding the temporary storage requirements of your queries. High
temp_bytes
spikes or significant drops inFreeLocalStorage
can indicate that your queries are using more temporary storage than expected. Iftemp_bytes
spikes are too high orFreeLocalStorage
drops are too low, it might indicate the need for optimizing queries or increasing instance size.
- Monitoring these metrics helps in understanding the temporary storage requirements of your queries. High
-
Is
FreeLocalStorage
the PostgreSQL Temp Directory on Disk?- Yes,
FreeLocalStorage
represents the available space in the temporary storage directory on disk. When you encounter "No space left on device" errors, increasing the instance type to get higherFreeLocalStorage
is a viable solution, alongside adjustingwork_mem
.
- Yes,
-
Predicting
temp_bytes
orFreeLocalStorage
from a Query Plan:- Predicting the exact amount of
temp_bytes
orFreeLocalStorage
required from a query plan can be complex. However, you can estimate based on the number of rows processed and the operations involved (e.g., sorting, hashing). If the correlation between the number of rows andtemp_bytes
holds, you can use metrics from similar queries to make predictions.
- Predicting the exact amount of
Bests.
There are also some answers (incomplete though) on this in the thread below https://aws.amazon.com/blogs/database/tune-sorting-operations-in-postgresql-with-work_mem/
My issue is with tuning guide at https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.ManagingTempFiles.html recommending to turn on temp_bytes, but no explanation of what it means or how it helps choose instance size.
Quoting the same point i made in that thread "I'm trying to relate what we see from the query plan to temp_bytes. Is that possible? If no, then it's not a useful indicator, and the linked article in my 1st question recommending to turn temp_bytes on would not help query writing or instance sizing"
I will ignore temp_bytes for now and just observe FreeLocalStorage which relate strongly to number of rows scanned in query plan and the sum(size) column from
select CURRENT_TIMESTAMP, count(*), sum(size) from pg_ls_tmpdir();
The query above will show temp files being created in groups of (parallel workers+ 1 leader) files, growing to a size of 1GB before the next group of temp files are created. Meaning if max_parallel_workers_per_gather = 2, 3 files are created at once, filled to 1GB each at roughly uniform speed, then another set of 3 temp files appear, again filled to 1GB each, and so on
Relevant content
- asked 6 months ago
- Accepted Answerasked 4 months ago
- asked 6 months ago
- AWS OFFICIALUpdated 3 years ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated a year ago