1 Answer
Accepted Answer
Hello. Several factors can affect this. Some to check:
a) What is the instance configuration? Is it sufficient, or should you reconsider it?
b) Is auto scaling turned on?
c) What does the Spark UI show? Which task takes the most time? Is the time spent running the task itself, or waiting for resources?
d) If you read over JDBC, how many parallel connections are being used?
e) Are you using dynamic partitions?
This is a high-level checklist of questions to answer first.
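For the JDBC question in the checklist, here is a minimal sketch of how a parallel read is usually configured in Spark. Without `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions`, Spark reads the table through a single connection. The helper names, the URL, the table and the column below are hypothetical, and credentials and driver options are left out.

```python
def jdbc_read_options(url, table, partition_column,
                      lower_bound, upper_bound, num_partitions):
    """Build the options for a partitioned Spark JDBC read.

    partition_column should be a numeric, date or timestamp column.
    The bounds only decide how the range is split across connections;
    they do not filter rows.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": partition_column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        # Upper limit on concurrent JDBC connections for this read.
        "numPartitions": str(num_partitions),
    }


def read_table(spark, options):
    """Run the read. `spark` is assumed to be an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical values, for illustration only.
opts = jdbc_read_options(
    url="jdbc:postgresql://example-host:5432/exampledb",
    table="public.orders",
    partition_column="order_id",
    lower_bound=1,
    upper_bound=1_000_000,
    num_partitions=8,
)
```

Pick `numPartitions` with the source database in mind: more connections speed up the read only as long as the database can serve them.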
Most important is the code: are you using repartition or coalesce? Are you calling collect anywhere? The code is usually the main cause of performance issues. Please feel free to reach out to me if you need any additional information.
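To illustrate the repartition, coalesce and collect points, here is a rough sketch, assuming a PySpark DataFrame `df`. The helper names are hypothetical. `coalesce` merges existing partitions without a full shuffle, `repartition` always triggers a full shuffle, and `collect` pulls every row to the driver, so it is best limited to small results.

```python
def choose_resize(current_partitions, target_partitions):
    """Pick the cheaper operation for changing the partition count."""
    if target_partitions < current_partitions:
        return "coalesce"      # merges partitions, no full shuffle
    if target_partitions > current_partitions:
        return "repartition"   # full shuffle, needed to add partitions
    return "none"


def resize(df, target_partitions):
    """Apply the choice above to a PySpark DataFrame."""
    action = choose_resize(df.rdd.getNumPartitions(), target_partitions)
    if action == "coalesce":
        return df.coalesce(target_partitions)
    if action == "repartition":
        return df.repartition(target_partitions)
    return df


def preview(df, n=20):
    """Bring at most n rows to the driver instead of collecting everything."""
    return df.limit(n).collect()
```

Note that a heavy `coalesce` can also lower the parallelism of the stages feeding it, so when the Spark UI shows a few long tasks after one, `repartition` may be the better trade.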
answered 2 years ago