current_time minus 1hr in Glue Pyspark


I need to fetch files from my S3 bucket that arrived by current_time - 1hr for processing. My file names are in the format yyyymmdd-hhmmsssss.parquet (milliseconds included). I am running a Glue job to fetch the files whose names carry a timestamp <= current_timestamp - 1hr. This is the code I have used to get the current time in the required format:

desired_timezone = pytz.timezone('America/New_York')  # Replace with your actual time zone
current_datetime_2 = datetime.now(desired_timezone).strftime("%Y%m%d-%H%M%S")

I do not know how to get the time for current_time - 1hr using the above commands in Glue job PySpark code. Can someone please help me achieve this?

1 Answer
Accepted Answer

Just subtract an hour from the current time with timedelta(hours=1) and format it like your file names using strftime("%Y%m%d-%H%M%S").

You will have something like:

from datetime import datetime, timedelta
import pytz

desired_timezone = pytz.timezone('America/New_York')  # Replace with your actual time zone
current_datetime = datetime.now(desired_timezone)
one_hour_ago_datetime = current_datetime - timedelta(hours=1)

formatted_current_datetime = current_datetime.strftime("%Y%m%d-%H%M%S")
formatted_one_hour_ago_datetime = one_hour_ago_datetime.strftime("%Y%m%d-%H%M%S")

print("Current time:", formatted_current_datetime)
print("One hour ago:", formatted_one_hour_ago_datetime)
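
To connect the formatted cutoff back to the file names, here is a minimal sketch of one way to do the comparison. It assumes every name follows the fixed-width, zero-padded yyyymmdd-hhmmsssss.parquet pattern from the question, so the first 15 characters (yyyymmdd-hhmmss) can be compared as plain strings. The helper names cutoff_string and files_at_or_before and the sample names are hypothetical, and the example uses a fixed naive datetime only so the result is reproducible; in the Glue job you would pass datetime.now(desired_timezone) instead.

```python
from datetime import datetime, timedelta


def cutoff_string(now, hours=1):
    """Format `now - hours` as yyyymmdd-hhmmss (hypothetical helper)."""
    return (now - timedelta(hours=hours)).strftime("%Y%m%d-%H%M%S")


def files_at_or_before(file_names, cutoff):
    """Keep names whose yyyymmdd-hhmmss prefix is <= cutoff (hypothetical helper).

    Assumes fixed-width, zero-padded names, so string order equals time order.
    Any leading key prefix such as "incoming/" is dropped before comparing.
    """
    return [
        name for name in file_names
        if name.rsplit("/", 1)[-1][:15] <= cutoff
    ]


# Fixed timestamp for a reproducible example; use datetime.now(desired_timezone) in the job.
example_now = datetime(2024, 3, 15, 14, 30, 0)
cutoff = cutoff_string(example_now)

# Made-up sample names in the question's format.
names = [
    "20240314-235959999.parquet",
    "incoming/20240315-132959123.parquet",
    "20240315-133000000.parquet",
    "20240315-133001000.parquet",
]
selected = files_at_or_before(names, cutoff)

print("Cutoff:", cutoff)
print("Selected:", selected)
```

Because the cutoff is formatted only to the second, the milliseconds in the file name are ignored: a file stamped within the same second as the cutoff is treated as on time.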

EXPERT
answered 2 months ago
AWS
SUPPORT ENGINEER
reviewed a month ago
  • Thanks a lot. The way you added the timedelta made the difference. Your solution worked for me :)
