Just to add to Gonzolo's response: Glue itself doesn’t provide direct functionality for retrieving the timestamp of when a file was saved to an S3 bucket. To retrieve that information, you can use Boto3 S3 client methods, specifically:
- head_object() method [1]
- list_objects_v2() method [2]
Please see the external resources referenced below for example code showing how this can be achieved.
Alternatively, you can save your Athena query and execute the saved query through a Glue job. This AWS Blog post [3] includes example code for that approach.
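As a rough illustration of the two methods above, here is a minimal sketch. The function names, bucket, and prefix are hypothetical; the only parts taken from Boto3 are the `head_object()` and `list_objects_v2()` calls and the `LastModified` field in their responses. The client is passed in as an argument, so this assumes you have already created one.

```python
from datetime import datetime
from typing import Dict


def object_last_modified(s3_client, bucket: str, key: str) -> datetime:
    """Return the LastModified timestamp of one object using head_object()."""
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return response["LastModified"]


def list_last_modified(s3_client, bucket: str, prefix: str = "") -> Dict[str, datetime]:
    """Map each key under a prefix to its LastModified using list_objects_v2()."""
    timestamps: Dict[str, datetime] = {}
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent when the prefix matches no objects.
        for obj in page.get("Contents", []):
            timestamps[obj["Key"]] = obj["LastModified"]
        # Each call returns one page; follow the continuation token if truncated.
        if not page.get("IsTruncated"):
            return timestamps
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Inside a Glue job you would typically build the client with `boto3.client("s3")` and call, for example, `list_last_modified(client, "my-bucket", "predictions/")` (placeholder names). For a single known key, `head_object()` avoids listing the whole prefix.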
References:
It would be better if the predictions had their own timestamp, rather than relying on the file modification date, which can change for unrelated reasons (for example, if the file is rewritten or copied).
Otherwise, I don't think there is a way to do this in Glue/Spark itself. You could invoke Athena from Glue, wait for the query to finish, and then have Glue read the Athena query results and continue from there. This is a bit wasteful, since the cluster sits idle while Athena is running.
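To make the "invoke Athena and wait" idea concrete, here is a minimal sketch assuming a Boto3 Athena client is passed in. The function name, its parameters, and the polling interval are my own illustration, not something from the posts above; the Boto3 calls used are `start_query_execution()` and `get_query_execution()`.

```python
import time


def run_athena_query(athena_client, query: str, database: str,
                     output_location: str, poll_seconds: float = 2.0) -> str:
    """Start an Athena query, block until it ends, and return its result location."""
    execution_id = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    while True:
        execution = athena_client.get_query_execution(
            QueryExecutionId=execution_id
        )["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            # Location reported by Athena for this execution's results.
            return execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Athena query {execution_id} ended as {state}: {reason}")
        # Still QUEUED or RUNNING: the Glue job is idle while it waits here.
        time.sleep(poll_seconds)
```

The returned S3 location can then be read back in the Glue job, for instance with `spark.read.csv(path, header=True)`, assuming the query was a SELECT whose results Athena wrote as CSV. The polling loop is where the idle cluster time mentioned above is spent.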