1 Answer
Yes, for that kind of query you need to concatenate the partition columns.
However, catalogPartitionPredicate is a server-side filter with limited capabilities.
Instead, you can use push_down_predicate. It accepts Spark SQL syntax, so there are several ways to write the filter; the simplest is probably:
year || month >= '202112'
Because this is a string comparison, it assumes month is zero-padded (for example '07' rather than '7').
You can also keep catalogPartitionPredicate: year>='2021' so that fewer partitions are listed on the server side.
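As a rough sketch of how the two predicates could be combined in a Glue PySpark job: the helper names `build_partition_predicates` and `read_filtered` are made up for illustration, and the code assumes `year` and `month` are string partition keys with a zero-padded month. `read_filtered` is only meant to be called inside a Glue job, where a GlueContext is available.

```python
from datetime import date


def build_partition_predicates(cutoff: date) -> dict:
    """Build both predicate strings for partitions from the cutoff month onward.

    Assumption: `year` and `month` are string partition keys and `month`
    is zero-padded ('01'..'12'); otherwise the string comparison misorders months.
    """
    yyyymm = f"{cutoff.year:04d}{cutoff.month:02d}"
    return {
        # Precise filter, evaluated with Spark SQL syntax.
        "push_down_predicate": f"year || month >= '{yyyymm}'",
        # Coarse server-side filter that only trims the partition listing.
        "catalogPartitionPredicate": f"year>='{cutoff.year:04d}'",
    }


def read_filtered(glue_context, database: str, table_name: str, cutoff: date):
    """Read only matching partitions (hypothetical helper; needs a GlueContext)."""
    preds = build_partition_predicates(cutoff)
    return glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table_name,
        push_down_predicate=preds["push_down_predicate"],
        additional_options={
            "catalogPartitionPredicate": preds["catalogPartitionPredicate"]
        },
    )


preds = build_partition_predicates(date(2021, 12, 1))
print(preds["push_down_predicate"])
print(preds["catalogPartitionPredicate"])
```

Building the strings in one place keeps the coarse and the precise filter consistent with each other when the cutoff date changes.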
I am currently using this approach, but it doesn't let me take full advantage of catalogPartitionPredicate. For example, if the current month is 7 and we apply catalogPartitionPredicate: year>='2021', it will still bring in the previous 6 months of data.
You do need the "push_down_predicate" property to do the filtering and avoid reading the data. If you also add catalogPartitionPredicate: year>='2021', you reduce the list of partitions retrieved from the catalog (partitions, not data), but that is optional. The important part is push_down_predicate.
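The two-stage behaviour described here can be imitated in plain Python. This is only a toy model with a made-up partition list, not how Glue evaluates the predicates internally, and it assumes zero-padded string months.

```python
# Hypothetical partition list; in Glue these would come from the Data Catalog.
partitions = [
    {"year": "2020", "month": "12"},
    {"year": "2021", "month": "06"},
    {"year": "2021", "month": "12"},
    {"year": "2022", "month": "01"},
]

# Stage 1: coarse filter, mimicking catalogPartitionPredicate: year>='2021'.
# It shortens the partition listing but still keeps every 2021 partition.
listed = [p for p in partitions if p["year"] >= "2021"]

# Stage 2: precise filter, mimicking push_down_predicate: year || month >= '202112'.
# Only these partitions would actually be read.
read = [p for p in listed if p["year"] + p["month"] >= "202112"]

print(len(listed))
print([p["year"] + p["month"] for p in read])
```

In this model the coarse filter still lists 2021-06, which is the limitation raised in the comment above, while the precise filter is what keeps that partition from being read.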