Data Quality using PyDeequ

Hi, does anyone use PyDeequ in a large enterprise? I am exploring this library and have the following questions:

  1. Looking at the GitHub repo, it doesn't seem to be actively updated. Also, it supports Spark 3.0.0 but not later versions.
  2. Some of the APIs didn't work (for complex examples). I don't know if there is any Amazon support.
  3. The Scala version (Deequ) is more up to date than the Python version (PyDeequ). Is there a plan to sunset PyDeequ?
  4. Should I use this for a large enterprise data validation framework, or are there alternative tools? Kindly advise.

Thank you!

asked 2 years ago · 669 views
1 answer

Hi

To answer question 4: I would recommend you take a look at AWS Glue DataBrew. Not only is it a fully managed service, but it also receives new features and updates at a faster pace, as it's supported by the AWS Glue team.

Thanks

Nick

Nick (AWS)
answered 2 years ago
