Data Quality using PyDeequ


Hi, does anyone use PyDeequ in a large enterprise? I am exploring this library and have the questions below:

  1. Looking at the GitHub repo, it doesn't seem to be actively updated. Also, it supports Spark 3.0.0 but not later versions.
  2. Some of the APIs didn't work (for complex examples). I don't know if there is any Amazon support.
  3. Also, the Scala version (Deequ) is more up to date than the Python version (PyDeequ). Is there a plan to sunset PyDeequ?
  4. Should I use this for a large enterprise data validation framework, or are there alternative tools? Kindly advise.

Thank you!

asked a year ago, 525 views
1 Answer


To answer question 4: I would recommend you take a look at AWS Glue DataBrew. Not only is it a fully managed service, but you'll also find that it has a better velocity of new features and updates, as it's supported by the AWS Glue team.



answered a year ago
