Quality rules export/import

Is it possible to import/export AWS Glue Data Quality rules? If yes, how? Thanks.

  • Yes, it is possible to import and export AWS Glue data quality rules using AWS Glue APIs or the AWS Management Console.

    To export a data quality rule, you can use the get-data-quality-ruleset API to retrieve the ruleset definition, or you can use the AWS Management Console to download the rule definition as a JSON file.

    To import a data quality rule, you can use the create-data-quality-ruleset API to create a new ruleset based on the definition in a JSON file, or you can use the AWS Management Console to upload the JSON file and create a new rule based on the definition.

    Here are the general steps to import/export AWS Glue data quality rules using the AWS Management Console:

    Exporting a data quality rule:

    Open the AWS Glue console and go to the Data quality rules page. Select the rule you want to export and click the "Download JSON" button. Save the downloaded JSON file to your local computer.

    Importing a data quality rule:

    Open the AWS Glue console and go to the Data quality rules page. Click the "Import data quality rule" button. In the "Import data quality rule" dialog box, click the "Choose file" button and select the JSON file containing the rule definition. Click the "Import" button to create the new data quality rule.

    Note that there may be some limitations or requirements when importing and exporting data quality rules, such as the need for the Glue version to match or the need to have the necessary permissions to perform these operations.

  • There is no explicit option in Glue to import/export quality rules. I can edit a Glue Studio job and see the rules in the "Transform" tab, e.g.:

    ...

    e.g. Completeness "colA" between 0.4 and 0.8 */

    Rules = [ CustomSql "select id, count(*) from primary group by 1" > 1 ]

    I can copy those rules into a file, then create a job in another region and paste them in.

    The same thing happens in DataBrew: I can see the "DataQuality rulsetNames" defined but cannot export, import, or download them.

    The question is whether there is a way to download/import/export the rules, or whether it is necessary to write a Lambda function or similar to do that.

Willi5
asked a year ago, 386 views
1 Answer
Accepted Answer

Hi

Yes, it is possible using API calls.

Use

          get-data-quality-ruleset *to retrieve a ruleset you already have defined*

          create-data-quality-ruleset *to create a new ruleset*

Documentation:

          https://awscli.amazonaws.com/v2/documentation/api/latest/reference/glue/create-data-quality-ruleset.html

          https://awscli.amazonaws.com/v2/documentation/api/latest/reference/glue/get-data-quality-ruleset.html
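As a rough illustration of how the two calls could be combined into an export/import step, here is a minimal Python sketch using the boto3 equivalents (`get_data_quality_ruleset` and `create_data_quality_ruleset`). The helper names, the `ruleset.json` file name, and the choice to skip `TargetTable` by default are assumptions made for this example, not something from the AWS documentation; the table referenced by `TargetTable` may not exist in the destination region.

```python
import json

# Fields returned by get_data_quality_ruleset that
# create_data_quality_ruleset also accepts as input.
PORTABLE_FIELDS = ("Name", "Description", "Ruleset", "TargetTable")


def export_ruleset(glue, name):
    """Fetch one ruleset and keep only the fields needed to recreate it."""
    response = glue.get_data_quality_ruleset(Name=name)
    return {key: response[key] for key in PORTABLE_FIELDS if key in response}


def import_ruleset(glue, definition, include_target_table=False):
    """Create a ruleset from an exported definition (hypothetical helper)."""
    args = {"Name": definition["Name"], "Ruleset": definition["Ruleset"]}
    if definition.get("Description"):
        args["Description"] = definition["Description"]
    # Assumption: only pass TargetTable when the same table exists in the target.
    if include_target_table and definition.get("TargetTable"):
        args["TargetTable"] = definition["TargetTable"]
    return glue.create_data_quality_ruleset(**args)


def copy_ruleset(name, source_region, target_region, path="ruleset.json"):
    """Export from one region, save a local copy, import into another region."""
    import boto3  # imported lazily so the helpers above work without boto3

    source = boto3.client("glue", region_name=source_region)
    target = boto3.client("glue", region_name=target_region)
    definition = export_ruleset(source, name)
    with open(path, "w") as handle:
        json.dump(definition, handle, indent=2)
    return import_ruleset(target, definition)
```

Something like `copy_ruleset("my-rule-set", "us-east-1", "eu-west-1")` would then copy a single ruleset; the region names here are placeholders.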

For example, to list all the rulesets you already have set up, open CloudShell from the console and run:

          aws glue list-data-quality-rulesets

This will return the following information for each individual ruleset:

[cloudshell-user@ip-10-4-178-207 ~]$ aws glue list-data-quality-rulesets
{
    "Rulesets": [
        {
            "Name": "my-rule-set",
            "Description": "Check data quality in table XXXXXXX in DB yyyyyy",
            "CreatedOn": "2023-03-14T06:56:55.596000+00:00",
            "LastModifiedOn": "2023-03-14T06:56:55.596000+00:00",
            "TargetTable": {
                "TableName": "XXXXXXX",
                "DatabaseName": "yyyyyy"
            },
            "RecommendationRunId": "dqrun-e07163bcdf71351494305ec41feb18e969bd77dd",
            "RuleCount": 64
        }
    ]
}
[cloudshell-user@ip-10-4-178-207 ~]$
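If you would rather collect this listing from a script, the same call is available in boto3 as `list_data_quality_rulesets`. The sketch below follows `NextToken` by hand so that accounts with many rulesets are fully listed; the function names `list_rulesets` and `summarize` are made up for this example.

```python
def list_rulesets(glue):
    """Return every ruleset summary, following NextToken across pages."""
    rulesets, token = [], None
    while True:
        # Only send NextToken once the service has handed one back.
        kwargs = {"NextToken": token} if token else {}
        page = glue.list_data_quality_rulesets(**kwargs)
        rulesets.extend(page.get("Rulesets", []))
        token = page.get("NextToken")
        if not token:
            return rulesets


def summarize(rulesets):
    """Reduce each summary to (name, rule count) for a quick overview."""
    return [(item["Name"], item.get("RuleCount", 0)) for item in rulesets]


if __name__ == "__main__":
    import boto3  # requires AWS credentials and a default region

    for name, count in summarize(list_rulesets(boto3.client("glue"))):
        print(f"{name}: {count} rules")
```

Note that the listing contains only metadata such as the name and rule count. The rule text itself comes from get-data-quality-ruleset, called once per name.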

You will then need to decide on your preferred way to call these APIs, e.g. call create-data-quality-ruleset via Lambda *(you need to pass the two parameters "name" and "ruleset")*.
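A Lambda function for the create call might look roughly like the sketch below. The event shape (`name` and `ruleset` keys) and the optional `glue` argument are assumptions made for illustration; the `ruleset` value is the DQDL text, such as the `Rules = [ ... ]` block shown in the question. The function's execution role would also need permission to call `glue:CreateDataQualityRuleset`.

```python
def lambda_handler(event, context, glue=None):
    """Create a Glue Data Quality ruleset from a (hypothetical) event payload.

    Expected event keys, assumed for this sketch:
      name    -> name of the ruleset to create
      ruleset -> DQDL text, e.g. 'Rules = [ IsComplete "id" ]'
    """
    name = event.get("name")
    ruleset = event.get("ruleset")
    if not name or not ruleset:
        raise ValueError('event must contain non-empty "name" and "ruleset"')

    if glue is None:
        import boto3  # available by default in the Lambda Python runtime

        glue = boto3.client("glue")

    response = glue.create_data_quality_ruleset(Name=name, Ruleset=ruleset)
    return {"created": response["Name"]}
```

Accepting the client as an optional third argument is only a convenience for local testing with a stub; Lambda itself invokes the handler with `event` and `context` alone.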

noza
answered a year ago
