Getting an empty list of databases from AWS Glue with the AWS SDK for Java 1.x under a policy with limited resource access

I have many databases in Glue. I am using a policy with limited resource access so that I can see only a specific set of databases (those whose names start with glue).

These are my policies and their permissions.

Resource policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "glue:GetConnections",
                "glue:GetTables",
                "glue:GetTableVersions",
                "glue:GetDatabases",
                "glue:GetTable",
                "glue:GetDatabase",
                "glue:GetJob",
                "glue:StartJobRun",
                "glue:BatchStopJobRun",
                "glue:GetCrawler"
            ],
            "Resource": [
                "arn:aws:glue:*:248135293344:connection/*",
                "arn:aws:glue:*:248135293344:job/*",
                "arn:aws:glue:*:248135293344:database/glue*",
                "arn:aws:glue:*:248135293344:catalog",
                "arn:aws:glue:*:248135293344:table/glue*"
            ]
        }
    ]
}

S3 policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketAcl"
            ],
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
        }
    ]
}

When I query through the AWS CLI with the following command, I get the databases I expect:

aws glue get-databases --query "DatabaseList[*].Name"

[Image: result of the AWS CLI query]

But when I run the same query with the AWS SDK for Java, I get no results. This is my code and its output:

package alation.awsglue.mde.extractor;

import alation.aws.auth.AWSAuthentication;
import alation.awsglue.Main;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.AWSGlueException;
import com.amazonaws.services.glue.model.Database;
import com.amazonaws.services.glue.model.GetDatabasesRequest;
import com.amazonaws.services.glue.model.GetDatabasesResult;
import org.apache.log4j.PropertyConfigurator;

import java.util.List;

public class sampleMDE {
    public static void main(String[] args) {
        System.setProperty("com.amazonaws.sdk.enableDefaultMetrics", "true");
        PropertyConfigurator.configure(Main.class.getClassLoader().getResource("log4j.properties"));
        AWSGlue amazonGlueClient = null;
        try {
            // Initialize the Glue client
            AWSCredentialsProvider awsCredProvider =
                new AWSAuthentication(
                    "<your_key>",
                    "<your_id>")
                    .getCredential();
            amazonGlueClient = AWSGlueClientBuilder.standard()
                .withCredentials(awsCredProvider)
                .withRegion(Regions.US_EAST_1).build();
        } catch (Exception e) {
            System.err.println("Failed to create Glue client: " + e.getMessage());
            e.printStackTrace();
            return;
        }
        try {
            // Create the request
            GetDatabasesRequest request = new GetDatabasesRequest();

            // Get the list of databases (a single call returns one page of results)
            GetDatabasesResult response = amazonGlueClient.getDatabases(request);

            // Process the response
            List<Database> databases = response.getDatabaseList();
            if (databases.isEmpty()) {
                System.out.println("No databases found.");
            }
            for (Database database : databases) {
                System.out.println("Database name: " + database.getName());
            }
        } catch (AWSGlueException e) {
            System.err.println("Error occurred while retrieving databases: " + e.getErrorMessage());
            e.printStackTrace();
        }
    }
}

[Image: result of the AWS Glue SDK call]

1 Answer

The issue you're facing is likely due to the resource policy you've set up for AWS Glue. Your policy allows access only to databases whose names start with "glue" (the pattern glue*). However, a GetDatabasesRequest without any additional parameters asks for all databases in the AWS account.

To get the list of databases with the prefix "glue*", you need to add a filter to the GetDatabasesRequest. Here's how you can modify your code:

try {
    // Create the request
    GetDatabasesRequest request = new GetDatabasesRequest();
    request.setCatalogId("<your_aws_account_id>"); // Replace with your AWS account ID
    request.withResourceShareType("ALL"); // Get databases across all resource shares

    // Add a prefix filter for "glue*"
    request.setDatabaseNamePrefix("glue"); // Set the prefix filter

    // Get the list of databases
    GetDatabasesResult response = amazonGlueClient.getDatabases(request);

    // Process the response
    List<Database> databases = response.getDatabaseList();
    if (databases.isEmpty()) {
        System.out.println("No databases found matching the prefix 'glue*'.");
    }
    for (Database database : databases) {
        System.out.println("Database name: " + database.getName());
    }
} catch (AWSGlueException e) {
    System.err.println("Error occurred while retrieving databases: " + e.getErrorMessage());
    e.printStackTrace();
}

In the modified code:

  1. We set the CatalogId to your AWS account ID, which is required for the prefix filter to work correctly.
  2. We add the withResourceShareType("ALL") parameter to get databases across all resource shares.
  3. We set the DatabaseNamePrefix to "glue" to filter the databases based on the prefix.

With these changes, you should be able to retrieve the list of databases whose names start with "glue", in line with the resource policy you've set up.

Note: Make sure to replace <your_aws_account_id> with your actual AWS account ID.

EXPERT
answered a year ago
  • Hi @Giovanni Lauria, setDatabaseNamePrefix is not a method in SDK 1.x, and my organization currently restricts me to this SDK. Besides, if the policy restricts access to selected databases, wouldn't getDatabases return only those databases?
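
One difference between the two paths may be worth ruling out, though it may not be the cause here: the AWS CLI follows NextToken automatically, while a single getDatabases call in SDK 1.x returns only one page. If a page comes back empty while a NextToken is still present, a single call would print "No databases found." The sketch below loops over all pages and applies the "glue" prefix on the client side, so it does not rely on setDatabaseNamePrefix. The names GlueDatabasePager, Page, and collectNames are hypothetical helpers made up for this illustration; the page fetcher is passed in as a function so that the loop can be tried without AWS credentials.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper (not part of the AWS SDK): collects database names
// across all pages and keeps only those that start with a given prefix.
public class GlueDatabasePager {

    // One page of results: the names on that page plus the token for the
    // next page (null when there are no more pages).
    public static final class Page {
        public final List<String> names;
        public final String nextToken;

        public Page(List<String> names, String nextToken) {
            this.names = names;
            this.nextToken = nextToken;
        }
    }

    // fetchPage receives the current token (null for the first call) and
    // returns the corresponding page. With SDK 1.x it could be wired roughly as:
    //   token -> {
    //       GetDatabasesResult r = amazonGlueClient.getDatabases(
    //           new GetDatabasesRequest().withNextToken(token));
    //       List<String> names = new ArrayList<>();
    //       for (Database d : r.getDatabaseList()) names.add(d.getName());
    //       return new Page(names, r.getNextToken());
    //   }
    public static List<String> collectNames(Function<String, Page> fetchPage, String prefix) {
        List<String> matches = new ArrayList<>();
        String token = null;
        do {
            Page page = fetchPage.apply(token);
            for (String name : page.names) {
                // Client-side prefix filter
                if (name.startsWith(prefix)) {
                    matches.add(name);
                }
            }
            token = page.nextToken;
        } while (token != null); // keep going even if a page was empty
        return matches;
    }
}
```

If this loop also returns nothing, comparing the credentials and region used by the CLI profile with those passed to AWSGlueClientBuilder would be a reasonable next check.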
