By using AWS re:Post, you agree to the Terms of Use
/AWS Lake Formation/

Questions tagged with AWS Lake Formation

Sort by most recent
  • 1
  • 90 / page

Browse through the questions and answers listed below or filter and sort to narrow down your results.

Connecting Users to AWS Athena and AWS Lake Formation via Tableau Desktop using the Simba Athena JDBC Driver and Okta as Identity Provider

Hello, due to the following Step by Step Guide provided by the official AWS Athena user-guide (Link at the End of the question), it should be possible to connect Tableau Desktop to Athena and Lake Formation via the Simba Athena JDBC Driver using Okta as Idp. The challenge that I am facing right now, is although i followed each step as documented in the Athena user-guide i can not make the connection work. The error message that i recieve whenever i try to connect Tableau Desktop states: > [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. The security token included in the request is invalid. [Execution ID not available] Invalid Username or Password. My athena.properties file to configure the driver on the Tableau via connection string URL looks as follows (User Name and Password are masked): ``` jdbc:awsathena://AwsRegion=eu-central-1; S3OutputLocation=s3://athena-query-results; AwsCredentialsProviderClass=com.simba.athena.iamsupport.plugin.OktaCredentialsProvider; idp_host=1234.okta.com; User=*****.*****@example.com; Password=******************; app_id=****************************; ssl_insecure=true; okta_mfa_type=oktaverifywithpush; LakeFormationEnabled=true; ``` The configuration settings used in here are from the official Simba Athena JDBC driver documentation (Version: 2.0.31). Furthermore i assigned the required permissions for my users and groups inside Lake Formation as stated in the Step by Step guide linked below. Right now I am not able to point out why I am not able to make the connection work. So I would be very greatful for any support / idea to find a solution on that topic. Best regards Link: https://docs.aws.amazon.com/athena/latest/ug/security-athena-lake-formation-jdbc-okta-tutorial.html#security-athena-lake-formation-jdbc-okta-tutorial-step-1-create-an-okta-account)
0
answers
0
votes
10
views
asked 2 days ago

Describe table in Athena fails with insufficient lake formation permissions

When I try to run the following query via the Athena JDBC Driver ```sql describe gitlab.issues ``` I get the following error: > [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. FAILED: SemanticException Unable to fetch table gitlab. Insufficient Lake Formation permission(s) on gitlab (Service: AmazonDataCatalog; Status Code: 400; Error Code: AccessDeniedException; Request ID: be6aeb1b-fc06-410d-9723-2df066307b35; Proxy: null) [Execution ID: a2534d22-c4df-49e9-8515-80224779bf01] the following query works: ```sql select * from gitlab.issues limit 10 ``` The role that is used has the `DESCRIBE` permission on the `gitlab` database and `DESCRIBE, SELECT` permissions on the table `issues`. It also has the following IAM permissions: ```json { "Version": "2012-10-17", "Statement": [ { "Action": [ "athena:BatchGetNamedQuery", "athena:BatchGetQueryExecution", "athena:CreatePreparedStatement", "athena:DeletePreparedStatement", "athena:GetDataCatalog", "athena:GetDatabase", "athena:GetNamedQuery", "athena:GetPreparedStatement", "athena:GetQueryExecution", "athena:GetQueryResults", "athena:GetQueryResultsStream", "athena:GetTableMetadata", "athena:GetWorkGroup", "athena:ListDatabases", "athena:ListNamedQueries", "athena:ListPreparedStatements", "athena:ListDataCatalogs", "athena:ListEngineVersions", "athena:ListQueryExecutions", "athena:ListTableMetadata", "athena:ListTagsForResource", "athena:ListWorkGroups", "athena:StartQueryExecution", "athena:StopQueryExecution", "athena:UpdatePreparedStatement" ], "Resource": "*", "Effect": "Allow" }, { "Action": [ "glue:BatchGetCustomEntityTypes", "glue:BatchGetPartition", "glue:GetCatalogImportStatus", "glue:GetColumnStatisticsForPartition", "glue:GetColumnStatisticsForTable", "glue:GetCustomEntityType", "glue:GetDatabase", "glue:GetDatabases", "glue:GetPartition", "glue:GetPartitionIndexes", "glue:GetPartitions", "glue:GetSchema", "glue:GetSchemaByDefinition", "glue:GetSchemaVersion", "glue:GetSchemaVersionsDiff", "glue:GetTable", "glue:GetTableVersion", "glue:GetTableVersions", "glue:GetTables", "glue:GetUserDefinedFunction", "glue:GetUserDefinedFunctions", "glue:ListCustomEntityTypes", "glue:ListSchemaVersions", "glue:ListSchemas", "glue:QuerySchemaVersionMetadata", "glue:SearchTables" ], "Resource": "*", "Effect": "Allow" }, { "Condition": { "ForAnyValue:StringEquals": { "aws:CalledVia": "athena.amazonaws.com" } }, "Action": [ "s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:ListMultipartUploadParts", "s3:AbortMultipartUpload", "s3:PutObject" ], "Resource": [ "arn:aws:s3:::aws-athena-query-results-123456789012-eu-west-1", "arn:aws:s3:::aws-athena-query-results-123456789012-eu-west-1/*", "arn:aws:s3:::aws-athena-federation-spill-123456789012-eu-west-1", "arn:aws:s3:::aws-athena-federation-spill-123456789012-eu-west-1/*" ], "Effect": "Allow" }, { "Action": [ "lakeformation:CancelTransaction", "lakeformation:CommitTransaction", "lakeformation:DescribeResource", "lakeformation:DescribeTransaction", "lakeformation:ExtendTransaction", "lakeformation:GetDataAccess", "lakeformation:GetQueryState", "lakeformation:GetQueryStatistics", "lakeformation:GetTableObjects", "lakeformation:GetWorkUnitResults", "lakeformation:GetWorkUnits", "lakeformation:StartQueryPlanning", "lakeformation:StartTransaction" ], "Resource": "*", "Effect": "Allow" }, { "Condition": { "ForAnyValue:StringEquals": { "aws:CalledVia": "athena.amazonaws.com" } }, "Action": "lambda:InvokeFunction", "Resource": "arn:aws:lambda:*:*:function:athena-federation-*", "Effect": "Allow" }, { "Condition": { "ForAnyValue:StringEquals": { "aws:CalledVia": "athena.amazonaws.com" } }, "Action": ["s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket"], "Resource": "*", "Effect": "Allow" } ] } ``` even if I make the role a LakeFormation Admin, Database Creator, assign Super Permissions to the table and database and add the AdministratorAccess IAM Policy to the role it still fails.
0
answers
0
votes
25
views
asked 13 days ago

_temp AWS lake formation blueprint pipeline tables appears to IAM user in athena editor although I didn't give this user permission on them

_temp lake formation blueprint pipeline tables appears to IAM user in Athena editor, although I didn't give this user permission on them below the policy granted to this IAM user,also in lake formation permsissions ,I didnt give this user any permissions on _temp tables: { "Version": "2012-10-17", "Statement": [ { "Sid": "Stmt1652364721496", "Action": [ "athena:BatchGetNamedQuery", "athena:BatchGetQueryExecution", "athena:GetDataCatalog", "athena:GetDatabase", "athena:GetNamedQuery", "athena:GetPreparedStatement", "athena:GetQueryExecution", "athena:GetQueryResults", "athena:GetQueryResultsStream", "athena:GetTableMetadata", "athena:GetWorkGroup", "athena:ListDataCatalogs", "athena:ListDatabases", "athena:ListEngineVersions", "athena:ListNamedQueries", "athena:ListPreparedStatements", "athena:ListQueryExecutions", "athena:ListTableMetadata", "athena:ListTagsForResource", "athena:ListWorkGroups", "athena:StartQueryExecution", "athena:StopQueryExecution" ], "Effect": "Allow", "Resource": "*" }, { "Effect": "Allow", "Action": [ "glue:GetDatabase", "glue:GetDatabases", "glue:BatchDeleteTable", "glue:GetTable", "glue:GetTables", "glue:GetPartition", "glue:GetPartitions", "glue:BatchGetPartition" ], "Resource": [ "*" ] }, { "Sid": "Stmt1652365282568", "Action": "s3:*", "Effect": "Allow", "Resource": [ "arn:aws:s3:::queryresults-all", "arn:aws:s3:::queryresults-all/*" ] }, { "Effect": "Allow", "Action": [ "lakeformation:GetDataAccess" ], "Resource": [ "*" ] } ] }
1
answers
0
votes
15
views
asked 2 months ago

AWS Lake Formation: (AccessDeniedException) when calling the GetTable operation: Insufficient Lake Formation permission(s) on table

I have implemented LakeFormation on my data bucket. I have a step function in which one step consists of running a GlueJob that reads and writes to the data catalog. I have upgraded my DataLake permissions as reported [here][1]. The Service Role that runs my Step Function has a root-type policy (granted just for debugging this issue): ```yaml Statement: - Effect: "Allow" Action: - "*" Resource: - "*" ``` On lake formation the service role has: - Administrator Rights - Database Creation rights (and Grantable) - Data Location access to the entire bucket (and Grantable) - Super rights on read and write Database (and Grantable) - Super rights on ALL tables within above Databases (and Grantable). The bucket is not encrypted. But, somehow, its access to the tables is denied with the error: ``` (AccessDeniedException) when calling the GetTable operation: Insufficient Lake Formation permission(s) on table ``` What's really strange is that the Glue Job succeeds when writing to some tables, and fails on others. And there is no real substantial difference across tables: all of them are under the same S3 prefix, parquet files, partitioned on the same key. Given the abundance of permissions granted, I am really clueless about what is causing the error. Please, send help. [1]: https://docs.aws.amazon.com/lake-formation/latest/dg/upgrade-glue-lake-formation.html
0
answers
0
votes
23
views
asked 3 months ago

Grouping of partitioned dataframes

I have a large dataset (table) with >1e9 records (rows) in Glue. The tables are partitioned by column A, which is a n-letters subtring of column B. For example: | A (partition key) | B | ... | | --- | --- | --- | | abc | abc123... | ... | | abc | abc123... | ... | | abc | abc456... | ... | | abc | abc456... | ... | | abc | abc456... | ... | | abc | abc789... | ... | | abc | abc789... | ... | | ... | ... | ... | | xyz | xyz123... | ... | | xyz | xyz123... | ... | | xyz | xyz123... | ... | | xyz | xyz456... | ... | | xyz | xyz456... | ... | | xyz | xyz456... | ... | | xyz | xyz789... | ... | | xyz | xyz789... | ... | There are >1e6 possible different values of column B and correspondingly significantly less for column A (maybe 1e3). Now I need to group records/rows by column B and the assumption is that it could be advantageous if the table was partitioned by column A, as it would be sufficient to load dataframes from single partitions for grouping instead of running the operation on the entire table. (Partitioning by column B would lead to unreasonably large numbers partitions.) Is my assumption right? How would I tell my Glue job the link between column A and B and profit from the partitioning? Alternatively I could handle the 1e3 dataframes (one for each partition) separately in my Glue job and merge them lateron. But this looks a bit complicated to me. This question is a follow-up question to https://repost.aws/questions/QUwxdl4EwTQcKBuL8MKCU0EQ/are-partitions-advantageous-for-groupby-operations-in-glue-jobs.
1
answers
0
votes
12
views
asked 4 months ago

authentication error with SAML + EMR + Lake formation

I have an errror when I try to login with an IDP (Auth0) and EMR integrated with Lake formation. I'm following the workshop [Lake formation & EMR integration](https://catalog.us-east-1.prod.workshops.aws/workshops/78572df7-d2ee-4f78-b698-7cafdb55135d/en-US/emr-integration ) I have configured an Auth0 account, aws IDP, EMR cluster (aws service) and data lake permissions with (idp users). But I have an error when I do the login with [EMR Zeppelin] (https://EMRMasterNodeDNS:8442/gateway/default/zeppelin/). I do the login with Auth0 and EMR but I can't do it with lakeformation. This is the error that I had on EMR proxy agent: `Caused by: java.lang.NullPointerException at org.apache.knox.gateway.util.SamlUtils.getSamlAwsRoleAttributeValues(SamlUtils.java:149) at org.apache.knox.gateway.pac4j.aws.AwsLakeFormationSamlImpl.getAwsCredentials(AwsLakeFormationSamlImpl.java:106) at org.apache.knox.gateway.pac4j.aws.AwsSamlHandler.processSamlResponse(AwsSamlHandler.java:78) at org.apache.knox.gateway.pac4j.filter.Pac4jDispatcherFilter.doFilter(Pac4jDispatcherFilter.java:234) at org.apache.knox.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:372) at org.apache.knox.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:272) at org.apache.knox.gateway.filter.XForwardedHeaderFilter.doFilter(XForwardedHeaderFilter.java:30) at org.apache.knox.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:61)` I think that I need to do the step 6 on the documentation [amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-lf-federation.html). But I don't know were I have to do this configuration. Any help? Thank you
0
answers
0
votes
7
views
asked 4 months ago

LakeFormation assumed role cannot access s3 objects during Athena requests ("Permission denied on S3 path" )

I have Delta Lake tables (using Symlink text input format) catalogued in Glue, stored in a S3 bucket, with all its resources tagged with LakeFormation Tags (for tag-based governance). The problem is that, although the users can see the database, tables, and metadata within Athena's catalogue, they cannot perform queries against the specific tables because of "Permission denied on S3 path" errors. LakeFormation has the data location registered for the datalake bucket, with AWSServiceRoleForLakeFormationDataAccess role. And this role has IAM permissions automatically added to the resources: ``` LakeFormationDataAccessServiceRolePolicy { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:ListAllMyBuckets" ], "Resource": [ "arn:aws:s3:::*" ] } ] } ``` and ``` LakeFormationDataAccessPolicyForS3 { "Version": "2012-10-17", "Statement": [ { "Sid": "LakeFormationDataAccessPermissionsForS3", "Effect": "Allow", "Action": [ "s3:PutObject", "s3:GetObject", "s3:DeleteObject" ], "Resource": [ "arn:aws:s3:::MYBUCKET/*" ] }, { "Sid": "LakeFormationDataAccessPermissionsForS3ListBucket", "Effect": "Allow", "Action": [ "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::MYBUCKET" ] } ] } ``` I have also tried registering the data location with a role with Admin permissions (Action "*" and Resources "*"), but even so the same error is thrown. Looking through the CloudTrail logs, I found that LakeFormation passes custom policies to the role when running AssumeRole: ``` "policy": "{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Action\": [\"s3:GetObject\"],\n \"Effect\": \"Allow\",\n \"Resource\": [\"arn:aws:s3:::MYBUCKET\",\n \"Condition\": {\"ForAnyValue:StringLike\":{\"s3:prefix\":[\"MYTABLE/_symlink_format_manifest\",\"MYTABLE/_symlink_format_manifest/*\"]}}\n },\n {\n \"Action\": [\"kms:Decrypt\"],\n \"Effect\": \"Allow\",\n \"Resource\": [\"*\"],\n \"Condition\": {\"StringEquals\":{\"kms:ViaService\":[\"s3.us-east-2.amazonaws.com\"]}}\n } ]\n}" } ``` This seems like a malformatted json string that is being passed to the assumed role. Can this be causing the errors I'm having? And does anyone have had this issue before? PS: I have manually removed ACL control over the S3 bucket and objects. Still same behavior. The error is not shown if I remove the data location, and Athena ignores Lake Formation.
1
answers
0
votes
298
views
asked 5 months ago
  • 1
  • 90 / page