1 Answer
For EMR Studio to connect to EMR on EKS, you need to create a managed endpoint. To use the AWS Glue Data Catalog, configure the endpoint so that Spark uses Hive as its catalog implementation and the Hive metastore client points to Glue. The following AWS CLI command creates such an endpoint:
aws emr-containers create-managed-endpoint \
    --type JUPYTER_ENTERPRISE_GATEWAY \
    --virtual-cluster-id ${virtclusterid} \
    --name virtual-emr-endpoint \
    --execution-role-arn ${role_arn} \
    --release-label ${emr_release_label} \
    --certificate-arn ${certarn} \
    --region ${region} \
    --configuration-overrides '{
        "applicationConfiguration": [
            {
                "classification": "spark-defaults",
                "properties": {
                    "spark.hadoop.hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory",
                    "spark.sql.catalogImplementation": "hive"
                }
            }
        ]
    }'
For a description of each flag, see the documentation: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-create-eks-cluster.html
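If quoting the inline JSON becomes awkward, one option is to keep the overrides in a file and pass it with the AWS CLI's file:// prefix. The sketch below is only an illustration: the file name glue-catalog-overrides.json and the create_endpoint function are hypothetical names, and the shell variables are assumed to be set as in the command above. The function is defined but not called, since running it requires valid AWS credentials.

```shell
#!/usr/bin/env bash

# Hypothetical file name; any writable path works.
overrides_file="glue-catalog-overrides.json"

# Same spark-defaults classification as in the inline JSON above.
cat > "$overrides_file" <<'EOF'
{
  "applicationConfiguration": [
    {
      "classification": "spark-defaults",
      "properties": {
        "spark.hadoop.hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory",
        "spark.sql.catalogImplementation": "hive"
      }
    }
  ]
}
EOF

# Hypothetical wrapper, not invoked here. It assumes virtclusterid, role_arn,
# emr_release_label, certarn and region are already set in the environment.
create_endpoint() {
  aws emr-containers create-managed-endpoint \
    --type JUPYTER_ENTERPRISE_GATEWAY \
    --virtual-cluster-id "${virtclusterid}" \
    --name virtual-emr-endpoint \
    --execution-role-arn "${role_arn}" \
    --release-label "${emr_release_label}" \
    --certificate-arn "${certarn}" \
    --region "${region}" \
    --configuration-overrides "file://${overrides_file}"
}
```

Calling create_endpoint from the directory that contains the file should behave like the inline version, with the advantage that the JSON can be reviewed and version-controlled separately.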
answered 3 years ago