
Analytics

AWS provides the broadest selection of analytics services for all your data analytics needs, enabling organizations of all sizes and industries to reinvent their businesses with data. For data movement, data storage, data lakes, big data analytics, machine learning, and everything in between, AWS offers purpose-built services that deliver the best price performance and scalability at the lowest cost.

Recent questions


Metadata service is unstable: connection timeout, "Failed to connect to service endpoint", etc.

Recently, our long-running jobs have been hitting metadata issues frequently. The exceptions vary, but they all point to the EC2 metadata service: either the connection to the endpoint fails, the connection times out, or the SDK complains that a region must be specified explicitly while building the client. The job runs on EMR 6.0.0 in Tokyo with the correct role attached, and it had been running fine for months before it recently became unstable. So my question is: how can we monitor the health of the metadata service, e.g. request QPS, success rate, etc.?

A few call stacks:

```
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [com.amazon.ws.emr.hadoop.fs.guice.UserGroupMappingAWSSessionCredentialsProvider@4a27ee0d: null, com.amazon.ws.emr.hadoop.fs.HadoopConfigurationAWSCredentialsProvider@76659c17: null, com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.auth.InstanceProfileCredentialsProvider@5c05c23d: Failed to connect to service endpoint: ]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:136)
```

```
com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
    at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462)
    at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424)
    at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
```

```
com.amazonaws.SdkClientException: Unable to execute HTTP request: mybucket.s3.ap-northeast-1.amazonaws.com
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1189) ~[aws-java-sdk-bundle-1.11.711.jar:?]
Caused by: java.net.UnknownHostException: mybucket.s3.ap-northeast-1.amazonaws.com
    at java.net.InetAddress.getAllByName0(InetAddress.java:1281) ~[?:1.8.0_242]
    at java.net.InetAddress.getAllByName(InetAddress.java:1193) ~[?:1.8.0_242]
    at java.net.InetAddress.getAllByName(InetAddress.java:1127) ~[?:1.8.0_242]
```

```
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Failed to connect to service endpoint:
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
```
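As a starting point, here is a minimal sketch of the kind of health probe that could be run on a node to measure latency and success rate against the instance metadata service (IMDSv2 token flow; the sample count, interval, and timeout below are arbitrary placeholders, not values from the job):

```python
# Hypothetical IMDSv2 health probe: requests a session token, then fetches the
# instance-profile credentials listing, and reports per-request latency plus an
# overall success rate. Sampling parameters are arbitrary.
import time
import urllib.request

IMDS = "http://169.254.169.254"


def probe(timeout=2.0):
    """Return (ok, elapsed_seconds) for one round trip to the metadata service."""
    start = time.monotonic()
    try:
        token_req = urllib.request.Request(
            f"{IMDS}/latest/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        token = urllib.request.urlopen(token_req, timeout=timeout).read().decode()
        creds_req = urllib.request.Request(
            f"{IMDS}/latest/meta-data/iam/security-credentials/",
            headers={"X-aws-ec2-metadata-token": token},
        )
        urllib.request.urlopen(creds_req, timeout=timeout).read()
        return True, time.monotonic() - start
    except Exception:
        return False, time.monotonic() - start


if __name__ == "__main__":
    samples, ok_count = 60, 0
    for _ in range(samples):
        ok, elapsed = probe()
        ok_count += ok
        print(f"ok={ok} latency={elapsed * 1000:.0f}ms")
        time.sleep(1)
    print(f"success rate: {ok_count}/{samples}")
```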
0
answers
0
votes
1
views
Hubery
asked 19 hours ago

AWS Elasticsearch returning different results in Kibana and in a browser HTTP request for the exact same query

I am running this query in Kibana:

```
GET nearby/_search
{
  "from": 20,
  "size": 20,
  "query": {
    "bool": {
      "must": { "match": { "X": "B" } },
      "filter": {
        "geo_distance": {
          "distance": "3.0km",
          "PO": { "lat": 26.8466937, "lon": 80.94616599999999 }
        }
      }
    }
  }
}
```

The response is what I expect: all hits have X=B and 20 results are returned. I have removed some fields and some documents to keep the post short:

```
{
  "took" : 228,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 71, "relation" : "eq" },
    "max_score" : 2.5032558,
    "hits" : [
      { "_index" : "nearby", "_type" : "_doc", "_id" : "n3YeKvJqvpu1okE7QDBp", "_score" : 2.2831507, "_source" : { "PO" : "tuc89gfn0", "X" : "B" } },
      { "_index" : "nearby", "_type" : "_doc", "_id" : "5FPJ2eyr0YoQ9F0xPYzW", "_score" : 2.2831507, "_source" : { "PO" : "tuc89gfn0", "X" : "B" } },
      { "_index" : "nearby", "_type" : "_doc", "_id" : "QJflnqGKF1dpOjEaY8vy", "_score" : 2.2831507, "_source" : { "PO" : "tuc89gvr8", "X" : "B" } }
    ]
  }
}
```

This is the browser request, with the query unchanged:

```
https://search-wul8888888.ap-south-1.es.amazonaws.com/nearby/_search?q="{"from":20,"size":20,"query":{"bool":{"must":{"match":{"X":"B"}},"filter":{"geo_distance":{"distance":"3km","PO":{"lat":26.8466937,"lon":80.94616599999999}}}}}}"
```

And this is the response. As you can see, most documents have X=I, i.e. the must/match clause is not honoured. Second, I am sending size=20 but get only 10 results, which is the default (extra documents removed to keep the post short):

```
{"took":149,"timed_out":false,
 "_shards":{"total":5,"successful":5,"skipped":0,"failed":0},
 "hits":{"total":{"value":802,"relation":"eq"},"max_score":8.597985,
  "hits":[
   {"_index":"nearby","_type":"_doc","_id":"iJ71MNq4a4TCkcT4vWSP","_score":8.597985,"_source":{"PO":"tuc8unwp7","X":"I","BI":"tRhKrWiDxFSt57tIH7g5"}},
   {"_index":"nearby","_type":"_doc","_id":"PmngNe8WcC8aSraDMluz","_score":7.3973455,"_source":{"PO":"tuc8uhc5z","X":"I","BI":"m3S6yEicvu1HFI1UOTIb"}},
   {"_index":"nearby","_type":"_doc","_id":"lDqjflPZGYsymPGU8iHD","_score":7.1520696,"_source":{"PO":"tuc89wpg5","X":"B"}},
   {"_index":"nearby","_type":"_doc","_id":"QIf2KsO4FpCjT3m7kH4I","_score":6.402881,"_source":{"PO":"tuc8uhc5z","X":"I","BI":"m3S6yEicvu1HFI1UOTIb"}}
  ]}}
```

Please help; I have tried everything but am not able to understand this. My hunch is that I am being returned a stale/old result every time, but I don't know how to fix that. Even in Chrome incognito mode the browser result is the same as above, and even if I change the radius in the browser the result stays the same, which suggests the browser queries are getting a stale result.
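For comparison, a minimal sketch of sending the same JSON query as a request body rather than in the `q=` URL parameter (the `_search` `q` parameter expects a Lucene query string, not a JSON body). The endpoint, index, and query are copied from above; authentication/request signing is omitted and would need to be added for a real domain:

```python
# Hypothetical request-body search against the domain from the question.
# Assumes the `requests` package is installed and the domain allows
# unauthenticated access (otherwise SigV4 signing or basic auth is needed).
import json
import requests

ENDPOINT = "https://search-wul8888888.ap-south-1.es.amazonaws.com"

query = {
    "from": 20,
    "size": 20,
    "query": {
        "bool": {
            "must": {"match": {"X": "B"}},
            "filter": {
                "geo_distance": {
                    "distance": "3.0km",
                    "PO": {"lat": 26.8466937, "lon": 80.94616599999999},
                }
            },
        }
    },
}

resp = requests.post(
    f"{ENDPOINT}/nearby/_search",
    headers={"Content-Type": "application/json"},
    data=json.dumps(query),
)
print(resp.json()["hits"]["total"])
```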
0
answers
0
votes
2
views
PalmGini
asked 2 days ago

Glue ETL job failing with error: o122.relationalize. com.amazonaws.services.glue.util.HadoopDataSourceJobBookmarkState cannot be cast to com.amazonaws.services.glue.util.RelationalizeJobBookmarkState

Hello, I have an ETL job that converts JSON -> Parquet and was working up until I enabled job bookmarking via CloudFormation. Now when my job runs I get the following error:

```
An error occurred while calling o122.relationalize. com.amazonaws.services.glue.util.HadoopDataSourceJobBookmarkState cannot be cast to com.amazonaws.services.glue.util.RelationalizeJobBookmarkState
```

# cloudformation.yml

```
AnalyticsGlueJob:
  Type: 'AWS::Glue::Job'
  Properties:
    Role: !Ref AnalyticsGlueRole
    Command:
      Name: 'glueetl'
      ScriptLocation: !Sub 's3://${AnalyticsS3Bucket}/analytics_etl.py'
    GlueVersion: '3.0'
    DefaultArguments:
      '--connection_type': 's3'
      '--db_name': !Ref AnalyticsGlueDatabase
      '--enable-metrics': ''
      '--job-bookmark-option': 'job-bookmark-enable'
      '--s3_dest': !Sub 's3://${AnalyticsS3ParquetBucket}/logs/'
      '--table_name': 'logs'
      '--temp_dir': !Sub 's3://${AnalyticsS3ParquetBucket}/temp/'
```

# etl_job.py

```
sc = SparkContext()
gc = GlueContext(sc)
spark = gc.spark_session
job = Job(gc)
job.init(args['JOB_NAME'], args)

dyf = gc.create_dynamic_frame.from_catalog(
    database=DATABASE,
    table_name=TABLE_NAME,
    transformation_ctx='bookmark_ctx',
)

# transform certain field types.
dyf = ApplyMapping.apply(
    frame=dyf,
    mappings=[
        ('authorizer.error', 'string', 'authorizer.error', 'string'),
        ('authorizer.latency', 'string', 'authorizer.latency', 'int'),
        ('authorizer.principal', 'string', 'authorizer.principal', 'string'),
        ('authorizer.requestId', 'string', 'authorizer.requestId', 'string'),
        ('authorizer.status', 'string', 'authorizer.status', 'int'),
        ('caller', 'string', 'caller', 'string'),
        ('httpmethod', 'string', 'httpmethod', 'string'),
        ('ip', 'string', 'ip', 'string'),
        ('partition_0', 'string', 'partition_0', 'string'),
        ('partition_1', 'string', 'partition_1', 'string'),
        ('partition_2', 'string', 'partition_2', 'string'),
        ('partition_3', 'string', 'partition_3', 'string'),
        ('path', 'string', 'path', 'string'),
        ('protocol', 'string', 'protocol', 'string'),
        ('requestid', 'string', 'requestid', 'string'),
        ('requesttime', 'string', 'requesttime', 'timestamp'),
        ('responselength', 'string', 'responselength', 'int'),
        ('status', 'string', 'status', 'int'),
        ('user', 'string', 'user', 'string'),
        ('useragent', 'string', 'useragent', 'string'),
    ],
    transformation_ctx='applymapping_ctx',
)

# flatten nested json.
dyf = Relationalize.apply(
    frame=dyf,
    staging_path=TEMP_DIR,
    name='root',
    transformation_ctx='relationalize_ctx',
)
dyf = dyf.select('root')

# write partitioned parquet files.
dyf = gc.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type=CONNECTION_TYPE,
    connection_options={
        'path': S3_DEST,
        'partitionKeys': [
            'partition_0',
            'partition_1',
            'partition_2',
            'partition_3',
        ],
    },
    format='glueparquet',
    transformation_ctx='parquet_ctx',
)

job.commit()
```

The error mentions `Relationalize`, which I'm using to flatten a nested JSON structure, but I'm not sure why it's failing. Any help appreciated!

## edit

I think I may have it working, but I'm noticing that if the job runs without any new data in the source bucket, it fails and throws an error rather than showing as succeeded:

```
IllegalArgumentException: Partition column partition_0 not found in schema
```

When new records arrive in the source bucket and the job runs again, it works and shows as succeeded. Still interested in feedback!
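For completeness, the script fragment above assumes `args` and the upper-case constants are populated elsewhere. A minimal sketch of that argument handling, using the standard `getResolvedOptions` utility with names mirroring the `DefaultArguments` in the template (the exact variable names are assumptions, not taken from the original script):

```python
# Hypothetical argument handling for the Glue job above.
import sys
from awsglue.utils import getResolvedOptions

# getResolvedOptions strips the leading '--' from each DefaultArguments key.
args = getResolvedOptions(
    sys.argv,
    ["JOB_NAME", "connection_type", "db_name", "s3_dest", "table_name", "temp_dir"],
)

DATABASE = args["db_name"]
TABLE_NAME = args["table_name"]
TEMP_DIR = args["temp_dir"]
S3_DEST = args["s3_dest"]
CONNECTION_TYPE = args["connection_type"]
```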
1
answers
0
votes
3
views
borg
asked 3 days ago

S3 event notification Glue Crawler fails with Internal Service Exception

We are using a Glue Crawler and switched to S3 event notifications about two months ago. It has been failing fairly regularly with an Internal Service Exception, so we had to switch back to scanning all folders, run the crawler that way, and then switch back to S3 event notifications again. Previously the problem seemed to happen when there were many events in SQS (40k+); event-based crawls worked OK on smaller amounts. Since the start of 2022, the event-based crawler has been failing constantly with an Internal Service Exception. Even after a successful all-folders scan and switching back to event-based mode with just over 1k events, it still fails, with no other errors in the log. The full S3 folder scan crawler runs successfully, but it takes 6+ hours to complete, which is not acceptable for us. I did check https://aws.amazon.com/premiumsupport/knowledge-center/glue-crawler-internal-service-exception/ for this issue, but none of the conditions mentioned there exist in our case, and as I said, the full-scan crawler runs without issues.

Log:

```
INFO : Crawler configured with Configuration
{
  "Version": 1,
  "CrawlerOutput": {
    "Partitions": { "AddOrUpdateBehavior": "InheritFromTable" }
  },
  "Grouping": {
    "TableGroupingPolicy": "CombineCompatibleSchemas",
    "TableLevelConfiguration": 2
  }
}
and SchemaChangePolicy
{
  "UpdateBehavior": "UPDATE_IN_DATABASE",
  "DeleteBehavior": "DEPRECATE_IN_DATABASE"
}
ERROR : Internal Service Exception
```

Any help would be appreciated.
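Since the failures seem to correlate with how many events have accumulated, a minimal sketch of checking the backlog in the crawler's SQS queue before a run (the queue URL below is a placeholder, not our real queue):

```python
# Hypothetical check of the S3 event notification backlog for the crawler's
# SQS queue, using the standard get_queue_attributes API.
import boto3

sqs = boto3.client("sqs")

# Placeholder queue URL; replace with the queue configured on the crawler.
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/glue-crawler-events"

attrs = sqs.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=[
        "ApproximateNumberOfMessages",
        "ApproximateNumberOfMessagesNotVisible",
    ],
)
print(attrs["Attributes"])
```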
2
answers
0
votes
7
views
AWS-User-4658641
asked 4 days ago
