Amazon AWS Hadoop: how to download a file from S3 in a map-reduce program

0

I am trying to write a fairly simple map-reduce program in which the map task needs to load a small file from S3 (small enough to be held in local memory).

The map task checks each input line against this file and generates a feature vector (and so on...).

I am having trouble getting my setup method to download and access this file.
I think the problem is with forwarding the credentials, but it might just as well be in the way I am accessing the file in the setup function (code snippets below).

In my map-reduce logs, the error I'm getting is:
"profile file cannot be null"
It refers to this line:

            AWSCredentialsProvider credentialsProvider = new AWSStaticCredentialsProvider(new ProfileCredentialsProvider().getCredentials());

I tried many other approaches and got nowhere. Any guidance would be greatly appreciated.

In my main Java file (running locally on my PC) I do:

AmazonElasticMapReduce mapReduce = AmazonElasticMapReduceClientBuilder.standard().withRegion("us-east-1").build();
HadoopJarStepConfig hadoopJarStep = new HadoopJarStepConfig()
        .withJar("MYPATH")  // This should be a full map reduce application.
        .withMainClass("MYMAIN");

In the map-reduce jar (the one that runs on the EC2 task) I define the relevant jobs and the job control, and in the setup function of the map class I do:

AWSCredentialsProvider credentialsProvider = new AWSStaticCredentialsProvider(new ProfileCredentialsProvider().getCredentials());
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withCredentials(credentialsProvider)
        .withRegion("us-east-1")
        .build();
String S3Bucket = "MYBUCKET";  // the bucket where the file is located
String S3Key = "MYKEY"; // the name of the file
S3Object object = s3.getObject(new GetObjectRequest(S3Bucket, S3Key)); // bucket, key
S3ObjectInputStream summaryInputStream = object.getObjectContent();
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(summaryInputStream));
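
For reference, the part of setup that turns the stream into something each map call can check lines against does not depend on S3 at all. Below is a minimal sketch in plain Java, assuming the small file is UTF-8 text with one entry per line and that an exact-match lookup is what is needed. The class name `SmallFileLookup` and the method `loadLines` are hypothetical and not part of the original program; in the mapper, the stream would be the one returned by `object.getObjectContent()`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: loads a small text file into memory once (e.g. in setup)
// so that every map() call can do a cheap in-memory lookup.
final class SmallFileLookup {

    // Reads every non-blank line from the stream into a set.
    // try-with-resources closes the reader (and the underlying stream).
    static Set<String> loadLines(InputStream in) {
        Set<String> entries = new HashSet<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    entries.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        // Stand-in for the S3 object content, so the sketch runs without AWS access.
        InputStream sample =
            new ByteArrayInputStream("alpha\nbeta\n".getBytes(StandardCharsets.UTF_8));
        Set<String> entries = loadLines(sample);
        System.out.println(entries.contains("beta")); // prints "true"
    }
}
```

Keeping the set in a field of the mapper means the file is read once per task rather than once per input line.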

I also tried hard-coding my credentials in the main program (running locally on my PC) like this:

String access_key = "aws_access_key_id=MYKEY";
String secret_key = "aws_secret_access_key=MYSECRETKEY";

BasicAWSCredentials creds = new BasicAWSCredentials(access_key, secret_key);
AWSStaticCredentialsProvider awsCred = new AWSStaticCredentialsProvider(creds);
AmazonElasticMapReduce mapReduce =
        AmazonElasticMapReduceClientBuilder.standard().withRegion("us-east-1").withCredentials(awsCred).build();

This also failed. (I also tried leaving the "aws_access_key_id=" and "aws_secret_access_key=" prefixes out of the strings.)

Thanks!

noampa
asked 4 years ago · 340 views
3 Answers
0
Accepted Answer

Your first issue looks like a credentials problem with the AWS SDK for Java V1.

Also, I would recommend moving to V2: https://github.com/awsdocs/aws-doc-sdk-examples/tree/master/javav2

For V1, try following this code example, which creates a bucket, so we can focus on your credentials issue:

https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/java/example_code/s3/src/main/java/aws/example/s3/CreateBucket.java

I recommend placing your credentials in the following location, as discussed in the docs:

The default credential profiles file, typically located at ~/.aws/credentials (location can vary per platform), and shared by many of the AWS SDKs and by the AWS CLI. The AWS SDK for Java uses the ProfileCredentialsProvider to load these credentials.
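
For reference, the credentials file is a small INI-style text file: a `[profile]` header followed by `aws_access_key_id` and `aws_secret_access_key` entries. Those names are keys in the file, not part of the key values themselves. The sketch below is plain Java and is not the SDK's own parser; `ProfileFileSketch` and `readProfile` are hypothetical names, and the sample values are placeholders. It only illustrates how one profile splits into keys and values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the credentials file layout; not the SDK parser.
final class ProfileFileSketch {

    // Returns the key/value pairs found under [profileName] in the given file text.
    static Map<String, String> readProfile(String fileText, String profileName) {
        Map<String, String> values = new HashMap<>();
        boolean inProfile = false;
        for (String raw : fileText.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith(";")) {
                continue; // skip blank lines and comments
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                // A new section starts; track whether it is the one we want.
                String name = line.substring(1, line.length() - 1).trim();
                inProfile = name.equals(profileName);
                continue;
            }
            int eq = line.indexOf('=');
            if (inProfile && eq > 0) {
                values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "[default]\n"
            + "aws_access_key_id = EXAMPLE_KEY_ID\n"
            + "aws_secret_access_key = EXAMPLE_SECRET\n";
        Map<String, String> profile = readProfile(sample, "default");
        System.out.println(profile.get("aws_access_key_id")); // prints "EXAMPLE_KEY_ID"
    }
}
```

When credentials are passed in code instead, as in `new BasicAWSCredentials(accessKey, secretKey)`, only the values go in, without the `aws_access_key_id=` or `aws_secret_access_key=` prefix.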

Then you can create your S3 service client like this:

AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.DEFAULT_REGION)
        .build();

I just ran this Java V1 example and it created a bucket without any problems.

Try this and post back what happened.
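
A client built this way, with no explicit `withCredentials(...)`, relies on the SDK's default credential provider chain, which looks at environment variables, Java system properties, and the profile file before falling back to container or instance-role credentials. That fallback is a plausible reason the same code can work on a cluster node that has no `~/.aws/credentials` file. As a rough diagnostic, here is a sketch in plain Java that does not call the SDK and only reports which of the first three sources are visible to a process. The class name `CredentialSourceCheck` is hypothetical, and the variable and property names are the ones SDK V1 reads as far as I know, so treat them as assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical diagnostic; it does not probe container or instance-role credentials.
final class CredentialSourceCheck {

    // Lists which credential sources appear to be present, in the order checked.
    static List<String> visibleSources(Map<String, String> env, Properties props, Path profileFile) {
        List<String> found = new ArrayList<>();
        // Assumed environment variable names.
        if (env.containsKey("AWS_ACCESS_KEY_ID") && env.containsKey("AWS_SECRET_ACCESS_KEY")) {
            found.add("environment variables");
        }
        // Assumed Java system property names.
        if (props.getProperty("aws.accessKeyId") != null && props.getProperty("aws.secretKey") != null) {
            found.add("system properties");
        }
        if (Files.isReadable(profileFile)) {
            found.add("profile file");
        }
        return found;
    }

    public static void main(String[] args) {
        Path defaultProfile = Paths.get(System.getProperty("user.home"), ".aws", "credentials");
        System.out.println(visibleSources(System.getenv(), System.getProperties(), defaultProfile));
    }
}
```

If "profile file" is missing from the list on the machine that runs the mapper, that would be consistent with the "profile file cannot be null" error when `ProfileCredentialsProvider` is used explicitly.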

answered 4 years ago
0

Thanks a lot, that did the trick

noampa
answered 4 years ago
0

I am glad it worked for you!!!

Edited by: PowerUserScott on Jul 9, 2020 10:24 AM

answered 4 years ago
