As I understand it, only the master node has external connectivity, so you cannot run DistCp even from inside the cluster.
I think the easiest option would be a script that runs on the master, pulls files onto the local disk, and uses the standard `aws s3` command-line client to upload them (tweaking the bandwidth and parallelism settings a bit).
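A minimal sketch of that staging script, assuming a hypothetical HDFS source path, staging directory, and bucket name (adjust all three for your environment):

```shell
#!/bin/bash
# Copy HDFS files to S3 via the master's local disk.
# HDFS_SRC, LOCAL_TMP, and S3_DEST are assumptions -- replace with your own.
set -euo pipefail

HDFS_SRC="/user/hadoop/export"          # HDFS directory to copy
LOCAL_TMP="/mnt/s3-staging"             # large local volume on the master
S3_DEST="s3://my-bucket/hadoop-export"  # hypothetical destination bucket

# Tune the AWS CLI for throughput: more concurrent parts, larger chunks.
aws configure set default.s3.max_concurrent_requests 20
aws configure set default.s3.multipart_chunksize 64MB

mkdir -p "$LOCAL_TMP"

# Stage one file at a time so the local volume never has to hold
# the whole dataset at once.
hdfs dfs -ls -C "$HDFS_SRC" | while read -r hdfs_file; do
    name=$(basename "$hdfs_file")
    hdfs dfs -get "$hdfs_file" "$LOCAL_TMP/$name"
    aws s3 cp "$LOCAL_TMP/$name" "$S3_DEST/$name"
    rm -f "$LOCAL_TMP/$name"
done
```

Deleting each file after its upload keeps the staging footprint to a single file, which matters when the dataset is larger than the master's disk.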
The other option, if you want to avoid the temporary local copy, is to run DistCp in local mode: it then runs only on the master but can still access HDFS and S3 directly.
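That can be done by forcing the local MapReduce runner, so no tasks are scheduled on the unreachable worker nodes. The paths and bucket below are hypothetical; on EMR the destination scheme is `s3://`, while vanilla Hadoop typically uses `s3a://`:

```shell
# Run DistCp entirely on the master by using the local MapReduce framework.
hadoop distcp \
  -Dmapreduce.framework.name=local \
  hdfs:///user/hadoop/export \
  s3a://my-bucket/hadoop-export/
```

Note that local mode is single-node, so you lose DistCp's usual parallelism across the cluster; throughput is bounded by the master's network and disk.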
AFAIK, the web-based solutions you propose for accessing the cluster externally would require the DataNodes to be reachable (the master doesn't actually hold the data).
A workaround would be a proxy service like Apache Knox, but handling all the security there is too much hassle compared with simply running a script on the cluster master.