I can't seem to redirect stderr from spark-submit to a file on S3...
1. As a basic test, this works:
sh-4.2$ cat ./testit.sh
#!/bin/sh
echo hello
sh-4.2$ ./testit.sh | aws s3 cp - s3://mybucket/test.out
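To convince myself the pipe itself was fine, I also tried a variant of the test script with an explicit stderr line (testit2.sh is just my own throwaway name, not anything official):

#!/bin/sh
# testit2.sh: only the stdout line should travel through a pipe;
# the stderr line should land on the terminal instead.
echo "this goes to stdout"
echo "this goes to stderr" 1>&2

Running ./testit2.sh | aws s3 cp - s3://mybucket/test.out uploads only the stdout line, which makes me think a plain pipe only ever sees stdout.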
2. This creates a file in S3 upon submission, but the file is empty when the command completes:
sudo spark-submit \
--class ${PACKAGE}.${CLASS} \
--deploy-mode client \
--master yarn \
--jars $JARS \
--executor-cores 2 \
--conf spark.ui.enabled=false \
--conf spark.network.timeout=1200000 \
$JAR \
$ARGS \
| aws s3 cp - s3://mybucket/test.out
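My guess is that spark-submit writes its log output to stderr rather than stdout, so the pipe sees nothing and aws s3 cp creates the object with empty content. If that's right, merging stderr into stdout before the pipe should work; this is just a sketch I haven't verified end to end:

# 2>&1 merges stderr into stdout so both streams flow through the pipe
sudo spark-submit \
--class ${PACKAGE}.${CLASS} \
--deploy-mode client \
--master yarn \
--jars $JARS \
--executor-cores 2 \
--conf spark.ui.enabled=false \
--conf spark.network.timeout=1200000 \
$JAR \
$ARGS 2>&1 \
| aws s3 cp - s3://mybucket/test.out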
3. This does not create a file on S3 at all:
sudo spark-submit \
--class ${PACKAGE}.${CLASS} \
--deploy-mode client \
--master yarn \
--jars $JARS \
--executor-cores 2 \
--conf spark.ui.enabled=false \
--conf spark.network.timeout=1200000 \
$JAR \
$ARGS \
2> aws s3 cp - s3://a206739-etls3-dsdc-ci-use1/test.out
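As far as I understand, 2> expects a filename rather than a command, so here the shell redirects stderr to a local file literally named "aws" and hands "s3 cp - s3://..." to spark-submit as extra arguments, which would explain why nothing reaches S3. If the shell is actually bash (the sh-4.2 prompt suggests bash running as sh), process substitution might express what I actually meant; again just a sketch, and it needs bash rather than a strict POSIX sh:

# 2> >(cmd) sends stderr to the stdin of cmd instead of to a file
# (bash process substitution)
sudo spark-submit \
--class ${PACKAGE}.${CLASS} \
--deploy-mode client \
--master yarn \
--jars $JARS \
--executor-cores 2 \
--conf spark.ui.enabled=false \
--conf spark.network.timeout=1200000 \
$JAR \
$ARGS \
2> >(aws s3 cp - s3://mybucket/test.out)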
4. This creates a file locally. I could then copy it to S3 (sketched below), but that's not what I want to do:
sudo spark-submit \
--class ${PACKAGE}.${CLASS} \
--deploy-mode client \
--master yarn \
--jars $JARS \
--executor-cores 2 \
--conf spark.ui.enabled=false \
--conf spark.network.timeout=1200000 \
$JAR \
$ARGS \
2> test.out
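For completeness, the two-step version I'm trying to avoid would look something like this (&& runs the upload only after spark-submit exits successfully):

# write stderr to a local file, then upload it in a second step
sudo spark-submit \
--class ${PACKAGE}.${CLASS} \
--deploy-mode client \
--master yarn \
--jars $JARS \
--executor-cores 2 \
--conf spark.ui.enabled=false \
--conf spark.network.timeout=1200000 \
$JAR \
$ARGS \
2> test.out \
&& aws s3 cp test.out s3://mybucket/test.out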
I don't understand why #2 creates the file but doesn't send any output to it. Assuming my guess about stderr is right, is there a way to stream spark-submit's output straight to S3 like this?