MWAA - can I use external programs (non-python) - sqlcmd or bcp?

0

Can I install or deploy a separate tool such as sqlcmd or bcp into MWAA? These are command-line tools, available on Linux, that help move data in and out of MS SQL Server. We use them on-prem on our Airflow Linux boxes, but we don't know whether this is supported in MWAA.

Basically, I need to install a non-pip dependency on MWAA.

https://docs.microsoft.com/en-us/sql/tools/sqlcmd-utility?view=sql-server-ver16
https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-migrate-bcp?view=sql-server-ver16

Asked 2 years ago, 521 views
2 Answers
1
Accepted Answer

Hi,

Please note that you can run bash commands on an MWAA instance using a BashOperator task. However, installing mssql-tools (the package that provides sqlcmd and bcp) requires sudo, and running sudo commands is not allowed in MWAA.

You can try one of the following methods if it suits your use case:

  1. You can use an RDS for Microsoft SQL Server instance with MWAA and establish a connection between them. For more information, see https://docs.aws.amazon.com/mwaa/latest/userguide/samples-sql-server.html . To create an Apache Airflow v2 connection, follow the "Create a new Apache Airflow connection" section in https://docs.aws.amazon.com/mwaa/latest/userguide/samples-ssh.html#samples-ssh-connection

  2. You can also create an Airflow task that connects to an EC2 instance and executes a bcp command or script on that instance. For this, create an EC2 instance in a public subnet and set it up with the SQL Server libraries required to run the bcp script. Then create an SSHOperator task in the DAG file to connect to the EC2 instance and execute the script.
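
As a rough illustration of option 2, the sketch below builds a shell-quoted bcp export command that could be passed as the `command` argument of an SSHOperator task. The function name, table, file path, host, database, and credentials are all hypothetical placeholders and not part of the original answer. It assumes bcp is already installed on the EC2 instance. In practice the password would more likely come from an Airflow connection or a secrets backend than from a literal.

```python
import shlex


def build_bcp_export_command(table, out_file, server, database, user, password):
    """Return a shell-safe bcp command that exports `table` to `out_file`.

    Hypothetical helper: every argument value is supplied by the caller.
    """
    args = [
        "bcp", table, "out", out_file,
        "-S", server,    # SQL Server host
        "-d", database,  # database name
        "-U", user,      # login ID
        "-P", password,  # password (assumption: prefer a secrets backend)
        "-c",            # character (text) mode
        "-t", ",",       # comma as the field terminator
    ]
    # shlex.join quotes each argument so spaces or shell metacharacters in
    # values cannot break the command when the remote shell parses it.
    return shlex.join(args)


command = build_bcp_export_command(
    table="dbo.Orders",
    out_file="/tmp/orders.csv",
    server="sqlhost.example.internal",
    database="sales",
    user="etl_user",
    password="s3cret pw",
)
print(command)
```

The resulting string is what the SSH task would run on the EC2 instance. Quoting matters here because the command is interpreted by a remote shell rather than executed directly.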

For non-pip dependencies in MWAA, I suggest using custom plugins. To create a custom plugin, download the packages required for your use case and zip them into a file named plugins.zip. Please refer to these documentation pages to learn more about creating custom plugins:

  1. https://docs.aws.amazon.com/mwaa/latest/userguide/samples-oracle.html
  2. https://docs.aws.amazon.com/mwaa/latest/userguide/samples-hive.html
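
The packaging step described above could be scripted along these lines. This is only a sketch: the helper name and the directory layout are hypothetical, and it assumes the downloaded files already sit in a local folder that is separate from the output path, so the archive does not end up containing itself.

```python
import zipfile
from pathlib import Path


def build_plugins_zip(src_dir, out_path):
    """Zip every file under `src_dir` into `out_path`, keeping relative paths.

    Hypothetical helper; returns the archive member names that were written.
    """
    src = Path(src_dir)
    names = []
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so files sit at the zip root.
                arcname = path.relative_to(src).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

For example, `build_plugins_zip("plugins", "plugins.zip")` would package a local `plugins` folder (an assumed name), and the resulting file could then be uploaded to the environment's S3 bucket as described in the linked pages.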

If you have any other concerns or issues, please feel free to reach back out to us. We will be happy to assist you.

AWS
Support Engineer
answered 2 years ago
  • Thank you. Another alternative is to use the Fargate Operator (container) to run the bcp command. It is a bit more complicated, but it works.

0

This is now possible using startup scripts.

https://aws.amazon.com/blogs/big-data/whats-new-with-amazon-mwaa-support-for-startup-scripts/
https://docs.aws.amazon.com/mwaa/latest/userguide/using-startup-script.html

You can do something like this to install the SQL Server ODBC drivers. (I think he had to do an rpm db update first?)


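#!/bin/bash
# [[ ... ]] is a bash feature, so the script needs bash rather than plain sh.

# Skip the web server; install only on the other components.
if [[ "${MWAA_AIRFLOW_COMPONENT}" != "webserver" ]]
then
	mwd=$(pwd)
	echo "$mwd"
	# Log kernel and OS release details for debugging.
	sudo uname -a
	sudo cat /etc/*release
	# Download the Microsoft ODBC driver RPM into the current directory.
	sudo curl -o msodbcsql18-18.0.1.1-1.x86_64.rpm https://packages.microsoft.com/rhel/7/prod/Packages/m/msodbcsql18-18.0.1.1-1.x86_64.rpm
	ls
	# Install the driver, accepting the EULA non-interactively.
	sudo ACCEPT_EULA=Y yum install msodbcsql18-18.0.1.1-1.x86_64.rpm -y
fi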
Gabe
answered 5 months ago
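
If a startup script is also used to install the command-line tools themselves, a task could first check that the binaries are actually reachable before calling them. The sketch below is one way to do that. The helper name is hypothetical, and the two directories are an assumption based on the default install locations of Microsoft's mssql-tools packages, which the script above does not install.

```python
import os
import shutil

# Assumption: default install locations of the mssql-tools18 / mssql-tools packages.
CANDIDATE_DIRS = ["/opt/mssql-tools18/bin", "/opt/mssql-tools/bin"]


def find_tool(name, extra_dirs=CANDIDATE_DIRS):
    """Return the full path of `name` if it is executable, else None.

    Searches the current PATH first, then the extra directories.
    """
    dirs = os.environ.get("PATH", "").split(os.pathsep) + list(extra_dirs)
    # Drop empty entries so the current directory is not searched by accident.
    search_path = os.pathsep.join(d for d in dirs if d)
    return shutil.which(name, path=search_path)


for tool in ("sqlcmd", "bcp"):
    print(tool, "->", find_tool(tool))
```

A task that gets `None` back can fail early with a clear message instead of a "command not found" error in the middle of a data load.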
