MWAA: can I use external (non-Python) programs such as sqlcmd or bcp?

0

Can I install or deploy a separate tool such as sqlcmd or bcp into MWAA? These are Linux command-line tools that help move data in and out of Microsoft SQL Server. We use them on-premises on our Airflow Linux boxes, but we don't know whether this is supported in MWAA.

Basically, I need to install a non-pip dependency on MWAA.

https://docs.microsoft.com/en-us/sql/tools/sqlcmd-utility?view=sql-server-ver16
https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-migrate-bcp?view=sql-server-ver16

asked 2 years ago, 506 views
2 Answers
1
Accepted Answer

Hi,

Please note that you can run bash commands in an MWAA environment using a BashOperator task. However, installing mssql-tools (the package that provides sqlcmd and bcp) requires sudo, and running sudo commands is not allowed in MWAA.

You can try the following methods if they suit your use case:

  1. You can use Amazon RDS for Microsoft SQL Server with MWAA and establish a connection between them. For more information, see https://docs.aws.amazon.com/mwaa/latest/userguide/samples-sql-server.html . To create an Apache Airflow v2 connection, follow the "Create a new Apache Airflow connection" section in https://docs.aws.amazon.com/mwaa/latest/userguide/samples-ssh.html#samples-ssh-connection

  2. You can also create an Airflow task that connects to an EC2 instance and executes a bcp command or script on that instance. For this, create an EC2 instance in a public subnet and set it up with the SQL Server libraries required to run the bcp script. Then create an SSHOperator task in the DAG file to connect to the EC2 instance and execute the script.
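
Option 2 could look roughly like the sketch below. It is an illustration under stated assumptions, not a tested MWAA deployment: the connection ID `ec2_sqlserver_ssh`, the server, database, table, and file names, and the `BCP_PASSWORD` environment variable on the EC2 instance are all hypothetical, and it assumes Airflow 2.4 or later (older 2.x releases use `schedule_interval` instead of `schedule`) with the `apache-airflow-providers-ssh` package installed.

```python
import shlex
from datetime import datetime


def build_bcp_export(table, out_file, server, database, user,
                     password_env="BCP_PASSWORD"):
    """Build a shell-safe `bcp ... out` command string.

    The password is read from an environment variable on the remote host
    (hypothetical name) instead of being embedded in the DAG file.
    """
    args = ["bcp", table, "out", out_file,
            "-S", server, "-d", database, "-U", user,
            "-c",        # character (text) data format
            "-t", ","]   # comma as the field terminator
    quoted = " ".join(shlex.quote(a) for a in args)
    # Left unquoted by shlex on purpose so the remote shell expands it.
    return f'{quoted} -P "${password_env}"'


def build_dag():
    # Imported lazily so the helper above works without Airflow installed.
    from airflow import DAG
    from airflow.providers.ssh.operators.ssh import SSHOperator

    with DAG(dag_id="bcp_export_over_ssh",      # hypothetical DAG name
             start_date=datetime(2023, 1, 1),
             schedule=None,                     # trigger manually
             catchup=False) as dag:
        SSHOperator(
            task_id="run_bcp_on_ec2",
            ssh_conn_id="ec2_sqlserver_ssh",    # hypothetical Airflow connection
            command=build_bcp_export(
                table="dbo.Orders",             # hypothetical values below
                out_file="/tmp/orders.csv",
                server="sql.example.internal",
                database="Sales",
                user="etl_user",
            ),
        )
    return dag
```

In a real DAG file you would assign `dag = build_dag()` at module level so the scheduler can discover it. The sketch also assumes `BCP_PASSWORD` is visible to non-interactive SSH sessions on the instance.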

For non-pip dependencies in MWAA, I suggest using custom plugins. To create a custom plugin, download the packages your use case requires and zip them as plugins.zip. The following documentation explains how to create custom plugins:

  1. https://docs.aws.amazon.com/mwaa/latest/userguide/samples-oracle.html
  2. https://docs.aws.amazon.com/mwaa/latest/userguide/samples-hive.html
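
As a rough illustration of the packaging step, the sketch below zips a local directory into plugins.zip using only the Python standard library. The directory layout and the `build_plugins_zip` helper name are hypothetical, and it assumes the files should sit at the root of the archive; check the linked samples for the layout your packages need.

```python
import zipfile
from pathlib import Path


def build_plugins_zip(source_dir, zip_path="plugins.zip"):
    """Zip the contents of source_dir so its files sit at the archive root."""
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir, not the full local path.
                archive.write(path, path.relative_to(source).as_posix())
    return zip_path
```

Write the archive outside `source_dir` so it does not try to include itself. The resulting plugins.zip is then uploaded to the environment's S3 bucket and selected in the environment configuration.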

If you have any other concerns or issues, please feel free to reach out to us. We will be happy to assist you.

AWS
SUPPORT ENGINEER
answered 2 years ago
  • Thank you. Another alternative is to run the bcp command in a container using the Fargate operator. It is a bit more complicated, but it works.

0

This is now possible using startup scripts.

https://aws.amazon.com/blogs/big-data/whats-new-with-amazon-mwaa-support-for-startup-scripts/
https://docs.aws.amazon.com/mwaa/latest/userguide/using-startup-script.html

You can do something like the following to install the SQL Server ODBC drivers. I think an rpm database update may have been needed first.


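#!/bin/bash
# [[ ... ]] is a bash construct, so run under bash rather than sh.

# Skip the web server; run the install on the other components.
if [[ "${MWAA_AIRFLOW_COMPONENT}" != "webserver" ]]
then
	mwd=$(pwd)
	echo "$mwd"
	# Log the kernel and OS release for troubleshooting.
	sudo uname -a
	sudo cat /etc/*release
	# Download the Microsoft ODBC Driver 18 for SQL Server RPM.
	sudo curl -o msodbcsql18-18.0.1.1-1.x86_64.rpm https://packages.microsoft.com/rhel/7/prod/Packages/m/msodbcsql18-18.0.1.1-1.x86_64.rpm
	ls
	# Accept the EULA non-interactively and install the driver.
	sudo ACCEPT_EULA=Y yum install msodbcsql18-18.0.1.1-1.x86_64.rpm -y
fi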
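
Once a startup script has run, a task can check whether the command-line tools are actually visible to Airflow. The sketch below is one possible check using only the standard library. The `missing_tools` helper is hypothetical, and it assumes sqlcmd and bcp were installed by a similar startup-script step and that their install directory is on PATH (the script above installs the ODBC driver only).

```python
import shutil


def missing_tools(tools=("sqlcmd", "bcp")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]
```

Calling `missing_tools()` from a PythonOperator task and failing the task when the list is non-empty gives an early signal before any data-movement DAG depends on the tools.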
Gabe
answered 4 months ago
