Airflow webserver not installing python requirements

2

The Airflow 2.2.2 webserver on MWAA is not installing the packages in requirements.txt. The packages install fine on the workers and scheduler, but not on the webserver, where pip fails with a connection timeout error.

Here's the output from the CloudWatch requirements_install_* log stream for the webserver:

WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7ff05a174050>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/apache-airflow-providers-mysql/
ERROR: Could not find a version that satisfies the requirement apache-airflow-providers-mysql==2.1.1 (from versions: none)

I didn't run into this issue when I set up MWAA in my dev VPC, so I'm not sure what's going on here. I've run the MWAA verify environment support tool, but it didn't flag any configuration issues, and now I'm at my wit's end.

Any insight would be greatly appreciated!

5 Answers
1

Outbound internet access was removed from the private web server option.

If you wish to continue to install requirements from public repositories such as http://pypi.org on the webserver that is private, you may do so by downloading and packaging Python WHL files in plugins.zip, or changing your webserver to public.
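The plugins.zip route mentioned above could be sketched roughly as follows. This is a minimal sketch, not an official procedure: it assumes the wheels were already downloaded into a local directory (for example with `pip download -r requirements.txt -d wheels/`), and it assumes MWAA unpacks plugins.zip under `/usr/local/airflow/plugins`, so that requirements.txt can point pip at that directory with `--find-links` and `--no-index`. The function names `build_plugins_zip` and `offline_requirements` are hypothetical helpers made up for illustration.

```python
import zipfile
from pathlib import Path


def build_plugins_zip(wheel_dir, zip_path):
    """Write every .whl file in wheel_dir to the root of a new zip archive.

    Note: mode "w" overwrites any existing archive at zip_path, so merge
    real plugin files separately if you already ship a plugins.zip.
    Returns the wheel file names that were added, sorted by name.
    """
    wheels = sorted(Path(wheel_dir).glob("*.whl"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for whl in wheels:
            # arcname keeps the wheel at the archive root, without the local path
            zf.write(whl, arcname=whl.name)
    return [whl.name for whl in wheels]


def offline_requirements(requirements, plugins_dir="/usr/local/airflow/plugins"):
    """Return requirements.txt content that resolves packages from local wheels.

    plugins_dir is an assumption about where MWAA extracts plugins.zip.
    --no-index tells pip not to contact pypi.org at all.
    """
    header = [f"--find-links {plugins_dir}", "--no-index"]
    body = [line.strip() for line in requirements if line.strip()]
    return "\n".join(header + body) + "\n"
```

With this approach the webserver never needs to reach pypi.org, which is why it can work while the webserver stays private. The wheels have to match the Python version and platform that MWAA runs, so downloading them on a comparable machine is the safer choice.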

Answered 2 years ago
  • Is there any way to enable this? It seems like a regression to me because in many disconnected environments, folks host their own PyPI repositories. We already have a cordoned off MWAA instance behind an AWS network firewall and are explicitly allowing outbound connection to our private PyPI repository.

  • Was this mentioned in a changelog somewhere? If so, was it a security bug that our previous deployment, which had the Private network option enabled, was able to install remote dependencies over the Internet?

0

It turns out the issue may have been due to the webserver being in private mode. After switching it to public mode, the package installation succeeded. I think that when private network mode is selected for the Airflow webserver, the AWS-managed service VPC in which it runs does not allow outbound Internet access, hence the connection timeouts.

Answered 2 years ago
  • Hi, we are trying to set up MWAA in private mode, and we are seeing the same issue with Python dependency installation. pypi.org is accessible from other EC2 instances on the same subnet, so there should not be any network or firewall issues. Does anyone have any idea what might be wrong, or what other configuration may be needed to make this work? Happy to provide additional details if required.
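For reference, a reachability check like the one described in this comment could be sketched as below. It is only a minimal sketch: `can_connect` is a hypothetical helper, and the 15-second default simply mirrors the connect timeout shown in the pip log. Note that, per the answers above, the private webserver runs in an AWS-managed service VPC, so a successful check from an EC2 instance in your own subnet does not necessarily say anything about the webserver's network path.

```python
import socket


def can_connect(host, port=443, timeout=15.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers timeouts, refused connections and DNS resolution failures
        return False


# Example usage (makes a real network call, so it is left commented out):
# print(can_connect("pypi.org"))
```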

0

I have the same problem: MWAA wouldn't fully update after I updated the requirements.txt file. My webserver access is set to public, but it still doesn't work!

Keri
Answered 10 months ago
0

Perhaps we don't need the packages installed on the Airflow web server, only on the workers and scheduler, so the pip install timeouts on the web server could probably be ignored unless we need some Airflow UI plugins installed. Has anyone verified this? Does the web server also need all the packages in the requirements.txt file installed? Surely the scheduler and workers do, but the web server?

Dan
Answered 6 months ago
0

I faced the same issue with version 2.2.2. After investigating further, I found that the webserver uses python -m pip install -r requirements.txt, while the worker and scheduler use pip install -r requirements.txt.

Basudev
Answered 6 months ago
