Python Lambda unable to retrieve file from FTP

0

I have written a Python Lambda function that connects to the FTP server on an industrial HMI. The FTP server software is very old and does not support PASV (passive mode). I need to retrieve the data log files from it so that I can build a Glue catalog from them and start creating reports. I think Lambda is unable to receive the file because, in active mode, the server opens the data connection back to the IP address and port given in the PORT command, and Lambda cannot accept inbound TCP connections. I am looking for help solving this, or for another service that can accept an inbound TCP connection.

I have considered AWS Glue, but it might have the same inbound TCP connection issue. I have also considered ECS on Fargate, but I am unfamiliar with setting up containers and tasks. I looked for step-by-step guides and wasn't able to find any.

I am open to any suggestions of any other potential service options.

Mav
asked 9 months ago · 288 views
3 Answers
0

Unfortunately, FTP is the only real option with these particular industrial panels. The alternative is an event that sends an email with the file attached, and dealing with email attachments sounds worse to me than the other options. That said, I am also working on another project that involves a Raspberry Pi, which will be deployed at the same sites where these FTP servers are physically located. The plan is to move this script to the Pi, have it fetch the file locally, and then send it to S3. Otherwise, I was considering a Fargate task on ECS or the smallest EC2 instance.

Mav
answered 9 months ago
0

FTP is what I call a "complex" protocol: it uses more than one TCP connection, and those connections can be opened in different "directions". It maintains a control connection on TCP port 21 and opens a new TCP data connection for each file. In active mode (no PASV), the client names an address and port in a PORT command and the server connects back to it from TCP port 20. What you're seeing is that this inbound data connection cannot reach the Lambda function, which does not accept incoming connections. So you're correct: this is why it doesn't work in Lambda.
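To make the direction of the data connection concrete, here is a minimal sketch of how an active-mode client describes itself to the server. Per RFC 959, the argument of the PORT command is the client's IPv4 address followed by the port split into two bytes. The helper name `port_argument` and the address and port values are hypothetical and only serve as an illustration.

```python
def port_argument(client_ip: str, client_port: int) -> str:
    """Build the argument of an FTP PORT command (RFC 959).

    The server will try to open the data connection TO this address and
    port, so the client must be reachable there from the server's side.
    """
    octets = client_ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {client_ip!r}")
    if not 0 < client_port < 65536:
        raise ValueError(f"port out of range: {client_port}")
    # The 16-bit port is sent as two decimal bytes: high byte, then low byte.
    high, low = divmod(client_port, 256)
    return ",".join(octets + [str(high), str(low)])


# Hypothetical values (documentation-range address):
print("PORT " + port_argument("192.0.2.10", 50000))
# prints: PORT 192,0,2,10,195,80
```

If the address the client advertises is a private one behind NAT, as it is for a Lambda function, the server has nothing it can connect back to, and the transfer stalls even though the control connection on port 21 works.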

If FTP is the only option, the solution is to use a service where your application can be directly assigned a public/elastic IP address. EC2 is perhaps the simplest way to do this.
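For a host that can accept the inbound data connection, an active-mode download with Python's standard `ftplib` might look like the sketch below. The function names, host, credentials, and file name are placeholders, not values from this thread.

```python
import io
from ftplib import FTP


def fetch_file_active(ftp, remote_name: str) -> bytes:
    """Download one file over an already logged-in FTP connection."""
    # Disable passive mode so ftplib sends PORT instead of PASV.
    ftp.set_pasv(False)
    buffer = io.BytesIO()
    # retrbinary calls buffer.write for every block the server sends.
    ftp.retrbinary(f"RETR {remote_name}", buffer.write)
    return buffer.getvalue()


def download_log(host: str, user: str, password: str, remote_name: str) -> bytes:
    """Connect, log in, and fetch a single file (all arguments are placeholders)."""
    with FTP(host, timeout=30) as ftp:
        ftp.login(user, password)
        return fetch_file_active(ftp, remote_name)
```

Two things are worth verifying before relying on this. The instance's security group has to allow the inbound connection from the FTP server. Also, `ftplib` advertises the address of its local socket in the PORT command, which on an EC2 instance is normally the private address even when an Elastic IP is attached, so the server may still be unable to connect back without further network configuration.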

Much better would be to find an alternative - even transferring data across SSH or telnet (if you must). I wouldn't normally recommend telnet because it is unencrypted but you're already using FTP which transfers usernames, passwords and data in the clear anyway.

Edit given other comment: Using email is not terrible - you can use Simple Email Service to receive email; it gets stored in S3 then you can process it with a Lambda function. As a "push" operation it is normally better (from a security perspective) because the remote site does not have to open inbound ports to allow a connection from the "outside" in order to transfer files.
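As a rough illustration of the processing step, the sketch below pulls attachments out of a raw email using only the Python standard library. It assumes the Lambda function has already read the raw message bytes from the S3 object that SES stored (for example with boto3's `get_object`); that part is left out, and the function name and sample addresses are hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attachments(raw_email: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, content) pairs for every attachment in a raw email."""
    message = message_from_bytes(raw_email, policy=policy.default)
    attachments = []
    for part in message.iter_attachments():
        filename = part.get_filename() or "unnamed"
        # decode=True undoes the transfer encoding (base64, quoted-printable).
        attachments.append((filename, part.get_payload(decode=True)))
    return attachments


# Build a sample message the way a panel might send it (made-up values).
sample = EmailMessage()
sample["From"] = "hmi@example.com"
sample["To"] = "logs@example.com"
sample["Subject"] = "Data log"
sample.set_content("Log attached.")
sample.add_attachment(b"time,value\n1,2\n", maintype="application",
                      subtype="octet-stream", filename="datalog.csv")

for name, content in extract_attachments(sample.as_bytes()):
    print(name, len(content), "bytes")
```

Each extracted attachment could then be written back to S3 under its own key for Glue to crawl.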

The same goes for placing a small host on site (such as a Raspberry Pi - but it can be anything) to push the files to S3.
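A push from such a local host could be as small as the sketch below. It assumes the log files have already been fetched from the panel into a local directory, that they end in `.csv`, and that an S3 client is passed in (for example `boto3.client("s3")`). The function name, bucket name, and key prefix are hypothetical.

```python
from pathlib import Path


def push_logs(s3_client, log_dir: str, bucket: str, prefix: str = "hmi-logs/") -> list[str]:
    """Upload every .csv file in log_dir to S3 and return the object keys used."""
    uploaded = []
    for path in sorted(Path(log_dir).glob("*.csv")):
        key = f"{prefix}{path.name}"
        # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
        s3_client.upload_file(str(path), bucket, key)
        uploaded.append(key)
    return uploaded
```

On the device this would be called as `push_logs(boto3.client("s3"), "/path/to/logs", "my-log-bucket")`, with credentials that are limited to writing into that one bucket.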

AWS EXPERT
answered 9 months ago
0

Please note that by default, Lambda runs your functions in a secure internal VPC with access to AWS services and the internet. Lambda owns this VPC, which isn't connected to your account's default VPC. When you connect a function to a VPC in your account, the function can't access the internet unless your VPC provides access.

So if you connect the Lambda function to a VPC in your AWS account and the function needs internet access, use network address translation (NAT). Connecting a function to a public subnet doesn't give it internet access or a public IP address. Please follow the document [1] for the detailed steps to give your Lambda function internet access through your account's VPC.

In general, there is no restriction on Lambda accessing an FTP server. I also found an example that might be helpful to you: https://github.com/orasik/aws_lambda_ftp_function

You can also enable CloudWatch Logs for your function and use those logs to troubleshoot the issue further. Please see the documentation [2] for more details.

Please feel free to reply if you have further queries.

References:

[1] How do I give internet access to a Lambda function that's connected to an Amazon VPC? https://repost.aws/knowledge-center/internet-access-lambda-function

[2] Monitoring and troubleshooting Lambda functions https://docs.aws.amazon.com/lambda/latest/dg/lambda-monitoring.html

answered 9 months ago
