Lambda function inside VPC cannot use client.post_to_connection | It works outside the VPC

0

OVERVIEW OF THE SYSTEM: I want to use the following configuration; however, I am open to suggestions:

Client <-> API Gateway (WebSocket) <-> Lambda <-> RDS DB (PostgreSQL)

The idea is to create an API which I can call from my React application, which will access my database.

  • The application needs to read and write from the database (via the API).
  • The application needs to be scalable
  • The system needs to be secure
  • This is a learning project, but I am trying to make it as close to a real application as possible

THE ISSUE: If the Lambda function is outside the VPC, it cannot access the database and I get a timeout error. If the Lambda function is inside the VPC, it can access the database but it cannot communicate responses back to the client. It throws this error:

[ERROR] ForbiddenException: An error occurred (ForbiddenException) when calling the PostToConnection operation: Forbidden
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 48, in lambda_handler
    response = client.post_to_connection(ConnectionId=connectionId, Data='response')
  File "/var/runtime/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)

SECURITY DETAILS:

  • My security group is set to allow all outbound traffic, but it is restricted to allow inbound traffic only on certain ports and from some IP addresses.
  • I did set up an internet gateway.

import json
import os
from datetime import datetime

import boto3
import psycopg2
from psycopg2.extras import RealDictCursor

client = boto3.client('apigatewaymanagementapi',endpoint_url=os.environ['END_POINT'])

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

def lambda_handler(event,context):
    body = json.loads(event["body"])

    if 'categories' not in body:
        body['categories'] = ['gender','publicPolicy']

    connection = psycopg2.connect(
        host=os.environ['HOST'],
        database=os.environ['DATABASE'],
        user=body["credentials"]["username"],
        password=body["credentials"]["password"]
    )
    print('Connected to database')
    # One %s placeholder per category, so psycopg2 escapes the values itself
    placeholders = ", ".join(["%s"] * len(body["categories"]))
    connectionId = event["requestContext"]["connectionId"]

    results = []  # stays empty if the query fails, so json.dumps below still works
    try:
        with connection.cursor(cursor_factory=RealDictCursor) as cursor:
            cursor.execute("SELECT * FROM questions WHERE category IN (" + placeholders + ");", body["categories"])
            results = cursor.fetchall()
    except Exception as e:
        print("Error occurred:", str(e))
    finally:
        connection.close()

    responseMessage = json.dumps(results,cls=DateTimeEncoder).encode('utf-8')
    print(responseMessage)
    print(connectionId,os.environ['END_POINT'])
    response = client.post_to_connection(ConnectionId=connectionId, Data='response')

    return {"statusCode":200}
3 Answers
5
Accepted Answer

Based on your question, I'd split this problem in two parts:

  1. Lambda outside the VPC can't access the database: this is expected behavior. I'd suggest following this step-by-step guide: How do I configure a Lambda function to connect to an RDS instance?
  2. Lambda inside the VPC can access the database but cannot communicate responses back to the client, and it throws the error:

Change client = boto3.client('apigatewaymanagementapi',endpoint_url=os.environ['END_POINT']) to client = boto3.client('apigatewaymanagementapi', endpoint_url='https://{api-id}.execute-api.{region}.amazonaws.com/{stage}')

Follow Boto3 issue here for more details.
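
As an alternative to hard-coding the URL, here is a minimal sketch that derives it from the incoming WebSocket event. It assumes the event carries requestContext.domainName and requestContext.stage (as API Gateway WebSocket events normally do) and that the API is called through its default execute-api domain. The helper names build_management_endpoint and make_management_client are hypothetical, not part of boto3.

```python
def build_management_endpoint(event):
    """Return the @connections endpoint URL for the API that delivered this event."""
    ctx = event["requestContext"]
    return "https://{}/{}".format(ctx["domainName"], ctx["stage"])


def make_management_client(event):
    """Create an apigatewaymanagementapi client bound to the event's API and stage."""
    # boto3 is imported lazily so the URL helper above can be used without it.
    import boto3
    return boto3.client(
        "apigatewaymanagementapi",
        endpoint_url=build_management_endpoint(event),
    )
```

Note that this only rules out a malformed endpoint URL. It does not change whether the function can reach that URL from inside the VPC.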

>>>EDIT<<<

Are you looking to connect to an RDS instance inside the VPC from a Lambda function outside the VPC? Traffic won't flow that way, because there is no direct endpoint that would let a Lambda function outside the VPC connect to RDS inside it. I understand why you are looking for that, since the API Gateway part doesn't work when your Lambda function is within the VPC. To get RDS connectivity, the Lambda function should be within the VPC, preferably with both Lambda and RDS in private subnets.

As for API Gateway connectivity from the Lambda function within the VPC, can you check the following:

Is your function in a private subnet? Is a NAT Gateway configured in a public subnet in that VPC? Does the private subnet's route table have a 0.0.0.0/0 route pointing to the NAT Gateway? When you place your Lambda function in the VPC (I assume in a private subnet), make sure that subnet has internet connectivity. From the problem description, it seems that API Gateway can reach the Lambda function, since the function is invoked via the Lambda service's public endpoint anyway, but the function itself may not have internet connectivity due to subnet misconfiguration.
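
The route table question above can be checked programmatically. The following is a rough sketch, assuming AWS credentials that allow ec2:DescribeRouteTables. The helper names has_default_route_via_nat and subnet_route_tables are hypothetical, and the dictionaries follow the shape returned by the EC2 describe_route_tables call.

```python
def has_default_route_via_nat(route_table):
    """Return True if the route table sends 0.0.0.0/0 to an active NAT gateway route."""
    for route in route_table.get("Routes", []):
        if (route.get("DestinationCidrBlock") == "0.0.0.0/0"
                and route.get("NatGatewayId")
                and route.get("State", "active") == "active"):
            return True
    return False


def subnet_route_tables(subnet_id):
    """Fetch the route tables explicitly associated with a subnet."""
    # boto3 is imported lazily; this call needs network access and credentials.
    import boto3
    ec2 = boto3.client("ec2")
    resp = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    return resp["RouteTables"]
```

If subnet_route_tables returns an empty list, the subnet has no explicit association and falls back to the VPC's main route table, which then needs to be inspected instead.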

Hope you find this helpful.

AWS EXPERT, answered a year ago
EXPERT, reviewed 5 months ago
AWS EXPERT iBehr, reviewed a year ago
  • Yes, you can split this question into two different problems. I just wanted to give all the context.

    1. I did check that link, this is what it says: "A Lambda function that's outside of a VPC can't access an RDS instance that's inside a VPC."
    2. I don't think the environment variable is the problem. What you wrote there is exactly what is saved in the environment variable - it works when it is outside the VPC, so this is not the problem. You probably got that solution from StackOverflow - they had a different problem there, I think they were missing the stage or the endpoint altogether.

  • I believe this would work if you follow the suggestions in the edit section. I don't see any reason why it wouldn't work if it's set up this way (private subnet with a NAT Gateway).

  • Thanks for the pointers. I did have some issues in my configuration. Now the configuration is fixed and I get a different error when trying to communicate with the client: Task timed out after 10.01 seconds

    In summary, these are the changes I implemented:

    1. Function is connected to a single private subnet.
    2. A route table routes all traffic from the private subnet to the NAT Gateway.
    3. The NAT Gateway is in a public subnet.
    4. The security group of the function allows all outbound traffic

    UPDATE: I had 4 extra subnets without endpoints. I just added endpoints for all of them (half public via an IGW and half private via the same NATGW) and now it works! I am not sure if this was the issue. It seems off because the function was not connected to these subnets in any way. These were the default subnets in my default AWS VPC.

  • Glad that it worked out. That's exactly what I was asking about in our conversation here. Yes, a VPC endpoint would have helped for API Gateway communication.
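
A timeout like the one reported in this thread is easier to diagnose when the function tests outbound connectivity directly. Here is a minimal sketch using only the standard library; can_reach and endpoint_host are hypothetical helper names, and the 3-second timeout is an arbitrary choice kept below the function's 10-second limit.

```python
import socket
from urllib.parse import urlparse


def endpoint_host(url):
    """Extract the host name from an endpoint URL such as the END_POINT variable."""
    return urlparse(url).hostname


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling can_reach(endpoint_host(os.environ['END_POINT'])) at the start of the handler and printing the result shows whether the subnet has a working path to API Gateway before post_to_connection is attempted.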

0

When your Lambda function is not attached to a VPC (the default), it has access to all public APIs (e.g., post_to_connection), but has no access to private resources in your VPC (e.g. RDS). When you attach the function to the VPC, it is the other way around. It has access to private resources (assuming the security groups and routing tables are configured correctly), but has no access to public APIs.

If you need both, you need to attach the function to the VPC and give it a way to talk to public APIs. To do that, there are two options:

  1. Configure a NAT Gateway in a public subnet, and route the internet traffic via the gateway.
  2. For AWS services that support them, use VPC endpoints, which let you talk to those specific services. In your case, the API Gateway VPC endpoint should probably work.

AWS EXPERT Uri, answered a year ago
  • Thanks for the pointers. The problem before was that the function was not on a private subnet. Now the configuration is fixed and I get a different error when trying to communicate with the client: Task timed out after 10.01 seconds - it seems that it is still a connectivity issue.

    UPDATE: I had 4 extra subnets without endpoints. I just added endpoints for all of them (half public via an IGW and half private via the same NATGW) and now it works! I am not sure if this was the issue. It seems off because the function was not connected to these subnets in any way. These were the default subnets in my default AWS VPC.

0

I'm sure it would work if your Lambda function is in a private subnet with a route to a NAT gateway. Lambda can't reach the internet via an IGW. It requires a NAT gateway.

If so, review whether that's secure enough for you, or whether you need to lock it down a little more by using endpoints instead.

EXPERT, answered a year ago
