Automated S3 Content Library Indexing with AWS Lambda

6 minute read
Content level: Advanced

Demonstrates a method for replacing the manual VMware Content Library indexing process with automation

Introduction

Content libraries are container objects for VM and vApp templates and other types of files, such as ISO images. A standard content library is hosted by a vCenter Server. For normal production use, this works fine because your vCenter Server is always kept running. However, many customers and partners run lab or demo environments that are frequently destroyed and recreated. In these situations, admins often turn to S3 to host a content library that persists across environment teardowns. Although this is not officially supported by VMware, it is a widely used technique in the community.

This article first demonstrates the manual method for indexing S3 content libraries, then goes on to demonstrate the automated method.

Prerequisites

Whether you use the manual or the automated method, you need a bucket and a folder structure.

Create Bucket

For this example, I created kremerpt-content-library in us-east-2.

Configure the bucket policy so your vCenter can make HTTPS calls to the files in the bucket. This example opens the S3 bucket to a single public IP address:

{
    "Version": "2012-10-17",
    "Id": "S3PolicyIPRestrict",
    "Statement": [
        {
            "Sid": "IPAllow",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::kremerpt-content-library/*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "18.154.107.224/32"
                }
            }
        }
    ]
}
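If you rebuild lab environments often, the public IP address in this policy tends to change. The following is a minimal sketch, not part of the original walkthrough, that rebuilds the same policy document for a given bucket and CIDR. The function names are hypothetical, and the apply step assumes boto3 is installed and credentials are configured.

```python
import json


def build_ip_restricted_policy(bucket: str, source_cidr: str) -> dict:
    """Return a bucket policy shaped like the one above for a bucket and CIDR."""
    return {
        "Version": "2012-10-17",
        "Id": "S3PolicyIPRestrict",
        "Statement": [
            {
                "Sid": "IPAllow",
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:*",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": source_cidr}},
            }
        ],
    }


def apply_bucket_policy(bucket: str, source_cidr: str) -> None:
    """Attach the policy to the bucket (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    policy = build_ip_restricted_policy(bucket, source_cidr)
    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

Calling apply_bucket_policy("kremerpt-content-library", "18.154.107.224/32") would reproduce the policy shown above.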

Create S3 folder structure

You can create separate libraries under a single bucket. Put each object in its own folder; do not put any objects in the root folder of the library. This example shows a library folder named lib1 that contains a single iso folder, with a single ISO uploaded to that folder.

Library 1 Structure
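The layout rule is easy to break when uploading by hand. Here is a small sketch of a check for it, assuming you already have a list of object keys. The function name is hypothetical, and it assumes the generated index files lib.json and items.json are the only files allowed in the library root.

```python
def find_layout_problems(keys, library_prefix):
    """Return keys that sit directly in the library root instead of an item folder."""
    prefix = library_prefix.strip("/") + "/"
    # Assumption: the generated index files are the only files allowed in the root.
    index_files = {"lib.json", "items.json"}
    problems = []
    for key in keys:
        if not key.startswith(prefix) or key.endswith("/"):
            continue  # another library, or a console "folder" placeholder object
        relative = key[len(prefix):]
        if "/" not in relative and relative not in index_files:
            problems.append(key)
    return problems


example_keys = ["lib1/iso/ubuntu.iso", "lib1/stray.iso", "lib1/lib.json", "lib1/"]
print(find_layout_problems(example_keys, "lib1"))
```

In this example only lib1/stray.iso is reported, because it was uploaded to the library root rather than to its own folder.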

Manual Indexing Method

Create an IAM user

The user must have write access to the S3 bucket. Generate an access key ID and secret for this user.

Install the AWS CLI

Using aws configure, set the default region to the region hosting the bucket, and input the access key ID and secret.

Configure Python

Python installation is required. The indexing script also requires the boto3 package.

On Mac/Linux

pip install boto3

On Windows

python -m pip install boto3

Download the make_vcsp_2018.py script from William Lam's Python repo.

Run Script

The --name argument is the name of the content library, and --path is the S3 bucket name followed by the library folder.

python .\make_vcsp_2018.py --name kremerpt-content-library --type s3 --path kremerpt-content-library/lib1

Check Output

The script should have generated the index files lib.json and items.json.

Generated JSON files
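Instead of checking the console, you could confirm the index files from Python. This is a sketch under the assumption that both files are written to the library folder itself. The function names are hypothetical, and the listing helper assumes boto3 and configured credentials.

```python
def missing_index_files(keys, library_prefix):
    """Return the expected index file keys that are absent from the key list."""
    prefix = library_prefix.strip("/") + "/"
    expected = [prefix + "lib.json", prefix + "items.json"]
    present = set(keys)
    return [key for key in expected if key not in present]


def list_library_keys(bucket, library_prefix):
    """List every object key under the library folder (assumes boto3 and credentials)."""
    import boto3  # imported lazily so the check above works without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=library_prefix.strip("/") + "/"):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

An empty list from missing_index_files(list_library_keys("kremerpt-content-library", "lib1"), "lib1") would indicate that both index files are present.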

Subscribe

Create a new subscribed library using the full HTTPS path to lib.json: https://kremerpt-content-library.s3.us-east-2.amazonaws.com/lib1/lib.json

Subscribed CL
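The subscription URL follows the virtual-hosted-style S3 endpoint pattern, so it can be derived from the bucket name, region, and library folder. A minimal sketch, with a hypothetical function name:

```python
def subscription_url(bucket: str, region: str, library_prefix: str) -> str:
    """Build the HTTPS URL of lib.json for a library folder in an S3 bucket."""
    folder = library_prefix.strip("/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{folder}/lib.json"


print(subscription_url("kremerpt-content-library", "us-east-2", "lib1"))
```

With the values used in this article, this produces the same URL shown above.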

The ISO file should show up in the content library.

ISO file found

Automated Indexing Method

Manually running the script grows tiresome if you upload frequently. You can use AWS Lambda to react to an event anytime the S3 bucket is changed and automatically invoke the script. I contributed the sample_lambda_function_for_make_vcsp_2018.py script to William Lam's Python repo. You can use this sample script along with make_vcsp_2018.py for your Lambda function.

Create IAM policy

First, create a policy document in IAM > Access management > Policies.

Lambda needs to be able to perform GET, PUT, LIST, and DELETE operations on the bucket. I named this policy s3-kremerpt-content-library-write.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "s3write",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:GetObjectAttributes",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::kremerpt-content-library",
                "arn:aws:s3:::kremerpt-content-library/*"
            ]
        }
    ]
}
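The policy above lists both the bucket ARN and the object ARN pattern because s3:ListBucket applies to the bucket itself, while the other actions apply to objects. As an optional variation, not the policy used in this article, the sketch below generates a document that splits the two scopes into separate statements. The function name and Sid values are hypothetical.

```python
def build_split_write_policy(bucket: str) -> dict:
    """Return an IAM policy with bucket-level and object-level actions separated."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Sid": "ListLibraryBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level actions target the keys inside the bucket.
                "Sid": "ReadWriteLibraryObjects",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:GetObjectAttributes",
                    "s3:DeleteObject",
                ],
                "Resource": [bucket_arn + "/*"],
            },
        ],
    }
```

The set of allowed actions is the same as in the policy above; only the pairing of actions and resources is made explicit.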

Create Role

Create a role in IAM > Access management > Roles. I named this one Content-Library-Lambda. Attach the policy you created above, as well as the two existing AWS managed policies AWSLambdaBasicExecutionRole and AmazonS3ObjectLambdaExecutionRolePolicy.

Lambda Role
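When you create the role in the console for the Lambda use case, the trust relationship is filled in for you. If you script the role instead, you have to supply it yourself. The following sketch shows the trust policy that lets the Lambda service assume a role, plus a hypothetical helper that creates the role and attaches policies by ARN. It assumes boto3 and IAM permissions, and the policy ARNs are left for you to supply.

```python
import json

# Trust policy that allows the Lambda service to assume the role.
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_lambda_role(role_name, policy_arns):
    """Create the role and attach each policy ARN (assumes boto3 and IAM access)."""
    import boto3  # imported lazily so the policy constant is usable without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(LAMBDA_TRUST_POLICY),
    )
    for arn in policy_arns:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```

The policy_arns argument would hold the ARN of the custom policy created earlier and the ARNs of the two AWS managed policies named above.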

Create Function

In Lambda > Functions, click Create function.

Lambda Function

Name the function; I named it Content-Library. Pick Python for the Runtime. Select Use an existing role, then select the role you created above (Content-Library-Lambda in this example).

Lambda Function

Add function code

Delete the existing Hello sample code.

Existing Python

Create a new file

New File

This file can have any name ending in .py. However, if you do not use the default lambda_function.py, you must configure the runtime settings to point to your file. I used the default lambda_function.py. Paste the code from sample_lambda_function_for_make_vcsp_2018.py into this file and save it.

Lambda Handler

Create another new file named make_vcsp_2018.py. This file must be named exactly as shown. Paste the code from make_vcsp_2018.py into this new file.

Make VCSP

Configure Function

Click the Configuration tab.

Configuration tab

General Configuration

Note: you may need to increase some of these settings, such as the timeout and memory, if you have a large number of files in your content library.

General settings
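If you prefer to adjust these settings from code, the sketch below builds the arguments for a configuration update and checks them against Lambda's limits (timeout up to 900 seconds, memory from 128 MB to 10240 MB). The function names and the example values of 300 seconds and 512 MB are illustrative assumptions, not the settings used in this article.

```python
def build_config_update(function_name, timeout_seconds, memory_mb):
    """Validate timeout and memory against Lambda limits and return update arguments."""
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("Lambda timeout must be between 1 and 900 seconds")
    if not 128 <= memory_mb <= 10240:
        raise ValueError("Lambda memory must be between 128 and 10240 MB")
    return {
        "FunctionName": function_name,
        "Timeout": timeout_seconds,
        "MemorySize": memory_mb,
    }


def apply_config_update(function_name, timeout_seconds, memory_mb):
    """Apply the settings to the function (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so validation works without boto3

    kwargs = build_config_update(function_name, timeout_seconds, memory_mb)
    boto3.client("lambda").update_function_configuration(**kwargs)


print(build_config_update("Content-Library", 300, 512))
```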

Triggers

You need to add two triggers: one for when files are created in the S3 bucket, and one for when files are deleted.

Add trigger

Select S3 from the first dropdown, then select the bucket name in the Bucket dropdown. Select an Event type of All object create events, check the I acknowledge box at the bottom, then click Add.

Create events trigger

Add a second trigger. Keep everything the same as the first trigger, except select All object delete events for the Event type.

Delete events trigger
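The two console triggers correspond to an S3 bucket notification configuration with one entry for create events and one for delete events. Below is a sketch of that configuration as it could be built in Python. The function names and Id values are hypothetical, the function ARN is a placeholder, and the apply step assumes boto3 and credentials. Note that put_bucket_notification_configuration replaces the bucket's existing notification configuration, and that the console also grants S3 permission to invoke the function, which a scripted setup would need to do separately.

```python
def build_notification_config(function_arn: str) -> dict:
    """Return a notification configuration with create and delete triggers."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "content-library-create",  # hypothetical identifier
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],  # All object create events
            },
            {
                "Id": "content-library-delete",  # hypothetical identifier
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectRemoved:*"],  # All object delete events
            },
        ]
    }


def apply_notification_config(bucket: str, function_arn: str) -> None:
    """Replace the bucket's notification configuration (assumes boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(function_arn),
    )
```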

Note: as the warning shows, it is best practice to use separate source and destination buckets when working with S3 triggers for Lambda. This is not possible with the content library, because the make_vcsp_2018 function is designed to write the required content library index files into the same bucket. Without a safeguard, each index file written by the function would trigger the function again. Logic to avoid these recursive calls is built into the Lambda handler. A more scalable way to avoid the recursion problem could be Amazon EventBridge, which is better suited to filtering than embedding the filtering code directly in Lambda.
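To illustrate the idea of recursion avoidance, here is a simplified sketch. It is not the code from sample_lambda_function_for_make_vcsp_2018.py. It assumes the index files are named lib.json, items.json, and item.json, and it ignores S3 events whose object key is one of those files. The helper name is hypothetical.

```python
from urllib.parse import unquote_plus

# Assumption: these are the file names the indexing script writes.
INDEX_FILE_NAMES = {"lib.json", "items.json", "item.json"}


def changed_content_keys(event):
    """Return object keys from an S3 event, skipping generated index files."""
    keys = []
    for record in event.get("Records", []):
        # Keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.rsplit("/", 1)[-1] not in INDEX_FILE_NAMES:
            keys.append(key)
    return keys


def lambda_handler(event, context):
    keys = changed_content_keys(event)
    if not keys:
        # Only index files changed, so re-indexing would loop forever.
        return {"reindexed": False}
    # A real handler would call make_vcsp_2018.make_vcsp_s3(...) here.
    return {"reindexed": True, "keys": keys}
```

With this filter, the write of lib.json at the end of an indexing run still invokes the function once more, but that invocation returns immediately without writing anything.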

Role

The execution role should be the custom role you set up earlier.

Lambda role

Customize and deploy function

Go back to the Code tab. Customize the make_vcsp_s3 function call in the content library handler for your environment.

Lambda code

The template shows you which arguments to change, and its comments explain each of the arguments.

 make_vcsp_2018.make_vcsp_s3('REPLACE-ME','REPLACE-ME',False,'REPLACE-ME')

Customize the arguments for your environment. Mine looks like this:

 make_vcsp_2018.make_vcsp_s3('kremerpt-content-library','kremerpt-content-library/lib1',False,'us-east-2')
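If you would rather not hard-code these values, one option is to read them from Lambda environment variables. This is a sketch of that approach, not something the sample handler does. The variable names CL_NAME, CL_PATH, and CL_REGION are hypothetical, and the third argument is kept at False as in the call above (see the template comments for its meaning). AWS_REGION is set by the Lambda runtime and is used here as a fallback.

```python
import os


def vcsp_args_from_env(env=None):
    """Build the positional arguments for make_vcsp_s3 from environment variables."""
    env = os.environ if env is None else env
    name = env["CL_NAME"]  # hypothetical variable: content library name
    path = env["CL_PATH"]  # hypothetical variable: bucket name and library folder
    # Fall back to the region the function itself runs in.
    region = env.get("CL_REGION") or env["AWS_REGION"]
    return (name, path, False, region)


example_env = {
    "CL_NAME": "kremerpt-content-library",
    "CL_PATH": "kremerpt-content-library/lib1",
    "AWS_REGION": "us-east-2",
}
print(vcsp_args_from_env(example_env))
```

In the handler, the call would then become make_vcsp_2018.make_vcsp_s3(*vcsp_args_from_env()), with the variables set under Configuration > Environment variables.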

Save the file, then click Deploy.

Lambda deploy

The function should deploy successfully.

Lambda deploy success

Test

Upload files to the library folder of the S3 bucket. I uploaded some .iso files as well as a VM template. After uploading, I see that the Lambda function has generated the required .json files.

Indexes

Subscribe to the content library using the full URL to lib.json. In this case: https://kremerpt-content-library.s3.us-east-2.amazonaws.com/lib1/lib.json

Subscribe to S3 Content Library

Verify you can see the files that you uploaded. I see that all of the ISO files and the one template that I uploaded are available for use!

Content Library - ISO

Content Library - Template

AWS EXPERT
published 2 years ago, 2430 views