Lambda Docker Image with Puppeteer: Chinese Characters Not Displaying in Screenshots


Issue

I'm experiencing an issue with Chinese character rendering in a Lambda function deployed as a Docker image. The function uses Puppeteer and headless Chrome to take screenshots of web pages.

Environment

  • AWS Lambda with Docker image deployment
  • Puppeteer and headless Chrome for web page screenshots
  • Function that opens a web page and captures a screenshot
  • I use @sparticuz/chromium and puppeteer-core in my code
  • Here is my Dockerfile:
FROM public.ecr.aws/lambda/nodejs:22

RUN dnf install -y nspr nss google-noto-sans-cjk-fonts fontconfig
RUN fc-cache -fv

# Copy function code
COPY index.js ${LAMBDA_TASK_ROOT}
COPY package.json ${LAMBDA_TASK_ROOT}

RUN npm install .

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "index.handler" ]

Problem Details

When I run my Docker container locally and test the screenshot functionality, Chinese characters display correctly in the captured images. However, after deploying the same Docker image to AWS Lambda, the screenshots show blank spaces or boxes where Chinese characters should appear.

What I've Tried

The exact same Docker image works perfectly on my local machine, which suggests this might be related to the Lambda execution environment rather than my code or Docker configuration.

Questions

  1. Has anyone encountered similar issues with Chinese font rendering in headless Chrome on Lambda?
  2. Are there specific font packages or configurations needed for the Lambda environment to properly render Chinese characters?
  3. Could this be related to the Lambda execution environment's language settings or locale configuration?

Any insights or suggestions would be greatly appreciated!

asked 2 months ago, 66 views
1 Answer

Hello Allen, thank you for posting this question. I think we can help you overcome this obstacle. Let's unpack the issue:

First off, the problem likely stems from font handling in the Lambda environment. Here's a comprehensive solution approach:

Update Dockerfile

First, let's modify your Dockerfile to ensure proper font installation and configuration:

FROM public.ecr.aws/lambda/nodejs:22

# Install required packages and fonts
RUN dnf install -y \
    nspr \
    nss \
    google-noto-sans-cjk-fonts \
    google-noto-serif-cjk-fonts \
    fontconfig \
    liberation-fonts \
    && dnf clean all

# Configure fonts and update cache
RUN fc-cache -fv

# Set locale
ENV LANG=en_US.UTF-8
ENV LC_ALL=en_US.UTF-8

# Copy function code
COPY index.js ${LAMBDA_TASK_ROOT}
COPY package.json ${LAMBDA_TASK_ROOT}

RUN npm install

CMD [ "index.handler" ]

Puppeteer Configuration

Modify your Puppeteer setup to explicitly specify font preferences:

const browser = await puppeteer.launch({
  // With puppeteer-core, also pass executablePath as in the test script
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--font-render-hinting=medium',
    '--enable-font-antialiasing',
    '--disable-gpu',
    // May be ignored if this Chromium build does not recognize the switch
    '--default-font-family=Noto Sans CJK SC'
  ],
  defaultViewport: { width: 1920, height: 1080 }
});

Font Verification

Add this debugging code to verify font availability:

const page = await browser.newPage();

// Return the result to Node: console.log inside page.evaluate writes to the
// page's console, not to the Lambda logs.
const fontAvailable = await page.evaluate(async () => {
  await document.fonts.ready;
  return document.fonts.check('12px "Noto Sans CJK SC"');
});
console.log('Font availability:', fontAvailable);

Additional Considerations

  • Memory: ensure your Lambda function has sufficient memory (at least 1024 MB) for Chromium and font processing
  • Timeout: set an appropriate timeout for your function (30 seconds minimum recommended)
  • Image size: check that your container image is within Lambda's limit (10 GB for container images)
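
To see how close a run gets to the memory and timeout settings above, a small helper can log a snapshot from inside the handler. This is a minimal sketch: `resourceSnapshot` is a hypothetical name, and it relies on the standard Lambda `context` fields `memoryLimitInMB` and `getRemainingTimeInMillis()`. Note that `process.memoryUsage().rss` covers only the Node.js process, not the Chromium child processes, so treat it as a lower bound.

```javascript
// Hypothetical helper: summarize memory and time budget for the current invocation.
function resourceSnapshot(context) {
  const toMiB = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    // Resident set size of the Node.js process only (Chromium is not included)
    rssMiB: toMiB(process.memoryUsage().rss),
    // Configured memory for the function, as reported by the Lambda context
    limitMiB: Number(context.memoryLimitInMB),
    // Milliseconds left before Lambda times the invocation out
    remainingMs: context.getRemainingTimeInMillis(),
  };
}

// Local demonstration with a stand-in context object
const fakeContext = {
  memoryLimitInMB: '1024',
  getRemainingTimeInMillis: () => 25000,
};
console.log(JSON.stringify(resourceSnapshot(fakeContext)));
```

In a real handler you would accept `context` as the second handler argument and log the snapshot before and after taking the screenshot.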

Lambda Configuration

Make sure your Lambda function has these environment variables:

LANG=en_US.UTF-8
LC_ALL=en_US.UTF-8
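
To confirm that the deployed function actually sees these values, you could log a quick check at startup. The sketch below is illustrative only: `checkLocaleEnv` is a hypothetical helper, and printing `FONTCONFIG_PATH` (a standard fontconfig variable) is an extra diagnostic that is not part of the steps above. It can help when comparing the local container with the Lambda environment.

```javascript
// Hypothetical helper: report locale variables that are missing or not UTF-8.
function checkLocaleEnv(env) {
  const warnings = [];
  for (const name of ['LANG', 'LC_ALL']) {
    const value = env[name];
    if (!value) {
      warnings.push(`${name} is not set`);
    } else if (!/utf-?8/i.test(value)) {
      warnings.push(`${name}=${value} is not a UTF-8 locale`);
    }
  }
  return warnings;
}

console.log('Locale warnings:', checkLocaleEnv(process.env));
// Assumption: where fontconfig looks for its configuration may differ between
// environments, so log it alongside the locale values.
console.log('FONTCONFIG_PATH:', process.env.FONTCONFIG_PATH || '(unset)');
```

An empty warnings array means both variables are set to a UTF-8 locale.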

Testing Script

Here's a complete test script to verify the setup:

const puppeteer = require('puppeteer-core');
const chromium = require('@sparticuz/chromium');

exports.handler = async (event) => {
  let browser = null;

  try {
    browser = await puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      // executablePath is an async function in current @sparticuz/chromium
      // releases; very old releases exposed it as a property instead.
      executablePath: await chromium.executablePath(),
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    });

    const page = await browser.newPage();

    // Force font loading
    await page.setContent(`
      <html>
        <head>
          <style>
            body { font-family: 'Noto Sans CJK SC', sans-serif; }
          </style>
        </head>
        <body>
          <h1>测试中文</h1>
        </body>
      </html>
    `);

    // Wait for fonts to load
    await page.evaluate(() => document.fonts.ready);

    const screenshot = await page.screenshot();

    return {
      statusCode: 200,
      // Buffer.from works whether screenshot() returns a Buffer or a Uint8Array
      body: Buffer.from(screenshot).toString('base64'),
      isBase64Encoded: true
    };
  } catch (error) {
    console.error('Error:', error);
    throw error;
  } finally {
    if (browser !== null) {
      await browser.close();
    }
  }
};
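
When invoking the test handler, it helps to confirm that the returned body really is a PNG before inspecting the glyphs. The following is a rough sketch, assuming the response shape returned by the script above (`body` plus `isBase64Encoded`); `decodeScreenshot` is a hypothetical helper, and it reads the image dimensions from the PNG IHDR chunk.

```javascript
// The fixed 8-byte signature that starts every PNG file
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decode the handler response and sanity-check the image.
function decodeScreenshot(response) {
  if (!response.isBase64Encoded) {
    throw new Error('expected a base64-encoded body');
  }
  const png = Buffer.from(response.body, 'base64');
  if (png.length < 24 || !png.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error('body is not a PNG image');
  }
  // IHDR is the first chunk: width and height are big-endian uint32 values
  // at byte offsets 16 and 20.
  return { png, width: png.readUInt32BE(16), height: png.readUInt32BE(20) };
}

// Demonstration with a synthetic 24-byte PNG header (800 x 600)
const header = Buffer.alloc(24);
PNG_SIGNATURE.copy(header, 0);
header.writeUInt32BE(13, 8);
header.write('IHDR', 12, 'ascii');
header.writeUInt32BE(800, 16);
header.writeUInt32BE(600, 20);
const demo = decodeScreenshot({ isBase64Encoded: true, body: header.toString('base64') });
console.log(`decoded ${demo.width}x${demo.height}`);
```

The returned `png` buffer can then be written to a file and opened locally to compare the Lambda output with the local container output.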

Troubleshooting

If issues persist:

  • Check Lambda logs for font-related errors
  • Verify font installation in the container using fc-list
  • Try different Chinese fonts (e.g., WenQuanYi Micro Hei)
  • Consider using a custom runtime with pre-installed fonts
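
The fc-list check can also be run from inside the function itself, so the result shows up in the Lambda logs rather than only in a local shell. This is a sketch under a few assumptions: `fc-list` is on the PATH in the image, the `:lang=zh` pattern selects fonts that declare Chinese coverage, and the helper names `parseFcList` and `listChineseFonts` are made up for illustration.

```javascript
const { execFileSync } = require('child_process');

// Parse "file<TAB>family[,family...]" lines into { file, families } records.
function parseFcList(output) {
  return output
    .split('\n')
    .filter((line) => line.includes('\t'))
    .map((line) => {
      const [file, families] = line.split('\t');
      return {
        file: file.trim(),
        families: families.split(',').map((f) => f.trim()).filter(Boolean),
      };
    });
}

// Ask fontconfig which installed fonts claim Chinese coverage.
// Returns an empty array if fc-list is missing or fails.
function listChineseFonts() {
  try {
    const output = execFileSync(
      'fc-list',
      ['--format', '%{file}\t%{family}\n', ':lang=zh'],
      { encoding: 'utf8' }
    );
    return parseFcList(output);
  } catch (err) {
    console.error('fc-list failed:', err.message);
    return [];
  }
}

console.log('Chinese-capable fonts:', JSON.stringify(listChineseFonts()));
```

If the list is empty in Lambda but populated locally, the fonts are installed in the image but fontconfig is not finding them in the Lambda environment, which narrows the problem to fontconfig paths or configuration.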

Also, remember to:

  • Keep your Lambda warm to avoid cold starts affecting font loading
  • Monitor memory usage during font processing
  • Test with various Chinese character sets to ensure comprehensive support
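
For the last point, one option is to generate a single test page that covers several kinds of Chinese text and screenshot it in both environments. The sketch below is only an illustration: `buildTestPage` and the sample strings are made up for this example, and the font name is the one used earlier in this answer.

```javascript
// Sample strings chosen for illustration; extend with text from your own pages.
const SAMPLES = [
  { lang: 'zh-CN', label: 'Simplified', text: '测试中文' },
  { lang: 'zh-TW', label: 'Traditional', text: '測試中文' },
  { lang: 'zh-CN', label: 'Punctuation', text: '你好，世界！（全角标点）' },
];

// Build an HTML document with one paragraph per sample.
function buildTestPage(samples = SAMPLES, fontFamily = 'Noto Sans CJK SC') {
  const rows = samples
    .map((s) => `    <p lang="${s.lang}">${s.label}: ${s.text}</p>`)
    .join('\n');
  return [
    '<!DOCTYPE html>',
    '<html>',
    '  <head>',
    '    <meta charset="utf-8">',
    `    <style>body { font-family: '${fontFamily}', sans-serif; }</style>`,
    '  </head>',
    '  <body>',
    rows,
    '  </body>',
    '</html>',
  ].join('\n');
}

console.log(buildTestPage());
```

The resulting string can be passed to `page.setContent(...)` in place of the inline HTML in the test script.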

This solution should resolve the Chinese character rendering issues in your Lambda function. Please feel free to respond here if the issue persists.

Thank you for using AWS!

Brian

answered a month ago
