Direkt zum Inhalt

re:Invent 2025 - Real-time insights for smart manufacturing with AWS Serverless

Lesedauer: 8 Minute
Inhaltsebene: Fortgeschrittene
0

Unplanned downtime costs the top 500 manufacturers $1.4 trillion per year, yet most factories already have the sensors and data to prevent it. This session shows how to connect that data, encode expert knowledge into a fine-tuned AI model, and deliver real-time guidance to every operator on the floor using AWS Serverless services.

Getting real-time insights from a smart manufacturing environment requires more than collecting data. Most factory floors generate sensor readings, machine alarms, and production logs continuously, yet that information lives in separate databases with no way to correlate it. When something fails, costs accumulate by the minute while operators search for a root cause across disconnected systems. Mohamed Salah, Senior Solutions Architect at AWS, addressed this directly at re:Invent 2025. In this post, we'll walk through the serverless architecture he describes, covering how to unify data from diverse sources, fine-tune a generative AI model on your operational knowledge, and put real-time guidance in the hands of every operator on your floor.

Three root causes driving production downtime

The scale of the problem is significant. The top 500 manufacturers globally lose an estimated $1.4 trillion per year to unplanned production stoppages, representing roughly 11% of total revenues. That figure is comparable to the GDP of Spain.

Mohamed identifies three interconnected root causes. The first is data silos. A typical production line has four or five machines, each with its own data sources and its own database. The data exists, but there is no correlation across it. You can see what happened on a single machine, but you cannot trace a problem from end to end across your full production line.

The second root cause is a skill gap. Experienced operators develop an intuitive understanding of their equipment. They can tell from the sound of a mixer that it is running too long, or recognize from a temperature pattern that one side of an oven runs hotter than the other. That knowledge is valuable, but it stays with individuals and is rarely captured or shared. Junior operators receive the same alerts as their senior colleagues but lack the context to act with confidence.

The third root cause compounds the first two. When neither the data nor the expertise is available in an integrated form, production delays and order backlogs follow. The challenge is not that the data does not exist. The challenge is that it is disconnected, and the knowledge to interpret it is not available when and where you need it.

Connecting your data and building a fine-tuned AI model

The architecture starts with a data foundation designed to absorb the full variety of inputs on your factory floor. Production data does not arrive through a single protocol. Modern equipment uses IoT (Internet of Things) protocols like MQTT, while legacy systems transfer documents over SFTP (Secure File Transfer Protocol), and other machines use proprietary formats. Maintaining separate integration pipelines for each source does not scale.

The solution uses the AWS Garnet Framework, an open-source framework built on top of AWS IoT Core. Garnet provides a single unified API that accepts connections from these diverse source types and normalizes incoming data into a standard model. It builds a graph database of relationships between your machines, sensors, and events across the production line. Instead of five isolated machine databases, you have one connected representation of everything happening on your floor.

AWS Lambda functions handle the transformation step, converting incoming telemetry into the NGSI (Next Generation Service Interface) standard format before writing to storage. Amazon Simple Queue Service (Amazon SQS) decouples ingestion from storage, buffering records before they land in Amazon Simple Storage Service (Amazon S3) as a central data lake. A parallel path through Amazon API Gateway enables write-back actions, so you can send commands back to your machines based on what the AI detects. Reducing mixer torque, adjusting oven temperature, or modifying process parameters can be triggered programmatically in response to real-time conditions.

With the data lake in place, multiple services draw from the same source. Amazon QuickSight connects to it for real-time dashboards. A digital twin of your production line gives you a visual replica of the floor showing current machine states, active alarms, and quality inspection results in one interface. What previously required navigating multiple screens now appears in a single view.

Connecting data solves the visibility problem. It does not close the skill gap. For that, the architecture incorporates a custom AI model fine-tuned on the specific knowledge of your production environment, including machine manuals, standard operating procedures (SOPs), compliance requirements, safety classifications, and historical records.

The fine-tuning pipeline begins with raw documents (PDFs, CSVs, and text files) uploaded to Amazon S3. AWS Step Functions orchestrates the preparation process, reading each document, invoking a foundation model to generate structured training examples, and producing a separate set reserved for testing. The structure matters: fine-tuning on unstructured text produces inconsistent results. Well-formatted prompt-and-response pairs, where each response includes the appropriate severity level, safety categorization, and step-by-step instructions, give the model the scaffolding it needs to respond correctly under production conditions.

Amazon SageMaker AI runs the fine-tuning job using instructed fine-tuning. Once training completes, the resulting model is imported directly into Amazon Bedrock as a custom model. This removes the need to manage inference infrastructure. You do not provision compute instances, configure scaling, or monitor resource utilization for model serving. The imported model is available through the same Bedrock API endpoint as other foundation models.

The Converse API in Amazon Bedrock is particularly useful for managing model versions over time. It maintains a consistent API signature regardless of which model ID you use. If a newer version introduces problems after a production update, rolling back requires only changing the model ID in your configuration, with no application code changes needed.

Real-time guidance for every operator on your floor

With the fine-tuned model available in Amazon Bedrock, your operators have two ways to interact with it.

The first is a conversational web application. Built with React and hosted on Amazon S3, the app uses Lambda functions to call the Bedrock Converse API. Operators can type or speak questions in natural language and receive structured, compliance-aware responses based on current telemetry and the fine-tuned operational knowledge. This gives every operator on your floor access to the same quality of diagnostic guidance that would otherwise require calling a senior specialist.

The second mode runs automatically on a five-minute cycle. The system feeds fresh telemetry from the data lake into the model and asks it to analyze conditions against the fine-tuned knowledge base. When the model identifies a developing issue, it generates a recommendation that accounts for severity, applicable compliance requirements, and the specific steps needed to resolve the situation. Amazon Simple Notification Service (Amazon SNS) delivers that recommendation directly to the relevant operator, who can then act through the API Gateway write-back path or follow the guided instructions in the chat application.

Visual quality inspection runs in parallel. A vision language model (VLM) deployed as a small language model sits inline on your production line, inspecting each batch of output after the tunnel oven. In the session example, the VLM classifies cookies as cracked, misshaped, or containing air pockets. A confirmed defect triggers the full Bedrock pipeline, pulling machine alarms, telemetry readings, the relevant SOP, and equipment manuals together to generate a complete root cause analysis and recommended corrective action. Your operator receives a notification within seconds, with context and next steps already assembled.

The full implementation described in the session, from data ingestion through fine-tuned model deployment to a working operator chatbot, was completed in seven days.

From first deployment to continuous improvement

The architecture is designed to grow more capable over time. Each operator interaction produces a feedback signal. A thumbs-up or thumbs-down rating on a model response feeds back into the fine-tuning pipeline, generating updated training data and a refined model version. Over time, the recommendations become more accurate and more specific to the actual conditions of your production environment.

Starting small is the recommended approach. A single production line and a modest set of documents is enough to build an initial model and see the full architecture working end to end. Adding more machines, more data sources, and more operational documents progressively improves both the data foundation and the model built on top of it.

The end result is a manufacturing operation where your most experienced operators' knowledge is encoded, available on demand, and growing more capable with every shift. Your floor no longer depends on the right expert being physically present. The real-time insights that drive smart manufacturing decisions are built into the system itself.

Watch the full session on YouTube: Real-time insights for smart manufacturing with AWS Serverless (CNS375)