Learn Agentic AI by Building One — A Hands-On Guide - Lesson 7: A Real Agent — Multiple Tools, Reasoning, and Autonomy
Introduction to how Agentic AI works behind the scenes
Now let's put it all together into a proper agent that:
- Has multiple diverse tools (data queries, statistics, time calculations)
- Chains tool calls to solve complex questions
- Maintains conversation memory across questions
- Shows its reasoning
This is essentially how tools like Kiro, GitHub Copilot agents, and enterprise AI assistants work under the hood.
┌─────────────────────────────────────────────────┐
│              THE AGENTIC AI STACK               │
│                                                 │
│  ┌───────────┐  System prompt = personality     │
│  │  PERSONA  │  (Lesson 2)                      │
│  └─────┬─────┘                                  │
│        ↓                                        │
│  ┌───────────┐  Message history = context       │
│  │  MEMORY   │  (Lesson 4)                      │
│  └─────┬─────┘                                  │
│        ↓                                        │
│  ┌───────────┐  Functions the AI can call       │
│  │   TOOLS   │  (Lesson 5)                      │
│  └─────┬─────┘                                  │
│        ↓                                        │
│  ┌───────────┐  Think → Act → Observe → Repeat  │
│  │   LOOP    │  (Lesson 6)                      │
│  └─────┬─────┘                                  │
│        ↓                                        │
│      AGENT                                      │
└─────────────────────────────────────────────────┘
The Tools
A realistic set of capabilities — search, statistics, resolution time calculations, and a date tool:
import json
import datetime
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

CASES_DB = [
    {"id": "CASE-001", "subject": "EC2 Instance Rebooted", "severity": "Urgent", "service": "EC2",
     "account": "account1-production", "created": "2025-12-31", "resolved": "2026-01-10", "owner": "engineer1"},
    {"id": "CASE-002", "subject": "RDS Connection Timeout", "severity": "Urgent", "service": "RDS Proxy",
     "account": "account2-production", "created": "2025-12-30", "resolved": "2026-03-08", "owner": "engineer2"},
    {"id": "CASE-003", "subject": "EFS Throughput High", "severity": "Urgent", "service": "EFS",
     "account": "account3-production", "created": "2025-12-30", "resolved": "2026-01-09", "owner": "engineer3"},
    {"id": "CASE-004", "subject": "S3 GuardDuty Finding", "severity": "High", "service": "S3",
     "account": "account4-production", "created": "2025-12-16", "resolved": "2025-12-20", "owner": "engineer2"},
    {"id": "CASE-005", "subject": "Lambda Memory Limit", "severity": "Low", "service": "Lambda",
     "account": "account5-innovation", "created": "2025-12-16", "resolved": "2025-12-28", "owner": "engineer3"},
    {"id": "CASE-006", "subject": "EKS Pods Failing", "severity": "Normal", "service": "EKS",
     "account": "account3-innovation", "created": "2025-12-30", "resolved": "2026-01-10", "owner": "engineer3"},
    {"id": "CASE-007", "subject": "MSK Cluster Stuck in Upgrade", "severity": "Low", "service": "MSK",
     "account": "account2-non-production", "created": "2025-12-22", "resolved": "2026-01-07", "owner": "engineer4"},
    {"id": "CASE-008", "subject": "Aurora Failover RCA", "severity": "Low", "service": "Aurora",
     "account": "account2-production", "created": "2025-12-19", "resolved": "2026-01-23", "owner": "engineer4"},
    {"id": "CASE-009", "subject": "IAM Access Key Investigation", "severity": "High", "service": "IAM",
     "account": "account4-production", "created": "2025-12-16", "resolved": "2025-12-26", "owner": "engineer5"},
    {"id": "CASE-010", "subject": "Bedrock CloudWatch Metrics Missing", "severity": "Normal", "service": "Bedrock",
     "account": "account3-innovation", "created": "2025-12-16", "resolved": "2026-01-01", "owner": "engineer5"},
]


def search_cases(severity: str = None, service: str = None, account: str = None) -> str:
    """Search and filter support cases."""
    cases = CASES_DB
    if severity:
        cases = [c for c in cases if c["severity"].lower() == severity.lower()]
    if service:
        cases = [c for c in cases if service.lower() in c["service"].lower()]
    if account:
        cases = [c for c in cases if account.lower() in c["account"].lower()]
    return json.dumps(cases, indent=2)


def get_statistics(field: str) -> str:
    """Get aggregated statistics: count by severity, service, account, or owner."""
    counts = {}
    for c in CASES_DB:
        key = c.get(field, "unknown")
        counts[key] = counts.get(key, 0) + 1
    sorted_counts = dict(sorted(counts.items(), key=lambda x: x[1], reverse=True))
    return json.dumps(sorted_counts, indent=2)


def calculate_resolution_time(case_id: str = None, service: str = None, severity: str = None) -> str:
    """Calculate resolution time in days for a specific case or average by service/severity."""
    if case_id:
        case = next((c for c in CASES_DB if c["id"] == case_id), None)
        if not case:
            return f"Case {case_id} not found"
        days = (datetime.date.fromisoformat(case["resolved"])
                - datetime.date.fromisoformat(case["created"])).days
        return f"Case {case_id}: {days} days to resolve"

    cases = CASES_DB
    if service:
        cases = [c for c in cases if service.lower() in c["service"].lower()]
    if severity:
        cases = [c for c in cases if c["severity"].lower() == severity.lower()]
    if not cases:
        return "No matching cases"

    times = [(datetime.date.fromisoformat(c["resolved"])
              - datetime.date.fromisoformat(c["created"])).days for c in cases]
    return json.dumps({"avg_days": round(sum(times)/len(times), 1), "min_days": min(times),
                       "max_days": max(times), "case_count": len(times)})


def get_current_date() -> str:
    """Get today's date."""
    return datetime.date.today().isoformat()


TOOL_FUNCTIONS = {
    "search_cases": search_cases,
    "get_statistics": get_statistics,
    "calculate_resolution_time": calculate_resolution_time,
    "get_current_date": get_current_date,
}
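Before handing these functions to a model, it can help to check the date arithmetic by hand. The sketch below is self-contained: it copies two rows from CASES_DB and uses a hypothetical helper, resolution_days, which is not part of the lesson's code but mirrors the subtraction inside calculate_resolution_time.

```python
import datetime

# Two rows copied from CASES_DB, trimmed to the fields used here.
SAMPLE_CASES = [
    {"id": "CASE-002", "created": "2025-12-30", "resolved": "2026-03-08"},
    {"id": "CASE-004", "created": "2025-12-16", "resolved": "2025-12-20"},
]


def resolution_days(case: dict) -> int:
    """Hypothetical helper: whole days between 'created' and 'resolved'."""
    created = datetime.date.fromisoformat(case["created"])
    resolved = datetime.date.fromisoformat(case["resolved"])
    return (resolved - created).days


for case in SAMPLE_CASES:
    print(case["id"], resolution_days(case))
# CASE-002 68
# CASE-004 4
```

These are the same 68-day and 4-day figures the agent reports later for RDS Proxy and S3, so any mismatch in a run would point at the model's summary rather than the tool.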
The Tool Descriptions and System Prompt
TOOLS = [
    {
        "name": "search_cases",
        "description": "Search support cases. Filter by severity, service, or account name.",
        "input_schema": {
            "type": "object",
            "properties": {
                "severity": {"type": "string"},
                "service": {"type": "string"},
                "account": {"type": "string"},
            },
            "required": [],
        },
    },
    {
        "name": "get_statistics",
        "description": "Get case counts grouped by a field: severity, service, account, or owner",
        "input_schema": {
            "type": "object",
            "properties": {
                "field": {"type": "string", "enum": ["severity", "service", "account", "owner"]},
            },
            "required": ["field"],
        },
    },
    {
        "name": "calculate_resolution_time",
        "description": "Calculate resolution time. Provide case_id for one case, or filter by service and/or severity for averages.",
        "input_schema": {
            "type": "object",
            "properties": {
                "case_id": {"type": "string"},
                "service": {"type": "string"},
                "severity": {"type": "string", "enum": ["Urgent", "High", "Normal", "Low"]},
            },
            "required": [],
        },
    },
    {
        "name": "get_current_date",
        "description": "Get today's date",
        "input_schema": {"type": "object", "properties": {}, "required": []},
    },
]

SYSTEM_PROMPT = """You are an expert support operations analyst agent.
You help teams understand their support case data.

You have tools to search cases, get statistics, and calculate resolution times.
Use them to give accurate, data-driven answers.

When answering:
- Always use tools to get real data; never guess
- If a question needs multiple data points, call multiple tools
- Show your reasoning briefly
- Be concise and actionable"""
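The tool descriptions and the Python functions are maintained separately, so they can drift apart: a schema property the function does not accept will fail when the agent calls it. One way to catch that early is a small consistency check. This is a sketch, not part of the lesson's code; check_tools and the miniature demo registry are hypothetical names introduced for illustration.

```python
import inspect


def check_tools(tools: list, functions: dict) -> list:
    """Return a list of mismatches between tool schemas and Python functions."""
    problems = []
    for tool in tools:
        name = tool["name"]
        func = functions.get(name)
        if func is None:
            problems.append(f"{name}: no function registered")
            continue
        # Compare the property names in the schema with the function's parameters.
        declared = set(tool["input_schema"]["properties"])
        accepted = set(inspect.signature(func).parameters)
        if declared != accepted:
            problems.append(f"{name}: schema {sorted(declared)} != signature {sorted(accepted)}")
    return problems


# Hypothetical miniature registry, standing in for TOOLS and TOOL_FUNCTIONS.
def get_statistics(field: str) -> str:
    return field


demo_tools = [
    {"name": "get_statistics",
     "input_schema": {"type": "object", "properties": {"field": {"type": "string"}}, "required": ["field"]}},
    {"name": "get_weather",
     "input_schema": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}},
]
demo_functions = {"get_statistics": get_statistics}

print(check_tools(demo_tools, demo_functions))
# ['get_weather: no function registered']
```

With the lesson's own definitions, calling check_tools(TOOLS, TOOL_FUNCTIONS) at startup should return an empty list.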
The Full Agent Class
This combines everything — memory (Lesson 4), tools (Lesson 5), and the agent loop (Lesson 6):
class Agent:
    """A complete agent with memory, tools, and an autonomous loop."""

    def __init__(self):
        self.messages = []  # Conversation memory

    def run(self, user_input: str) -> str:
        print(f"🧑 You: {user_input}")
        self.messages.append({"role": "user", "content": user_input})

        step = 0
        while step < 10:  # Hard cap so a confused model cannot loop forever
            step += 1
            response = bedrock.invoke_model(
                modelId="us.anthropic.claude-sonnet-4-20250514-v1:0",
                contentType="application/json",
                accept="application/json",
                body=json.dumps({
                    "anthropic_version": "bedrock-2023-05-31",
                    "max_tokens": 2048,
                    "system": SYSTEM_PROMPT,
                    "tools": TOOLS,
                    "messages": self.messages,
                }),
            )
            result = json.loads(response["body"].read())

            # Keep the assistant turn in memory, whether it is an answer or a tool request
            self.messages.append({"role": "assistant", "content": result["content"]})

            # Done: any stop reason other than "tool_use" ends the loop,
            # so we never send back an empty tool_result message
            if result["stop_reason"] != "tool_use":
                text_parts = [b["text"] for b in result["content"] if b["type"] == "text"]
                final = "\n".join(text_parts)
                print(f"\n🤖 Agent ({step} step{'s' if step > 1 else ''}):\n{final}")
                return final

            # Tool calls needed: run each one and collect the results
            tool_results = []
            for block in result["content"]:
                if block["type"] == "tool_use":
                    name, args = block["name"], block["input"]
                    print(f" 🔧 Step {step}: {name}({json.dumps(args)})")
                    try:
                        output = TOOL_FUNCTIONS[name](**args)
                    except Exception as exc:  # Unknown tool name or bad arguments
                        output = f"Tool error: {exc}"
                    print(f" → {output[:120]}{'...' if len(output) > 120 else ''}")
                    tool_results.append({"type": "tool_result", "tool_use_id": block["id"], "content": output})
            self.messages.append({"role": "user", "content": tool_results})

        return "Reached max steps."
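The loop can also be exercised without calling Bedrock at all, which is handy for checking the control flow. The sketch below is an assumption-laden stand-in, not the lesson's code: FakeBedrock, count_cases, and run_loop are hypothetical names, and the scripted replies only imitate the response fields the Agent class reads (stop_reason and content).

```python
import io
import json


class FakeBedrock:
    """Hypothetical stand-in for the bedrock-runtime client: replays scripted replies."""

    def __init__(self, scripted):
        self.scripted = list(scripted)
        self.calls = 0

    def invoke_model(self, **kwargs):
        reply = self.scripted[self.calls]
        self.calls += 1
        # The real client returns a streaming body; BytesIO offers the same .read().
        return {"body": io.BytesIO(json.dumps(reply).encode("utf-8"))}


def count_cases(severity: str) -> str:
    """Toy tool used only in this sketch."""
    return json.dumps({"severity": severity, "count": 3})


def run_loop(client, user_input, tools, max_steps=10):
    """Condensed version of the Think -> Act -> Observe loop."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        response = client.invoke_model(body=json.dumps({"messages": messages}))
        result = json.loads(response["body"].read())
        messages.append({"role": "assistant", "content": result["content"]})
        if result["stop_reason"] != "tool_use":
            return "\n".join(b["text"] for b in result["content"] if b["type"] == "text")
        # Run every requested tool and feed the outputs back as a user turn.
        tool_results = [
            {"type": "tool_result", "tool_use_id": b["id"], "content": tools[b["name"]](**b["input"])}
            for b in result["content"] if b["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})
    return "Reached max steps."


scripted = [
    {"stop_reason": "tool_use", "content": [
        {"type": "tool_use", "id": "t1", "name": "count_cases", "input": {"severity": "Urgent"}}]},
    {"stop_reason": "end_turn", "content": [{"type": "text", "text": "There are 3 Urgent cases."}]},
]
client = FakeBedrock(scripted)
answer = run_loop(client, "How many urgent cases?", {"count_cases": count_cases})
print(answer, client.calls)
# There are 3 Urgent cases. 2
```

Two model calls for one tool use matches the "2 steps" pattern in the outputs that follow: one call to request the tool, one to write the answer.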
Running the Agent
agent = Agent()
Question 1: Overview
agent.run("Give me a quick overview of our case distribution by severity.")
Output:
🧑 You: Give me a quick overview of our case distribution by severity.
🔧 Step 1: get_statistics({"field": "severity"})
→ {
"Urgent": 3,
"Low": 3,
"High": 2,
"Normal": 2
}
🤖 Agent (2 steps):
Here's your case distribution by severity:
- Urgent: 3 cases (30%)
- Low: 3 cases (30%)
- High: 2 cases (20%)
- Normal: 2 cases (20%)
Total: 10 cases
Key insights:
- You have an equal split between Urgent and Low severity cases (60% combined)
- High and Normal severity cases each represent 20% of your caseload
- The high proportion of Urgent cases (30%) may indicate either genuine critical issues or potential mis-categorization that should be reviewed
Question 2: Worst resolution time
agent.run("Which service takes the longest to resolve? Show me the numbers.")
Output:
🧑 You: Which service takes the longest to resolve? Show me the numbers.
🔧 Step 1: get_statistics({"field": "service"})
→ {
"EC2": 1,
"RDS Proxy": 1,
"EFS": 1,
"S3": 1,
"Lambda": 1,
"EKS": 1,
"MSK": 1,
"Aurora": 1,
"IAM": 1,...
🔧 Step 2: calculate_resolution_time({"service": "EC2"})
→ {"avg_days": 10.0, "min_days": 10, "max_days": 10, "case_count": 1}
🔧 Step 2: calculate_resolution_time({"service": "RDS Proxy"})
→ {"avg_days": 68.0, "min_days": 68, "max_days": 68, "case_count": 1}
🔧 Step 2: calculate_resolution_time({"service": "EFS"})
→ {"avg_days": 10.0, "min_days": 10, "max_days": 10, "case_count": 1}
🔧 Step 2: calculate_resolution_time({"service": "S3"})
→ {"avg_days": 4.0, "min_days": 4, "max_days": 4, "case_count": 1}
🔧 Step 2: calculate_resolution_time({"service": "Lambda"})
→ {"avg_days": 12.0, "min_days": 12, "max_days": 12, "case_count": 1}
🔧 Step 3: calculate_resolution_time({"service": "EKS"})
→ {"avg_days": 11.0, "min_days": 11, "max_days": 11, "case_count": 1}
🔧 Step 3: calculate_resolution_time({"service": "MSK"})
→ {"avg_days": 16.0, "min_days": 16, "max_days": 16, "case_count": 1}
🔧 Step 3: calculate_resolution_time({"service": "Aurora"})
→ {"avg_days": 35.0, "min_days": 35, "max_days": 35, "case_count": 1}
🔧 Step 3: calculate_resolution_time({"service": "IAM"})
→ {"avg_days": 10.0, "min_days": 10, "max_days": 10, "case_count": 1}
🔧 Step 3: calculate_resolution_time({"service": "Bedrock"})
→ {"avg_days": 16.0, "min_days": 16, "max_days": 16, "case_count": 1}
🤖 Agent (4 steps):
Here are the resolution times by service, ranked from longest to shortest:
Longest Resolution Times:
- RDS Proxy: 68 days
- Aurora: 35 days
- MSK: 16 days
- Bedrock: 16 days
- Lambda: 12 days
- EKS: 11 days
- EC2: 10 days (tied)
- EFS: 10 days (tied)
- IAM: 10 days (tied)
- S3: 4 days
Key findings:
- RDS Proxy takes by far the longest at 68 days - nearly double the second-longest service
- Aurora (35 days) is the second most problematic
- Database-related services (RDS Proxy, Aurora) dominate the top of the list
- S3 is your fastest service at just 4 days
- There's a significant performance gap between your slowest (68 days) and fastest (4 days) services
Recommendation: Focus improvement efforts on RDS Proxy and Aurora cases - they're taking 4-17x longer than your faster services.
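In the trace above, the agent needed ten separate calculate_resolution_time calls, one per service, because no tool returns a per-service breakdown. One possible alternative is a grouped tool that ranks all services in a single call. The sketch below is an illustration only: resolution_time_by_service is a hypothetical tool that is not part of the lesson, and CASES holds four rows copied from CASES_DB.

```python
import datetime
import json
from collections import defaultdict

# Four rows copied from CASES_DB, trimmed to the fields used here.
CASES = [
    {"service": "EC2", "created": "2025-12-31", "resolved": "2026-01-10"},
    {"service": "RDS Proxy", "created": "2025-12-30", "resolved": "2026-03-08"},
    {"service": "S3", "created": "2025-12-16", "resolved": "2025-12-20"},
    {"service": "Aurora", "created": "2025-12-19", "resolved": "2026-01-23"},
]


def resolution_time_by_service(cases=CASES) -> str:
    """Hypothetical tool: average resolution days per service, slowest first."""
    days_by_service = defaultdict(list)
    for c in cases:
        delta = datetime.date.fromisoformat(c["resolved"]) - datetime.date.fromisoformat(c["created"])
        days_by_service[c["service"]].append(delta.days)
    averages = {s: round(sum(d) / len(d), 1) for s, d in days_by_service.items()}
    ranked = dict(sorted(averages.items(), key=lambda kv: kv[1], reverse=True))
    return json.dumps(ranked)


print(resolution_time_by_service())
# {"RDS Proxy": 68.0, "Aurora": 35.0, "EC2": 10.0, "S3": 4.0}
```

The trade-off is the usual one for agent tools: fewer, coarser tools mean fewer model round trips, while smaller tools leave more of the composition to the model.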
Question 3: Follow-up (tests memory)
This one requires the agent to remember the previous answer:
agent.run("For that slowest service, show me the actual cases.")
Output:
🧑 You: For that slowest service, show me the actual cases.
🔧 Step 1: search_cases({"service": "RDS Proxy"})
→ [
{
"id": "CASE-002",
"subject": "RDS Connection Timeout",
"severity": "Urgent",
"service": "RDS Proxy...
🤖 Agent (2 steps):
Here's the RDS Proxy case that's causing the 68-day resolution time:
CASE-002: RDS Connection Timeout
- Severity: Urgent
- Account: account2-production
- Owner: engineer1
- Created: December 30, 2025
- Resolved: March 8, 2026
- Resolution Time: 68 days
Key concerns:
- This was marked as Urgent severity but took over 2 months to resolve
- It's a production account case, which makes the long resolution time particularly problematic
- Connection timeout issues typically should be resolved much faster, especially for urgent cases
Recommendation: This case warrants a post-mortem review to understand why an urgent RDS connectivity issue took 68 days to resolve in a production environment. This could indicate process gaps, resource constraints, or technical complexity that needs addressing.
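The agent can resolve "that slowest service" only because the earlier question, tool calls, tool results, and answer are all still in self.messages when the follow-up arrives. A quick way to see this is to print the shape of the history. The sketch below uses a hypothetical helper, summarize_history, and a hand-written history that imitates the message format used by the Agent class; the contents are abbreviated for illustration.

```python
def summarize_history(messages: list) -> list:
    """Return one short line per message: role plus the block types it carries."""
    lines = []
    for m in messages:
        content = m["content"]
        if isinstance(content, str):
            kinds = "text"  # Plain user input is stored as a string
        else:
            kinds = ", ".join(block["type"] for block in content)
        lines.append(f'{m["role"]}: {kinds}')
    return lines


# Hand-written history in the same shape the agent builds up.
history = [
    {"role": "user", "content": "Which service takes the longest to resolve?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "calculate_resolution_time",
         "input": {"service": "RDS Proxy"}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "t1", "content": '{"avg_days": 68.0}'}]},
    {"role": "assistant", "content": [{"type": "text", "text": "RDS Proxy is slowest at 68 days."}]},
    {"role": "user", "content": "For that slowest service, show me the actual cases."},
]

for line in summarize_history(history):
    print(line)
# user: text
# assistant: tool_use
# user: tool_result
# assistant: text
# user: text
```

Everything above the last line is sent again with the new question, which is all "memory" means here.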
Question 4: Complex multi-step reasoning
agent.run("Compare production vs non-production accounts. Which has more urgent cases and worse resolution times?")
Output:
🧑 You: Compare production vs non-production accounts. Which has more urgent cases and worse resolution times?
🔧 Step 1: get_statistics({"field": "account"})
→ {
"account2-production": 2,
"account4-production": 2,
"account3-innovation": 2,
"account1-production": 1,
"acc...
🔧 Step 2: search_cases({"severity": "Urgent"})
→ [
{
"id": "CASE-001",
"subject": "EC2 Instance Rebooted",
"severity": "Urgent",
"service": "EC2",
...
🔧 Step 3: search_cases({"account": "account2-production"})
→ [
{
"id": "CASE-002",
"subject": "RDS Connection Timeout",
"severity": "Urgent",
"service": "RDS Proxy...
🔧 Step 3: search_cases({"account": "account4-production"})
→ [
{
"id": "CASE-004",
"subject": "S3 GuardDuty Finding",
"severity": "High",
"service": "S3",
"acc...
🔧 Step 3: search_cases({"account": "account1-production"})
→ [
{
"id": "CASE-001",
"subject": "EC2 Instance Rebooted",
"severity": "Urgent",
"service": "EC2",
...
🔧 Step 3: search_cases({"account": "account3-production"})
→ [
{
"id": "CASE-003",
"subject": "EFS Throughput High",
"severity": "Urgent",
"service": "EFS",
"a...
🔧 Step 4: search_cases({"account": "account3-innovation"})
→ [
{
"id": "CASE-006",
"subject": "EKS Pods Failing",
"severity": "Normal",
"service": "EKS",
"acco...
🔧 Step 4: search_cases({"account": "account5-innovation"})
→ [
{
"id": "CASE-005",
"subject": "Lambda Memory Limit",
"severity": "Low",
"service": "Lambda",
"a...
🔧 Step 4: search_cases({"account": "account2-non-production"})
→ [
{
"id": "CASE-007",
"subject": "MSK Cluster Stuck in Upgrade",
"severity": "Low",
"service": "MSK",
...
🤖 Agent (5 steps):
Here's the comparison between production and non-production accounts:
Case Distribution by Account Type:
Production Accounts (6 cases total):
- account2-production: 2 cases
- account4-production: 2 cases
- account1-production: 1 case
- account3-production: 1 case
Non-Production Accounts (4 cases total):
- account3-innovation: 2 cases
- account5-innovation: 1 case
- account2-non-production: 1 case
Urgent Cases Analysis:
Production: 3 out of 3 urgent cases (100%)
- CASE-001: EC2 Instance Rebooted (10 days)
- CASE-002: RDS Connection Timeout (68 days)
- CASE-003: EFS Throughput High (10 days)
Non-Production: 0 urgent cases
Resolution Time Analysis:
Production Accounts:
- Average: ~22 days
- Range: 4-68 days
- Worst case: 68 days (RDS Proxy timeout)
Non-Production Accounts:
- Average: ~13 days
- Range: 11-16 days
- More consistent performance
Key Findings:
- Production accounts have ALL the urgent cases - this makes sense but shows higher criticality
- Production accounts have worse resolution times on average (22 vs 13 days)
- Production accounts show higher variability - from 4 days to 68 days
- The single worst case (68 days) is in production, which is particularly concerning
Recommendation: Focus improvement efforts on production account case handling, especially urgent cases. The 68-day RDS Proxy case in production needs immediate process review.
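Because the final averages are computed by the model from tool output rather than by a tool, it is worth recomputing them directly. The sketch below does that with the account and date columns copied from CASES_DB. The helpers is_production, days, and summarize are hypothetical, and the rule for what counts as production is an assumption. Note that search_cases matches account names by substring, so filtering on "production" would also return account2-non-production.

```python
import datetime

# (account, created, resolved) copied from CASES_DB, CASE-001 through CASE-010.
ROWS = [
    ("account1-production", "2025-12-31", "2026-01-10"),
    ("account2-production", "2025-12-30", "2026-03-08"),
    ("account3-production", "2025-12-30", "2026-01-09"),
    ("account4-production", "2025-12-16", "2025-12-20"),
    ("account5-innovation", "2025-12-16", "2025-12-28"),
    ("account3-innovation", "2025-12-30", "2026-01-10"),
    ("account2-non-production", "2025-12-22", "2026-01-07"),
    ("account2-production", "2025-12-19", "2026-01-23"),
    ("account4-production", "2025-12-16", "2025-12-26"),
    ("account3-innovation", "2025-12-16", "2026-01-01"),
]


def is_production(account: str) -> bool:
    # Assumption: production accounts end in "-production" but not "-non-production".
    return account.endswith("-production") and not account.endswith("-non-production")


def days(created: str, resolved: str) -> int:
    return (datetime.date.fromisoformat(resolved) - datetime.date.fromisoformat(created)).days


def summarize(rows) -> dict:
    times = [days(created, resolved) for _, created, resolved in rows]
    return {"count": len(times), "avg_days": round(sum(times) / len(times), 1),
            "min": min(times), "max": max(times)}


prod = summarize([r for r in ROWS if is_production(r[0])])
non_prod = summarize([r for r in ROWS if not is_production(r[0])])
print("production:", prod)
print("non-production:", non_prod)
```

Under these assumptions the script gives 6 production cases ranging from 4 to 68 days and 4 non-production cases ranging from 11 to 16 days, with averages close to 23 and 14 days. That agrees with the agent's counts and ranges, and its "~22" and "~13" are the same averages rounded down.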
What You've Learned
Congratulations — you've built an AI agent from scratch and understood every piece of the stack:
| Lesson | Concept |
|---|---|
| 1 | LLMs are text-in, text-out functions |
| 2 | System prompts control AI behavior |
| 3 | AI can return structured data (JSON) |
| 4 | Memory = you managing a message list |
| 5 | Tools = AI requesting YOUR code to run |
| 6 | Agent loop = Think → Act → Observe → Repeat |
| 7 | Put it all together = a real agent |
Every AI agent uses these same patterns; only the tools, the prompts, and the scale differ. Once you see them, you can't unsee them.
Previous: Lesson 6 — The agent loop: chaining multiple tool calls automatically →