Multichannel AI Agent: Shared Memory Across Messaging Platforms

Dev.to AI · by Elizabeth Fuentes L · April 6, 2026 · 6 min read

Build an AI chatbot that remembers users across WhatsApp and Instagram using Amazon Bedrock AgentCore, unified identity, and DynamoDB message buffering

You send a video on WhatsApp. You switch to Instagram. You ask about the video. The chatbot has no idea what you are talking about. Most AI chatbots treat every channel as a separate conversation with no shared context, no shared memory, and no continuity. I built a multichannel AI agent that solves this problem using Amazon Bedrock AgentCore.

One deployment serves both WhatsApp and Instagram with shared memory. The agent remembers your name, your photos, your videos, and your preferences regardless of which channel you write from.

This article assumes familiarity with AWS CDK, AWS Lambda, and WhatsApp/Instagram API concepts. Deployment takes approximately 15 minutes per stack.

What does a multichannel AI agent with shared memory look like?

Here is the agent processing different media types on WhatsApp and responding on Instagram with full context.

How does the AI agent process voice notes on WhatsApp?

The agent transcribes voice messages automatically using Amazon Transcribe and responds based on the spoken content. The transcription is stored in memory so the agent can reference it in future conversations.

How does the AI agent analyze videos on WhatsApp?

Send a video and the agent uploads it to TwelveLabs for visual and audio analysis. It describes the content in detail and stores a reference ID so you can ask follow-up questions about the same video later.

How does the AI agent analyze images on WhatsApp?

Send a photo and the agent describes the visual content, answers questions about it, and stores the description in long-term memory. You can ask about the same image days later and the agent recalls the details.

How does cross-channel memory work between WhatsApp and Instagram?

Switch to Instagram. The agent recognizes you by name, knows your preferences, and remembers what you shared on WhatsApp. This works because both channels share the same actor_id in AgentCore Memory.

How does the architecture work?

The project uses three independent AWS CDK stacks that share configuration through AWS Systems Manager Parameter Store:

| Stack | Purpose | Integration path |
| --- | --- | --- |
| Stack 00 | AI agent with persistent memory | Amazon Bedrock AgentCore Runtime + Memory |
| Stack 01 | WhatsApp only | AWS End User Messaging Social (SNS-based) |
| Stack 02 | WhatsApp + Instagram | Amazon API Gateway webhook, single endpoint for both platforms |

The agent uses AgentCore Memory with two layers of persistence:

Short-term memory: Conversation turns within a session. Expires after a configurable TTL (Time To Live).
Long-term memory: Extracted facts, preferences, and summaries. Persists indefinitely across all sessions and channels. The extraction happens asynchronously in the background.

How does unified identity work across WhatsApp and Instagram?

When you write from WhatsApp, the system creates a deterministic user ID based on your phone number (wa-user-{phone}). When you link your Instagram account, both channels resolve to the same ID. The actor_id sent to AgentCore Memory is identical regardless of channel.

The linking happens through conversation. The agent asks new users if they also write from another channel. If you share your Instagram username or WhatsApp number, a link_account tool merges both identities in a unified users DynamoDB table.

| Channel | User ID format | Lookup method |
| --- | --- | --- |
| WhatsApp first | wa-user-{phone} | GSI on wa_phone |
| Instagram first | ig-user-{sender_id} | GSI on ig_id, fallback scan on ig_username |
| Linked | Whichever was created first | Both GSIs resolve to the same record |
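The resolution and linking rules above can be sketched with an in-memory stand-in for the users table. This is a minimal illustration, not the project's code: the `User` and `UserDirectory` names, the `created_seq` field, and the `link_account` signature are all assumptions, the two dictionaries only play the role of the `wa_phone` and `ig_id` GSIs, and the `ig_username` fallback scan is left out.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class User:
    user_id: str                    # value that would be sent as actor_id
    created_seq: int                # hypothetical creation order, used when linking
    wa_phone: Optional[str] = None
    ig_id: Optional[str] = None


class UserDirectory:
    """In-memory stand-in for the unified users table (hypothetical)."""

    def __init__(self) -> None:
        self._seq = itertools.count()
        self._by_phone: Dict[str, User] = {}   # plays the role of the GSI on wa_phone
        self._by_ig: Dict[str, User] = {}      # plays the role of the GSI on ig_id

    def resolve_whatsapp(self, phone: str) -> str:
        """Return the user ID for a phone number, creating wa-user-{phone} if new."""
        user = self._by_phone.get(phone)
        if user is None:
            user = User(f"wa-user-{phone}", next(self._seq), wa_phone=phone)
            self._by_phone[phone] = user
        return user.user_id

    def resolve_instagram(self, sender_id: str) -> str:
        """Return the user ID for an Instagram sender, creating ig-user-{id} if new."""
        user = self._by_ig.get(sender_id)
        if user is None:
            user = User(f"ig-user-{sender_id}", next(self._seq), ig_id=sender_id)
            self._by_ig[sender_id] = user
        return user.user_id

    def link_account(self, phone: str, sender_id: str) -> str:
        """Point both lookups at whichever record was created first."""
        candidates = [
            u for u in (self._by_phone.get(phone), self._by_ig.get(sender_id))
            if u is not None
        ]
        if not candidates:
            raise KeyError("no existing record for either identity")
        keeper = min(candidates, key=lambda u: u.created_seq)
        keeper.wa_phone = phone
        keeper.ig_id = sender_id
        self._by_phone[phone] = keeper
        self._by_ig[sender_id] = keeper
        return keeper.user_id
```

After `link_account` runs, a lookup from either channel returns the same ID, which is the property that lets both channels share one actor_id.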

How does message buffering reduce AI invocation costs?

WhatsApp users tend to send 3-5 rapid messages instead of one long text. Without buffering, each message triggers a separate AI invocation, multiplying cost and token usage.

A DynamoDB Streams tumbling window accumulates messages from the same user for 10 seconds, then sends them as a single concatenated prompt to the agent.

User sends 3 messages in 2 seconds:
  "hello" -> DDB INSERT (t=0s)
  "I have a question" -> DDB INSERT (t=1s)
  "about my video" -> DDB INSERT (t=2s)

Tumbling window fires at t=10s:
  -> Processor receives all 3 records in one batch
  -> Aggregates: "hello\nI have a question\nabout my video"
  -> Single AgentCore invocation

This pattern is based on Enrique Rodriguez's sample-whatsapp-end-user-messaging-connect-chat, which reported a 4:1 aggregation ratio in real-world WhatsApp usage.
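The aggregation step described above can be sketched as a pure function over a batch of DynamoDB Streams records. This is a minimal sketch under assumptions: the attribute names `user_id`, `ts`, and `text` are hypothetical, and the call that would send each prompt to the agent is omitted. The record shape (`eventName`, `dynamodb.NewImage`, typed values such as `{"S": ...}`) follows the standard DynamoDB Streams event format.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def aggregate_records(records: List[Dict[str, Any]]) -> Dict[str, str]:
    """Group INSERT records by user and join their texts in timestamp order."""
    grouped: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for record in records:
        if record.get("eventName") != "INSERT":
            continue  # ignore MODIFY and REMOVE events
        image = record["dynamodb"]["NewImage"]
        user_id = image["user_id"]["S"]      # hypothetical attribute name
        timestamp = float(image["ts"]["N"])  # hypothetical attribute name
        text = image["text"]["S"]            # hypothetical attribute name
        grouped[user_id].append((timestamp, text))
    # One concatenated prompt per user, oldest message first.
    return {
        user_id: "\n".join(text for _, text in sorted(messages))
        for user_id, messages in grouped.items()
    }
```

Each value in the returned dictionary is the single prompt that would replace several separate invocations for that user.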

What media types does the AI agent support?

| Media | Processing method | Memory storage |
| --- | --- | --- |
| Text | Direct prompt to the agent | Stored as conversation event |
| Image | Anthropic Claude vision describes the content | Text description stored in long-term memory |
| Audio and voice notes | Amazon Transcribe converts speech to text | Transcription stored as text prompt |
| Video | TwelveLabs Pegasus analyzes visual and audio content | Description and reference ID stored in long-term memory |
| Documents (PDF, DOCX, XLSX) | Claude reads inline and summarizes | Summary stored in long-term memory |

All multimedia is converted to text understanding before entering memory. This is how the agent recalls what was in a photo or video days later, even across channels.
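One way to picture the routing in the table above is a small dispatcher keyed on MIME type. This is an illustrative sketch only: the function name and the returned labels are hypothetical, and how the project actually selects a processor is not shown in this article.

```python
# Standard MIME types for the document formats listed in the table.
DOCUMENT_TYPES = {
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",  # DOCX
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",        # XLSX
}


def processing_method(mime_type: str) -> str:
    """Map a MIME type to a hypothetical label for the processing step."""
    base = mime_type.split(";")[0].strip().lower()  # drop parameters such as codecs
    if base.startswith("image/"):
        return "claude_vision"
    if base.startswith("audio/"):
        return "amazon_transcribe"
    if base.startswith("video/"):
        return "twelvelabs_pegasus"
    if base in DOCUMENT_TYPES:
        return "claude_document"
    if base.startswith("text/"):
        return "direct_prompt"
    raise ValueError(f"unsupported media type: {mime_type}")
```

Whatever the branch, the result that reaches memory is text, which is what makes later recall independent of the original media format.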

Frequently asked questions

Can the same agent serve WhatsApp and Instagram at the same time?
Yes. Stack 02 uses a single API Gateway webhook that receives both WhatsApp and Instagram messages. The receiver Lambda detects the channel from the payload and normalizes both into a common format.

Does the agent remember conversations when switching channels?
Yes. A unified users table maps WhatsApp phone numbers and Instagram IDs to a single user. When both accounts are linked, the agent uses the same actor_id in AgentCore Memory. Long-term facts and preferences persist across both channels.

What happens if I only want WhatsApp without Instagram?
Deploy Stack 01 for WhatsApp via AWS End User Messaging, or deploy Stack 02 and configure only the WhatsApp secret. The agent works without Instagram when no Instagram credentials are configured.

How can I add more channels like Telegram or a web chat?
The AgentCore Runtime and Memory layer is channel-agnostic. To add a new channel, create a receiver that normalizes messages into the same DynamoDB format and add a reply dispatch function. The agent and memory work without changes.
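The channel detection and normalization mentioned in the answers above might look like the sketch below. It is a hedged illustration, not the project's receiver: the output keys (`channel`, `sender`, `text`) are hypothetical, and the payload fields reflect my understanding of Meta's webhook shapes (`object`, `entry[].changes[].value.messages[]` for WhatsApp, `entry[].messaging[]` for Instagram), which should be verified against the current platform documentation.

```python
from typing import Any, Dict, List


def normalize(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert a webhook payload into a list of channel-neutral text messages."""
    normalized: List[Dict[str, str]] = []
    source = payload.get("object")
    for entry in payload.get("entry", []):
        if source == "whatsapp_business_account":
            for change in entry.get("changes", []):
                for msg in change.get("value", {}).get("messages", []):
                    normalized.append({
                        "channel": "whatsapp",
                        "sender": msg["from"],
                        "text": msg.get("text", {}).get("body", ""),
                    })
        elif source == "instagram":
            for event in entry.get("messaging", []):
                normalized.append({
                    "channel": "instagram",
                    "sender": event["sender"]["id"],
                    "text": event.get("message", {}).get("text", ""),
                })
    return normalized
```

Adding a channel such as Telegram would then mean adding one more branch that emits the same three keys, leaving everything downstream unchanged.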

Get started

The full project with deployment instructions, Instagram setup guide, and architecture documentation:

github.com/elizabethfuentes12/whatsapp-ai-agent-sample-for-aws-agentcore

This is a demo project for learning and experimentation. If you plan to use these patterns in production, add proper security hardening, error handling, and monitoring.

Built with Amazon Bedrock AgentCore, AWS CDK, and Strands Agents. Similar patterns can be applied using LangGraph, AutoGen, or the Amazon Bedrock Agents SDK.

Thank you!

Original source

Dev.to AI

https://dev.to/aws/multichannel-ai-agent-shared-memory-across-messaging-platforms-56j4
