How I Create Memory for My Agents on Claude Code

Introduction

AI agents forget everything. Every new session starts from zero — no context about your project, no memory of architectural decisions, no knowledge of your coding standards. You end up repeating yourself constantly.

I run 14 specialised agents across multiple AWS projects — an HLD Architect, a DevOps Engineer, an SDET, a Defect Manager, a Technical Content Engineer, and more. Each one needs to understand the codebase, follow specific rules, and build on work from previous sessions.

Repeating context every session is not an option. So I built a multi-layered memory architecture in Claude Code that gives my agents persistent knowledge, specialised expertise, and consistent behaviour across every conversation.

Here is exactly how I do it.

The Architecture: Six Layers of Memory

My agent memory system has six layers, each solving a different problem:

┌──────────────────────────────────────────────┐
│  Layer 6: Permissions (settings.local.json)  │  What the agent CAN do
├──────────────────────────────────────────────┤
│  Layer 5: Plans (~/.claude/plans/*.md)        │  What the agent IS doing
├──────────────────────────────────────────────┤
│  Layer 4: Auto Memory (memory/MEMORY.md)     │  What the agent HAS learned
├──────────────────────────────────────────────┤
│  Layer 3: Skills (*.skill.md)                │  HOW to do specific things
├──────────────────────────────────────────────┤
│  Layer 2: Agent Personas (*_Agent.md)        │  WHO the agent is
├──────────────────────────────────────────────┤
│  Layer 1: CLAUDE.md (project instructions)   │  The rules everyone follows
└──────────────────────────────────────────────┘

Five of the six layers are plain markdown files, and the sixth (permissions) is a small JSON file. No databases, no APIs, no infrastructure: just files on disk that Claude Code loads automatically or that I ask it to read.

Layer 1: CLAUDE.md — The Constitution

Every project has a CLAUDE.md file at its root. Claude Code reads this file automatically at the start of every session. It is the single most important file in my entire setup.

My root CLAUDE.md sits at the workspace level and defines global rules that every agent must follow — what I call the TBT Law (Think Before Typing):

## TBT Law (Inviolable)

1. Be patient — 80% planning, 20% implementation
2. Do not be overeager — never try to impress by doing unrequested work
3. Always seek approval before implementing any plan
4. Never make changes without a plan — plan first, always
5. Do not rush the user — be patient, wait for direction
6. Do not make decisions or assumptions on the user's behalf
7. If unsure, ask — never guess or assume
8. If the plan isn't working, STOP — no workarounds
9. Rushing and over-eager changes will break code or design
10. If rules are violated, admit openly — do not hide mistakes

These ten rules prevent the most common failure mode with AI agents: doing too much, too fast, without thinking. Every agent, regardless of persona, follows these rules.

Below the TBT Law, the root CLAUDE.md defines:

- Mandatory SDET Verification: every plan must be tested after execution
- Defect Management: every bug gets logged, reproduced, fixed, and verified
- Deployment-First Verification: no fix is considered testable until deployed
- Repository Isolation: every service gets its own repo
- AWS Resource Naming Conventions: DynamoDB tables use plain names, S3 buckets include environment suffixes

Project-Specific CLAUDE.md Files

Each project directory has its own CLAUDE.md that inherits from the root and adds project-specific context:

# my-saas-landing - Project Instructions

## Project Overview
**Repository**: my-saas-landing
**Purpose**: Marketing landing page - Single-page scroll site
**Stack**: React 18 + TypeScript + Vite

## Cross-App Navigation
| Action                  | Target URL                    |
|-------------------------|-------------------------------|
| "Start Free Trial"      | /app/onboarding               |
| "Buy" pricing button    | /checkout?planId={id}         |

## S3 Deployment
Landing page files deploy to the root of my-web-public S3 bucket...

This means the agent immediately knows what the project is, what stack it uses, how it deploys, and how it connects to other services — before I type a single word.
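
To make the root-then-project layering concrete, here is a minimal sketch of how the instruction files for a project could be collected in order. It is an illustration only, not Claude Code's internal loader. The function name `collect_instruction_files` and the directory layout are assumptions made for the example.

```python
from pathlib import Path


def collect_instruction_files(project_dir: Path, workspace_root: Path) -> list[Path]:
    """Return CLAUDE.md files from the workspace root down to project_dir.

    Illustration of the layering described above; the real loading is done
    by Claude Code itself.
    """
    project_dir = project_dir.resolve()
    workspace_root = workspace_root.resolve()
    # The project directory followed by each of its ancestors.
    chain = [project_dir, *project_dir.parents]
    if workspace_root not in chain:
        raise ValueError("project_dir must live inside workspace_root")
    # Keep only directories up to the workspace root, then reverse so the
    # global rules come first and the project-specific rules come last.
    dirs = chain[: chain.index(workspace_root) + 1]
    return [d / "CLAUDE.md" for d in reversed(dirs) if (d / "CLAUDE.md").is_file()]
```

Ordering the files from general to specific mirrors the idea that the project file adds context on top of the root rules rather than replacing them.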

Layer 2: Agent Personas — Specialised Identities

I have 14 agent persona files, each defined as a markdown document. When I need a specific type of expertise, I load the corresponding persona.

Each persona file follows a consistent structure:

# DevOps Engineer Agent

## Identity
You are a Senior DevOps Engineer specialising in AWS infrastructure...

## Core Competencies
- CI/CD pipeline design (GitHub Actions)
- Infrastructure as Code (Terraform)
- Container orchestration (ECS, ECR)
- CloudFront distribution management

## Workflow
1. Assess current infrastructure state
2. Propose changes with risk assessment
3. Implement with rollback plan
4. Verify deployment
5. Document changes

## Constraints
- Never modify production without approval
- Always use Terraform for infrastructure changes
- Follow the AWS Well-Architected Framework

The key insight is that personas are not prompts — they are persistent identity files that the agent loads and embodies for the entire session. The DevOps Engineer thinks differently from the SDET, who thinks differently from the HLD Architect. They have different priorities, different vocabularies, and different workflows.

My current roster:

| Persona                      | Purpose                                      |
|------------------------------|----------------------------------------------|
| HLD Architect                | High-level design documents                  |
| LLD Architect                | Low-level design documents                   |
| DevOps Engineer              | CI/CD, infrastructure, deployments           |
| SDET                         | Automated testing, defect tracking           |
| Defect Manager               | Bug lifecycle with issue tracker integration |
| GenAI Engineer               | Bedrock, LLMs, RAG solutions                 |
| Cloud Security Specialist    | IAM, GuardDuty, compliance                   |
| Technical Content Engineer   | Blog posts, whitepapers, tutorials           |
| Project Manager              | Task orchestration, TBT workflow             |
| Peer Review Architect        | Design review, anti-pattern detection        |
| Technical Business Developer | Market analysis, pricing models              |
| Python AWS Developer         | Lambda, DynamoDB, Step Functions             |
| Java AWS Developer           | Spring Boot, ECS services                    |
| Global Template Manager      | Template lifecycle management                |

When I say "load the DevOps Engineer persona", the agent reads the file and adopts that identity — including its specific workflow, constraints, and communication style.
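
Because every persona follows the same structure, a small check can catch a file that drifts from it. The sketch below assumes the four section headings shown in the DevOps Engineer example are the required ones. The helper name `missing_sections` is hypothetical.

```python
import re

# Section headings taken from the persona example above.
REQUIRED_SECTIONS = ("Identity", "Core Competencies", "Workflow", "Constraints")

# Matches level-two markdown headings such as "## Identity".
HEADING = re.compile(r"^##[ \t]+(.+)$", re.MULTILINE)


def missing_sections(persona_markdown: str) -> list[str]:
    """Return the required sections that a persona file does not contain."""
    found = {match.group(1).strip() for match in HEADING.finditer(persona_markdown)}
    return [section for section in REQUIRED_SECTIONS if section not in found]
```

Running this over every persona file before committing is one way to keep the roster consistent as it grows.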

Layer 3: Skills — Reusable Knowledge Modules

Skills are the most underrated layer. They are standalone knowledge files (.skill.md) that any persona can reference. Think of them as shared libraries for agent knowledge.

Examples from my setup:

- DynamoDB_Single_Table.skill.md: Single-table design patterns, GSI strategies, access patterns
- HATEOAS_Relational_Design.skill.md: API design with hypermedia links
- Development_Best_Practices.skill.md: SOLID, TDD, BDD, DDD principles
- Monolith_Anti_Pattern_Validation.skill.md: Six anti-patterns (AP-1 through AP-6) to detect
- Step_Functions_Decision_Logic.skill.md: State machine patterns
- API_Proxy_Testing.skill.md: End-to-end testing patterns

A skill file looks like this:

# DynamoDB Single Table Design

## When to Apply
Apply when a service has 3+ entity types with relational access patterns.

## Partition Key Strategy
- Use composite keys: {ENTITY_TYPE}#{ENTITY_ID}
- GSI1PK for inverted lookups
- GSI2PK for cross-entity queries

## Access Patterns
| Pattern           | PK         | SK       | Index |
|-------------------|------------|----------|-------|
| Get user by ID    | USER#123   | METADATA | Table |
| Get user's sites  | USER#123   | SITE#    | Table |
| Get site by domain| DOMAIN#... | METADATA | GSI1  |
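
The key convention in this sample skill can be sketched in a few lines of Python. This is a hedged illustration of the `{ENTITY_TYPE}#{ENTITY_ID}` pattern and the "Get user's sites" row only. The function names and the returned dictionary shape are assumptions, and no AWS SDK call is made.

```python
def entity_key(entity_type: str, entity_id: str) -> str:
    """Build a composite key such as USER#123."""
    if "#" in entity_type or "#" in entity_id:
        raise ValueError("'#' is reserved as the key separator")
    return f"{entity_type.upper()}#{entity_id}"


def user_sites_condition(user_id: str) -> dict[str, str]:
    """Describe the "Get user's sites" access pattern.

    The partition key pins the user, and the sort key prefix selects
    only the SITE# items stored under that user.
    """
    return {"PK": entity_key("USER", user_id), "SK_begins_with": "SITE#"}
```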

The power of skills is composition. When the LLD Architect is designing a new service, it can reference the DynamoDB skill, the HATEOAS skill, and the Development Best Practices skill simultaneously. When the SDET is writing tests, it pulls from the API Proxy Testing skill. The knowledge is defined once and reused across every persona.

Layer 4: Auto Memory — Learning Across Sessions

Claude Code has a built-in auto memory feature. It stores persistent notes in a per-project memory/ directory under ~/.claude/projects/:

~/.claude/projects/{project-path}/memory/
├── MEMORY.md          # Always loaded (first 200 lines)
├── debugging.md       # Detailed debugging notes
├── patterns.md        # Confirmed patterns
└── architecture.md    # Architectural decisions

The MEMORY.md file is special — Claude Code loads the first 200 lines of it into every conversation automatically. This is where the agent stores things it has learned:

## Confirmed Patterns

- CloudFront Function handles SPA routing for all frontends
- S3 bucket serves all frontend apps from different prefixes
- Safe sync requires --exclude flags for other app prefixes
- Browser cache causes stale content after deployments (hard refresh needed)

## AWS SSO
- Profile name: dev
- Token expires frequently — run `aws sso login --profile dev`

I configure the agent to save memories with clear rules:

- Save: Stable patterns confirmed across multiple sessions
- Save: Key architectural decisions and important file paths
- Save: Solutions to recurring problems
- Don't save: Session-specific context or temporary state
- Don't save: Speculative conclusions from reading a single file

The result is that the agent gets smarter over time. The first time it encounters the CloudFront routing behaviour, it investigates. The second time, it already knows.
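
Since only the first 200 lines of MEMORY.md are loaded, it helps to know when the file has outgrown that window. The sketch below splits the file content into the part that would be loaded and the part that would be silently left out. The 200-line figure comes from the description above, and the function name `split_memory` is hypothetical.

```python
MEMORY_LINE_LIMIT = 200  # figure quoted in this article for MEMORY.md


def split_memory(text: str, limit: int = MEMORY_LINE_LIMIT) -> tuple[list[str], list[str]]:
    """Split MEMORY.md content into (loaded lines, overflow lines)."""
    lines = text.splitlines()
    return lines[:limit], lines[limit:]
```

If the overflow list is not empty, that is a signal to move detail into topic files such as debugging.md or patterns.md and keep MEMORY.md as a short index.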

Layer 5: Plans — Persistent Iteration

Plans bridge the gap between sessions. When a task is too large for one conversation, the agent writes a plan file:

~/.claude/plans/
├── zazzy-puzzling-cloud.md       # Frontend extraction plan
├── elegant-crunching-sunbeam.md  # Security hardening rollout
└── zazzy-percolating-lecun.md    # CDN deployment plan

A plan follows a consistent structure:

# Plan: Extract Landing Page into Standalone Repo

## Context
The landing page was prototyped inside the main app...

## Step 1: Scaffold New Repo
Create directory structure at /path/to/new/repo...

## Step 2: Create Fresh Files
- vite.config.ts — base: '/'
- App.tsx — no router, single-page scroll

## Verification
1. npm run dev → all sections render
2. npm run type-check → 0 errors
3. Images and assets load correctly

When a new session starts and the plan file exists, Claude Code includes a reminder:

"A plan file exists from plan mode. If this plan is relevant to the current work and not already complete, continue working on it."

This means the agent picks up exactly where it left off — no re-explanation needed.
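
Because plans use predictable headings, their steps are easy to list programmatically, for example to review what a plan covers before resuming it. This is a minimal sketch that assumes steps are written as "## Step N: Title", as in the example plan above. The name `list_steps` is hypothetical.

```python
import re

# Matches headings such as "## Step 1: Scaffold New Repo".
STEP_HEADING = re.compile(r"^##[ \t]+Step[ \t]+(\d+):[ \t]*(.+)$", re.MULTILINE)


def list_steps(plan_markdown: str) -> list[tuple[int, str]]:
    """Return (step number, title) pairs in the order they appear."""
    return [(int(number), title.strip()) for number, title in STEP_HEADING.findall(plan_markdown)]
```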

Layer 6: Permissions — Trust Boundaries

The final layer controls what the agent can actually do in a given project. Claude Code reads .claude/settings.local.json to define the allowed operations for that project:

{
  "permissions": {
    "allow": [
      "Bash(git add *)",
      "Bash(git commit *)",
      "Bash(aws s3 sync *)",
      "Bash(aws cloudfront create-invalidation *)",
      "Bash(terraform plan *)",
      "Bash(pytest *)",
      "Bash(npm run build *)"
    ]
  }
}

My permissions file is 276 lines long. It covers Git operations, AWS CLI commands (IAM, S3, Lambda, DynamoDB, CloudFront, Route53), Terraform, Python tooling, and testing frameworks.

This is critical for the TBT Law. Commands that match the allow list run without a prompt, so the agent can run tests and deploy to dev. Anything outside the list, such as force-pushing to main or destroying production infrastructure, still needs my explicit approval.
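
To audit a long permissions file, it can help to test sample commands against the allow list offline. The sketch below approximates the matching with Python's `fnmatch` wildcards. This is an assumption made for illustration: Claude Code's real matcher may treat some patterns differently, so use it as a review aid rather than a source of truth. The function names are hypothetical.

```python
import json
from fnmatch import fnmatchcase


def bash_patterns(settings_json: str) -> list[str]:
    """Extract the command patterns from Bash(...) allow rules."""
    allow = json.loads(settings_json).get("permissions", {}).get("allow", [])
    prefix, suffix = "Bash(", ")"
    return [
        rule[len(prefix):-len(suffix)]
        for rule in allow
        if rule.startswith(prefix) and rule.endswith(suffix)
    ]


def is_pre_approved(command: str, patterns: list[str]) -> bool:
    """Approximate check: does the command match any allow pattern?"""
    return any(fnmatchcase(command, pattern) for pattern in patterns)
```

Feeding a list of dangerous commands through `is_pre_approved` and confirming that none of them match is a cheap regression test for the permissions file.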

How It All Comes Together

Here is a real workflow. I need to deploy a bug fix to a frontend app.

1. I open the project. Claude Code loads CLAUDE.md (Layer 1), so the agent knows the stack, deployment targets, and global rules.
2. I say "load the DevOps Engineer." The agent reads the persona file (Layer 2) and now thinks like a DevOps engineer with CI/CD expertise.
3. The agent references existing knowledge. It checks auto memory (Layer 4) for deployment patterns and already knows the S3 bucket name, CloudFront distribution ID, and safe sync exclusions.
4. It creates a plan. The plan (Layer 5) outlines: build, sync to S3, invalidate CloudFront, verify. Per TBT Law, it waits for my approval.
5. I approve. The agent executes within its permissions (Layer 6): it can run npm run build and aws s3 sync, but it asks before running destructive commands.
6. SDET verification triggers. Per the CLAUDE.md mandatory rule, the SDET persona activates to verify the deployment, checking asset integrity, page load, and console errors.
7. The agent saves what it learned. If it encountered a new pattern (like a CloudFront cache behaviour), it writes it to auto memory for next time.
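
The build, sync, and invalidate steps of that plan can be sketched as a dry run that only assembles the commands. Several details here are assumptions and not taken from my actual setup: the `--delete` flag, the example prefixes to exclude, and the placeholder distribution ID. The function name `deploy_commands` is hypothetical.

```python
def deploy_commands(
    build_dir: str,
    bucket: str,
    distribution_id: str,
    other_app_prefixes: list[str],
    profile: str = "dev",
) -> list[list[str]]:
    """Assemble the deploy commands without running them."""
    # Assumption: --delete removes stale files; the --exclude filters keep
    # other apps' prefixes in the shared bucket out of the sync.
    sync = ["aws", "s3", "sync", build_dir, f"s3://{bucket}/", "--delete"]
    for prefix in other_app_prefixes:
        sync += ["--exclude", f"{prefix.strip('/')}/*"]
    sync += ["--profile", profile]
    invalidate = [
        "aws", "cloudfront", "create-invalidation",
        "--distribution-id", distribution_id,
        "--paths", "/*",
        "--profile", profile,
    ]
    return [["npm", "run", "build"], sync, invalidate]
```

Returning argument lists instead of running anything keeps the approval step explicit: the plan can show exactly what will run, and each list can be passed to `subprocess.run(cmd, check=True)` once it is approved.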

Six layers, all plain files, zero infrastructure.

Practical Tips

Start with CLAUDE.md. You do not need all six layers on day one. A well-written CLAUDE.md with your project context and coding standards gives you 80% of the value.

Write personas for recurring roles. If you find yourself repeatedly explaining "you are a DevOps engineer who follows these patterns", extract it into a persona file.

Keep skills atomic. One skill, one topic. A DynamoDB skill should not also contain API design patterns. Composability comes from keeping them separate.

Curate auto memory. Review what the agent saves. Remove outdated entries. Only the first 200 lines of MEMORY.md are loaded automatically, so keep it focused on patterns that are genuinely stable.

Use plans for multi-session work. If a task will take more than one conversation, write a plan. The overhead of creating the plan pays for itself when you do not have to re-explain the context.

Set permissions deliberately. Start restrictive and expand. It is easier to grant new permissions than to recover from an agent that deleted your production database.

Conclusion

AI agents do not need to forget. The tools already exist in Claude Code: CLAUDE.md files, auto memory, plan persistence, and permission controls. What agents need on top of those tools is architecture.

By structuring memory into six layers — rules, personas, skills, learning, plans, and permissions — I have agents that understand my projects, follow my standards, learn from past sessions, and operate within clear boundaries.

Every layer is a plain text file (markdown, plus one JSON file for permissions). Every file is version-controlled. The entire system is transparent, auditable, and easy to iterate on.

The best part? The agents get better every week. Not because the model improved, but because the memory did.

Tebogo Tseka

AWS Practice Manager & Solutions Architect at Big Beard Web Solutions

Tebogo Tseka is an AWS Practice Manager and Solutions Architect at Big Beard Web Solutions, passionate about designing scalable cloud infrastructure, driving AI-powered innovation, and empowering teams to build with confidence on AWS.