Claude code - under the hood | Tarek Akik Sohan

Ever asked the question on “How claude code actually get things done? how the magic happens?” Lets dig deeper into the design and try to find the what its doing under the hood.

There are four main parts of the architecture.

Agent: The Main loop

The magic is in its simplicity. A few file operation tools, and a loop that keeps calling tools until there are no more tool requests by the LLM.

This is perfect example of ReAct style agent. Think of it like this, when you give claude code a task, it doesn’t just execute once and stop, Instead it enters a conversation loop where:

Your propmt triggers the first turn - “Build me a new feature with x,y,z requirements”
Claude gets the request and choose to call few tools. like
- glob_tool to find file with patterns (**/*.py)
- view_tool to view content of any file
- create_file to create new files
- etc.
Tools execute and return results - These are tool responses that fed back into the conversation history
LLM see the results and decided what to do next - After receiving the llm started reasoning and decides
- Either more tool call - need more context/action
- Asks user for more clarification or context
- Returns that tasks is complete
Loop continues until this happens
- LLM has no more action to take.

In this pattern agents actually do this few things,

Reasoning(LLM Response) → Acting(call tools) → Observing(tool results)

Claude doesn’t just execute your instruction and stops. It keep going, making decision, observing, and making more actions until it reaches a success/failure point.

Sub Agent: The Magic

The main purpose of this is Context optimisation.

When Claude invokes the Task tool (as defined in the tool description), it spawns a sub-agent with its own LLM and toolset. All the sub-agents tool call, iteration, each token remains isolated within that sub-agent’s context. The main agent only receives the final summary.

So with this approach tasks that takes 30k token to complete, with sub agents we just need 200 token in the main agent context because we are not storing the sub agents tool call results in the main agent context, we are just saving the summary of what the sub agents did internally.

Tools: The Foundation

At the lowest level, we have the tool stack. Each tool is a simple, well-defined function that does one thing well. Grep searches files, Read displays content, Edit modifies text, TodoWrite tracks progress. Nothing fancy, just clean interfaces with clear inputs and outputs. The power comes not from sophisticated individual tools, but from how they work together. When the agent can Grep → Read → Edit → Test in a loop, simple tools become capable of complex work. It’s Unix philosophy applied to AI agents: small, focused tools that work together seamlessly.

Prompt Engineering: The Secret Sauce

Claude Code’s prompts are a showcase in controlling a large language model. The system prompt is over 2,600 tokens, and the tool descriptions add a massive 9,200 tokens of detailed instructions, rules, and examples.

The system prompt very much detailed, elegant and also packed with example cases to minimise model hallucination.

One of core strength of Claude code is the tool catalogue. Lets dive deeper into few tools and their purposes.

Task (Sub Agent)

Purpose: Initiate an isolated AI agent to handle complex searches/research while keeping main context clean.

How it works: Main agent delegates a task to a sub-agent. The sub-agent has its own separate conversation and tools (Bash, Grep, Read, Edit, etc.). It performs the work autonomously through multiple iterations, then returns only a final summary back to the main agent.

Simple function implementation:

def task(description: str, prompt: str):
    sub_agent = spawn_new_agent()
    result = sub_agent.run(prompt)  # Sub-agent does all the work
    return result  # Only the summary

Example scenario: You need to understand a merchant’s Next.js codebase structure before making changes. - Main agent will generate the prompt for sub agent.

Main agent delegates:

{
  "description": "Analyze codebase structure",
  "prompt": "Search the entire Next.js project and find: 1) All page routes in app/ directory, 2) Shared components location, 3) API routes and what they do, 4) Styling approach used. Return a structured summary of the project organization."
}

What happens inside sub-agent (invisible to main agent):

Sub-agent uses Grep to find all pages: 500 tokens
Sub-agent uses Read on 10 files: 5,000 tokens
Sub-agent uses Grep to find components: 500 tokens
Sub-agent uses Read on 5 more files: 2,500 tokens
Sub-agent analyzes and summarizes: 1,000 tokens
Total sub-agent work: 9,500 tokens (all discarded after completion)

Main agent receives:

Project Structure Summary:
- Pages: 8 routes in app/ (home, products, cart, checkout)
- Components: Located in src/components/, using functional React patterns
- API: 4 endpoints in app/api/ (auth, products, orders, payments)
- Styling: TailwindCSS with custom theme in tailwind.config.js

**Only 200 tokens stored in main context**

Claude code can spawn mini Claude code. Simple architectural choice but makes a massive difference.

Grep (Search)

Purpose: Fast text search across your project using ripgrep.

How it works: Grep takes a search query(like a variable name or function) and optionally a glob pattern (like “*.py” to search only Python files), then returns which files contain that query.

Simple function implementation:

def grep(pattern: str, glob: str = None):
    result = [list of files where pattern was found]
    return result

Example use case: You want to find where the variable thisIsAVar is used in your codebase.

Tool call:

{
  "pattern": "thisIsAVar",
  "glob": "**/*.ts"
}

Returns: List of files containing that variable:

src/utils/helper.ts
src/components/Form.ts
src/services/api.ts

Two content modes:

Mode 1 - files_with_matches (default): Returns just the file paths where the pattern was found. Quick way to see which files contain your search term.

Mode 2 - content: Returns the actual lines of code where the pattern appears, along with optional context lines around each match. Use this when you want to see the surrounding code.

Read

Purpose: Read the contents of any file from the filesystem.

How it works: Read takes an absolute file path(maybe we have from the grep or glob tool response) and returns the file contents with line numbers (like cat -n). By default it reads up to 2000 lines from the start. For large files, you can specify an offset and limit to read specific portions.

Simple function implementation:

def read(file_path: str, offset: int = None, limit: int = None):
    content = [file contents with line numbers]
    return content

Example scenario: You found a file using grep and now want to see what’s inside it.

Tool call:

{
  "file_path": "/home/tarek/wonder/src/config.ts"
}

Returns: File contents with line numbers:

  1  const API_KEY = process.env.API_KEY;
  2  const API_URL = "https://api.example.com";
  3
  4  export const config = {
  5    apiKey: API_KEY,
  6    apiUrl: API_URL
  7  };

TodoWrite

Purpose: Create and manage a task list to track progress and show users what the agent working on.

How it works: TodoWrite takes an array of todo items, each with content (description), status (pending/in_progress/completed), priority (high/medium/low), and a unique id. Updates the entire todo list with each call to reflect current progress.

Simple function implementation:

def todo_write(todos: list[dict]):
    # Each todo has: id, content, status, priority
    # Updates the task list
    # Shows progress to user
    return success

Example scenario: User asks you to “Add dark mode toggle and run tests”.

Create todos:

{
  "todos": [
    {
      "id": "1",
      "content": "Create dark mode toggle component",
      "status": "pending",
      "priority": "high"
    },
    {
      "id": "2",
      "content": "Add dark mode state management",
      "status": "pending",
      "priority": "high"
    },
    {
      "id": "3",
      "content": "Run tests and fix any failures",
      "status": "pending",
      "priority": "medium"
    }
  ]
}

Only ONE task can be in_progress at a time

Rest of the tools and their description can be found here.

All tools together

User says: “Password reset isn’t working.” Main agent starts a sub-agent to search the codebase. Sub-agent uses Grep to find “password reset”, reads the files, discovers wrong API endpoint in email.ts, returns summary on his findings so far. Main agent creates todos, uses Read to view the file, uses Edit to fix “/api/reset” to “/api/auth/reset”, runs tests. Done.

Claude code doesn’t have code retrieval approach like cursor. It uses fully agentic approach to read/edit/write files to gather context. This is why its feels so fast and accurate most of the time. I never used cursor after using claude code.

So what’s the secret?? there isn’t one. Claude code is not a magic its just built on fundamental first principle thinking. It also teaches that the model not necessarily need to be big or trillions of parameter, its about how we instruct the model. It also proves that to build reliable system we don’t need to follow the most sophisticated architecture. The most powerful agents aren’t built on complex architectures, they’re built on simple tools, clear instructions, and smart context management.

Really this is it, If you are building agents lets talk