Build This Now
speedy_devvkoen_salo

How Does an LLM Actually Work? (ChatGPT and Claude, Explained Without Math)

A large language model is a next-word prediction machine run billions of times. Here's how ChatGPT and Claude actually work — tokens, training, and attention — explained in plain English, no math.

Published Jun 13, 2026 · 9 min read

A large language model (LLM) like ChatGPT or Claude works by doing one simple thing astonishingly well: predicting the next word. You give it some text, and it guesses the most likely next chunk of text, adds it, then guesses again, over and over, one small chunk at a time, until the answer is complete. Everything else (the essays, the code, the eerily good advice) is that single trick, repeated billions of times by a system trained on a huge share of the public internet.

That sounds too simple to explain something that can write a poem or debug your code. The magic isn't in the trick; it's in how good a machine gets at the trick after reading nearly everything humans have written. Here's the whole thing, no math.

Table of Contents

  1. The One Job: Predict the Next Token
  2. What's a Token?
  3. How It Learned: Training in Three Phases
  4. Why It Seems to "Understand" — Attention
  5. Why It Confidently Makes Things Up
  6. What an LLM Is Not
  7. Frequently Asked Questions

The One Job: Predict the Next Token

Imagine the world's most well-read autocomplete. You type "The capital of France is," and it has seen that phrase followed by "Paris" so many times that "Paris" is overwhelmingly the most likely next word. So it writes "Paris."

An LLM does exactly this, but for any text. Ask it to write an email, and it predicts the most plausible next word given everything so far — the instruction, the tone, the words it has already written. It generates the response one small piece at a time, each piece feeding back in to inform the next. There's no separate "thinking" step hiding behind the words. The generating is the thinking.
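To make the loop concrete, here is a minimal sketch in Python. The probability table is made up for illustration, and it only looks at the last token. A real LLM computes these probabilities with a neural network that reads the whole text so far. The loop itself has the same shape, though: predict, append, repeat.

```python
# Hypothetical next-token table: for each token, the probability of what follows.
# A real model computes these numbers from the entire context, not a lookup.
NEXT_TOKEN_PROBS = {
    "The": {"capital": 0.9, "city": 0.1},
    "capital": {"of": 1.0},
    "of": {"France": 0.8, "Spain": 0.2},
    "France": {"is": 1.0},
    "is": {"Paris": 0.95, "Lyon": 0.05},
    "Paris": {"<end>": 1.0},
}


def generate(prompt_tokens, max_new_tokens=10):
    """Repeatedly pick the most likely next token and feed it back in."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        options = NEXT_TOKEN_PROBS.get(tokens[-1])
        if options is None:
            break  # the toy table has nothing to say about this token
        next_token = max(options, key=options.get)  # always take the top guess
        if next_token == "<end>":
            break  # the model predicts that the answer is finished
        tokens.append(next_token)  # the new token becomes part of the input
    return tokens


print(" ".join(generate(["The", "capital", "of", "France", "is"])))
# prints: The capital of France is Paris
```

Notice that the function never plans the whole sentence. Each pass through the loop only decides one more token.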

The reason it's not just dumb autocomplete is the sheer scale of what it learned from. To predict the next word well across science, code, law, jokes, and recipes, it had to internalize patterns that look a lot like knowledge and reasoning.

What's a Token?

The model doesn't actually work in words — it works in tokens, which are chunks of text. A token might be a whole word ("cat"), part of a word ("ing"), or a piece of punctuation. On average, one token is about ¾ of an English word.
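Here is a rough sketch of what splitting text into tokens can look like. The vocabulary below is invented, and the longest-match rule is a simplification chosen for readability. Production tokenizers learn their vocabulary from data and handle spaces differently, so treat this as an illustration of the idea, not of any specific model's tokenizer.

```python
# Hypothetical vocabulary of known chunks. Real vocabularies hold tens of
# thousands of entries learned from data.
VOCAB = {"cat", "s", "walk", "ing", "the", " ", "."}


def tokenize(text, vocab=VOCAB):
    """Split text greedily, always taking the longest known chunk."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab:
                tokens.append(piece)
                i = end
                break
        else:
            # No known chunk starts here: emit the single character as a token.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("the cats walking."))
# prints: ['the', ' ', 'cat', 's', ' ', 'walk', 'ing', '.']
```

"cats" becomes two tokens and "walking" becomes two tokens, which is why token counts run higher than word counts.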

Two reasons this matters to you:

  • It's why AI is priced per token, and why a long document costs more to process. (Full breakdown in what is a token.)
  • It's why the model has a memory limit. The text it can "see" at once — your prompt plus its answer — is measured in tokens, called the context window. Run past it and the earliest text falls away. (See why AI forgets what you talked about.)
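Both points can be sketched with the ¾ rule of thumb. The function names, the price, and the window size below are all placeholders, not real figures from any provider. Dropping the oldest messages first is one simple policy that matches "the earliest text falls away". Actual products may handle overflow differently.

```python
def estimate_tokens(text):
    """Rough estimate: one token is about 3/4 of an English word."""
    words = len(text.split())
    return round(words / 0.75)


def estimate_cost(text, price_per_million_tokens):
    """Cost grows in step with token count. The price is a made-up input."""
    return estimate_tokens(text) * price_per_million_tokens / 1_000_000


def fit_to_context(messages, context_window):
    """Keep the newest messages that fit and drop the earliest ones."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > context_window:
            break  # everything older than this no longer fits
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order


chat = ["my name is Ada", "I like tea", "what is my name"]
print(estimate_tokens(chat[0]))               # 4 words, so about 5 tokens
print(fit_to_context(chat, context_window=10))
# prints: ['I like tea', 'what is my name']
```

In the example the first message, the one that contained the name, is the one that gets dropped. That is the "forgetting" you see in long chats.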

How It Learned: Training in Three Phases

A model isn't programmed with facts. It's trained — shown enormous amounts of text and adjusted until it gets good at prediction. This happens in three stages:

| Phase | What happens | The result |
| --- | --- | --- |
| 1. Pre-training | Read a huge chunk of the internet, books, and code, predicting the next token trillions of times | Raw knowledge and language ability, but unfocused |
| 2. Fine-tuning | Train on curated examples of good question-and-answer behavior | Learns to be a helpful assistant, not just an autocomplete |
| 3. Alignment (RLHF) | Humans rate answers; the model is nudged toward the preferred ones | Learns to be helpful, honest, and safe |

Pre-training is the giant, expensive marathon (thousands of GPUs for weeks — part of why AI uses so much energy). Fine-tuning and alignment are what turn a raw text-predictor into ChatGPT or Claude — the difference between a library and a librarian.
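If you're curious what "adjusted until it gets good at prediction" means mechanically, here is a toy version of the pre-training step only. It assumes a ten-word corpus and a model that looks at just the previous token, with one adjustable score per pair of tokens. A real LLM has billions of adjustable numbers and reads the whole context, and fine-tuning and RLHF are not shown. The core move is the same: guess, measure how wrong the guess was, nudge the numbers.

```python
import math

corpus = "the cat sat on the mat . the cat ate .".split()
vocab = sorted(set(corpus))
index = {token: i for i, token in enumerate(vocab)}
pairs = list(zip(corpus, corpus[1:]))  # (previous token, actual next token)

# One adjustable score per (previous, next) pair. All start at zero,
# so the untrained model thinks every next token is equally likely.
weights = [[0.0] * len(vocab) for _ in vocab]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_loss():
    """How surprised the model is by the real next tokens (lower is better)."""
    return sum(-math.log(softmax(weights[index[p]])[index[n]]) for p, n in pairs) / len(pairs)


def train(epochs=200, learning_rate=0.5):
    for _ in range(epochs):
        for prev, nxt in pairs:
            probs = softmax(weights[index[prev]])
            for j in range(len(vocab)):
                target = 1.0 if j == index[nxt] else 0.0
                # Raise the score of the token that actually came next,
                # lower the scores of the ones that did not.
                weights[index[prev]][j] -= learning_rate * (probs[j] - target)


def predict(prev):
    row = weights[index[prev]]
    return vocab[row.index(max(row))]


loss_before = average_loss()
train()
loss_after = average_loss()
print(f"loss before: {loss_before:.2f}, after: {loss_after:.2f}")
print(predict("on"))  # prints: the
```

Nobody typed in the rule that "on" is followed by "the". It fell out of the adjustments, which is the sense in which a model is trained rather than programmed.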

Why It Seems to "Understand" — Attention

The breakthrough that made modern LLMs possible is a mechanism called attention. When the model reads your text, attention lets each word "look at" the other words in the text (in chat models like these, the ones that came before it) and decide which ones matter for what comes next.

Take: "The trophy didn't fit in the suitcase because it was too big." What does "it" refer to? You instantly know it's the trophy. Attention is how the model figures that out — it weighs the connection between "it" and every earlier word, and "trophy" wins. Do that across thousands of words and you get something that tracks context, references, and intent well enough to feel like understanding.
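The weighing step can be sketched in a few lines. The vectors below are hand-picked so that "trophy" wins. A real model learns its vectors during training, and they have hundreds or thousands of numbers each, not two. This sketch also stops at the weights. In a full attention layer those weights are then used to blend information from the earlier words.

```python
import math

# Hypothetical vectors for some of the words that come before "it".
# These numbers are invented for the example, not taken from any model.
earlier_words = {
    "trophy": [2.0, 0.5],
    "fit": [0.1, 0.2],
    "suitcase": [1.0, 0.3],
    "because": [0.0, 0.1],
}
query_for_it = [1.5, 0.5]  # what the word "it" is "looking for"


def attention_weights(query, keys):
    """Score every earlier word against the query, then normalize to sum to 1."""
    scale = math.sqrt(len(query))
    scores = {
        word: sum(q * k for q, k in zip(query, vector)) / scale
        for word, vector in keys.items()
    }
    top = max(scores.values())
    exps = {word: math.exp(score - top) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


weights = attention_weights(query_for_it, earlier_words)
for word, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{word:10s} {weight:.2f}")
```

With these made-up numbers, "trophy" gets the largest share of the attention and "suitcase" comes second, which mirrors the way a reader resolves "it".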

It isn't understanding the way you do — there's no inner life, no beliefs. But "a machine that has learned which words relate to which, across nearly everything ever written" is a genuinely powerful thing, and it explains most of what these models can do.

Why It Confidently Makes Things Up

Here's the catch that follows directly from the one job. The model is optimizing for plausible, not true. When it doesn't know something, it doesn't have a built-in "I'm not sure" signal — it just predicts the most likely-sounding next words, which can be a confidently-worded wrong answer. That's a hallucination, and it's not a bug bolted on; it's the flip side of being a fluent prediction machine.
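A small sketch shows why the output reads the same either way. The scores are invented: one set for a question the model has seen answered many times, one for a question where it is effectively guessing. The function name and the numbers are assumptions made for the example.

```python
import math


def most_likely(scores):
    """Return the top option and its probability. It always returns something."""
    top = max(scores.values())
    exps = {option: math.exp(score - top) for option, score in scores.items()}
    total = sum(exps.values())
    best = max(exps, key=exps.get)
    return best, exps[best] / total


# Hypothetical scores for a well-known fact: one option towers over the rest.
well_known = {"Paris": 9.0, "Lyon": 2.0, "Nice": 1.0}

# Hypothetical scores for an obscure fact: the options are nearly tied.
obscure = {"1987": 1.1, "1989": 1.0, "1992": 0.9}

print(most_likely(well_known))
print(most_likely(obscure))
```

In the first case the top option holds over 99% of the probability. In the second it holds well under half, yet the function still hands back a single clean answer. Nothing in the text that comes out marks the difference between the two.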

We dig into the why in why ChatGPT makes stuff up and why AI sounds confident when it's wrong. The practical takeaway: trust it like a brilliant, fast, slightly overconfident intern — verify anything that matters.

What an LLM Is Not

A few myths worth killing:

  • It's not a database. It didn't store the internet; it learned patterns from it. It can't reliably "look up" an exact fact unless it's given tools to do so.
  • It's not conscious or thinking between messages. It's stateless — it does nothing until you send a prompt, then predicts, then stops.
  • It's not deterministic by default. Ask the same thing twice and you can get different wording, because it samples from the likely options rather than always picking the single top one.
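The last point, sampling, is easy to show. The sketch below uses a setting commonly called temperature, which is an assumption added here because the text above does not name it. The scores are invented. At temperature 0 the function always takes the top option. Above 0 it rolls weighted dice.

```python
import math
import random


def sample_next_token(scores, temperature=1.0, rng=random):
    """Pick a next token. Higher temperature means more variety."""
    if temperature == 0:
        return max(scores, key=scores.get)  # always the single top option
    # Dividing by the temperature flattens or sharpens the differences.
    scaled = {token: score / temperature for token, score in scores.items()}
    top = max(scaled.values())
    weights = [math.exp(value - top) for value in scaled.values()]
    # Draw one token at random, in proportion to its weight.
    return rng.choices(list(scaled), weights=weights, k=1)[0]


# Hypothetical scores for the first word of a greeting.
scores = {"Hello": 2.0, "Hi": 1.5, "Hey": 1.0}

print([sample_next_token(scores, temperature=0) for _ in range(5)])
# prints: ['Hello', 'Hello', 'Hello', 'Hello', 'Hello']

print([sample_next_token(scores, temperature=1.0) for _ in range(5)])
# varies from run to run
```

This is why asking the same question twice can give two differently worded answers, even though nothing about the model changed in between.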

Once the "next-token predictor trained on everything" picture clicks, almost every quirk of AI — the brilliance, the confidence, the hallucinations — starts to make sense. From here, the natural next steps are how AI image generation works (a different kind of model entirely) and how AI agents work (what happens when you give an LLM tools and a goal).

Frequently Asked Questions

How does an LLM actually work in simple terms?

An LLM predicts the next chunk of text (a "token") based on everything written so far, then repeats that prediction over and over to build a full response. It learned to do this by training on enormous amounts of text, which taught it the patterns of language, facts, and reasoning well enough to produce useful answers.

Is ChatGPT just predicting the next word?

Yes, fundamentally — but that undersells it. Predicting the next word well across every topic humans write about requires internalizing grammar, facts, and reasoning patterns. The simplicity of the mechanism plus the scale of the training is exactly what makes it powerful.

Does an LLM understand what it's saying?

Not the way humans do. It has no beliefs or inner experience. What looks like understanding is a learned model of how words and concepts relate, powered by a mechanism called attention. It's good enough to be genuinely useful, but it's pattern prediction, not comprehension.

Why do LLMs make mistakes or "hallucinate"?

Because they optimize for plausible-sounding text, not verified truth. When the model doesn't know something, it still predicts the most likely-sounding answer, which can be confidently wrong. Hallucination is a side effect of how the technology works, not a fixable glitch — so verify anything important.

What's the difference between an LLM and an AI agent?

An LLM is the text-prediction engine. An AI agent wraps that engine in a loop and gives it tools (search, code execution, APIs) plus a goal, so it can take actions and work toward an outcome instead of just answering once. See how AI agents work.
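As a rough sketch, the loop looks like this. The model here is a scripted stand-in called fake_llm, and the CALL / TOOL_RESULT / FINAL message format is invented for the example. Real agent frameworks use their own formats and a real model call in its place.

```python
def fake_llm(history):
    """Scripted stand-in for a model call. A real agent would call an LLM here."""
    if not any(step.startswith("TOOL_RESULT") for step in history):
        return "CALL calculator 19*23"  # first, ask for a tool
    result = history[-1].split(" ", 1)[1]
    return f"FINAL The answer is {result}"  # then answer using the tool result


def calculator(expression):
    """A deliberately tiny tool: multiplies two whole numbers like '19*23'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


TOOLS = {"calculator": calculator}


def run_agent(goal, max_steps=5):
    history = [f"GOAL {goal}"]
    for _ in range(max_steps):
        reply = fake_llm(history)  # the model predicts what to do next
        history.append(reply)
        if reply.startswith("FINAL "):
            return reply[len("FINAL "):]  # done: hand back the answer
        _, tool_name, argument = reply.split(" ", 2)
        result = TOOLS[tool_name](argument)  # the surrounding code runs the tool
        history.append(f"TOOL_RESULT {result}")  # and feeds the result back in
    return "Stopped: step limit reached"


print(run_agent("What is 19 times 23?"))
# prints: The answer is 437
```

The model itself never multiplies anything. It only produces text asking for the tool, and the loop around it does the acting.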
