AI & ML March 28, 2026

How Does Chain of Thought Work?

A 6-minute read

When you ask an AI to solve a complex problem, it often shows its work step by step. That reasoning does not happen automatically. Chain of thought is a prompting technique that pushes the model to reason explicitly, and it can dramatically change the quality of its answers.

In 2022, researchers at Google Brain asked a simple question: what happens if you ask an AI to show its work? They found that forcing the model to write out its reasoning step by step made it dramatically better at math problems and complex logic. This was surprising because nothing about the model’s underlying architecture changed. The model was still doing the same prediction task. But by asking it to produce a chain of thought, they unlocked capabilities that were already there but hidden.

The short answer

Chain of thought (CoT) is a prompting technique where you explicitly ask the model to show its step-by-step reasoning before giving the final answer. Instead of just answering the question, the model outputs a sequence of intermediate steps: first I need to do this, then this, then this, and finally this gives me the answer. This technique works because it matches how large language models actually work: they predict tokens, and showing reasoning forces them to make better predictions on complex problems by breaking them into smaller, more predictable pieces.

The full picture

Why chain of thought works

Large language models predict the next token in a sequence. They are trained on massive amounts of text and learn statistical patterns: given this text, what comes next? When you ask a model a complex question, it tries to predict what a good answer looks like. For simple questions, that works fine. For complex questions, the model tries to jump directly to the answer, and that is where errors creep in.

Chain of thought changes this by restructuring the prediction task. When you ask the model to think step by step, you are not asking for new capabilities. You are asking it to predict a longer sequence: first a reasoning step, then another reasoning step, then the final answer. Each reasoning step is easier to predict correctly than the full answer. The model gets to use its predictions about one step to inform the next.

This is why chain of thought especially helps with math, logic, and multi-step problems. A problem like 47 times 89 requires several intermediate multiplications and additions. If the model jumps directly to the answer, a single error breaks everything. If it shows its work, it can catch its own mistakes midway and correct them. The reasoning steps act as a checkpoint system.
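The decomposition a reasoning chain might follow for 47 times 89 can be sketched in plain arithmetic, breaking one hard prediction into three easy ones:

```python
# Chain-of-thought-style decomposition of 47 x 89:
# each intermediate step is simpler than jumping straight to the product.
partial_80 = 47 * 80           # step 1: 47 x 80 = 3760
partial_9 = 47 * 9             # step 2: 47 x 9 = 423
total = partial_80 + partial_9 # step 3: 3760 + 423 = 4183

assert total == 47 * 89  # the decomposed path reaches the same answer
```

Each line is a checkpoint: if one intermediate value looks wrong, it can be corrected before it contaminates the final answer.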

The original paper on chain of thought prompting from Google Research in 2022 showed that this technique improved accuracy on math word problems from 17% to 57% on some tasks. That is a massive jump from just changing how the prompt was written.

How to use chain of thought

The simplest way is to add a phrase to your prompt: “think step by step” or “show your reasoning.” This works on many modern models without any special setup. The model learns from its training data that when asked to think step by step, it should output reasoning before answering.
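In code, this is nothing more than string concatenation. A minimal sketch (the helper name and exact trigger wording are illustrative, not from any particular API):

```python
def add_cot_trigger(question: str) -> str:
    """Append a step-by-step instruction to any question prompt."""
    return f"{question}\n\nThink step by step and show your reasoning before the final answer."

prompt = add_cot_trigger("A shop sells pens in packs of 12. How many pens are in 7 packs?")
```

The resulting string is what you send to the model; no model-side configuration is needed.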

For more complex problems, you can provide examples. Instead of just saying “think step by step,” you can show the model a few examples of a question, a reasoning chain, and the correct answer. This is called few-shot chain of thought. The examples teach the model exactly what format you want the reasoning to follow.

For instance, if you want the model to solve word problems by first extracting the key numbers, then deciding what operation to use, then calculating, you can show it two or three problems that follow exactly that pattern. The model learns the structure and applies it to new problems.
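A few-shot prompt following that pattern might be assembled like this (the example problems and the extract-decide-calculate format are illustrative assumptions, not from the original paper):

```python
# Two worked examples that demonstrate the desired reasoning format:
# extract the key numbers, pick the operation, then calculate.
FEW_SHOT_EXAMPLES = """\
Q: Sarah has 12 apples. She gives 5 to Tom and buys 8 more. How many apples does she have?
Reasoning: Key numbers: 12, 5, 8. She gives away 5, so 12 - 5 = 7. She buys 8, so 7 + 8 = 15.
A: 15

Q: A box holds 6 eggs. How many eggs are in 4 boxes?
Reasoning: Key numbers: 6, 4. Each of the 4 boxes holds 6 eggs, so 6 * 4 = 24.
A: 24
"""

def build_few_shot_prompt(question: str) -> str:
    """Prepend the worked examples, then leave the model to fill in the reasoning."""
    return FEW_SHOT_EXAMPLES + f"\nQ: {question}\nReasoning:"

prompt = build_few_shot_prompt("A pack has 3 pens. How many pens are in 5 packs?")
```

Ending the prompt at "Reasoning:" nudges the model to continue in the same format: reasoning first, answer last.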

Chain of thought variants

Researchers have developed several variations that push the technique further.

Self-consistency: Instead of running the model once, run it multiple times with the same chain of thought prompt, introducing some randomness in the generation (a nonzero sampling temperature). You get several different reasoning paths; take the majority answer. This improves accuracy further because correct reasoning paths tend to converge on the same answer, while flawed paths scatter across many different wrong ones.
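The majority-vote step can be sketched in a few lines. Here `sample_answer` is a hypothetical callback standing in for a real model call that samples with randomness and returns only the final answer:

```python
from collections import Counter

def self_consistency(sample_answer, prompt: str, n_samples: int = 5) -> str:
    """Sample several reasoning paths and return the most common final answer.

    `sample_answer(prompt)` is a hypothetical callback: one model call at a
    nonzero temperature, returning the final answer string for that run.
    """
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    # Counter.most_common(1) gives the (answer, count) pair with the highest count.
    return Counter(answers).most_common(1)[0][0]
```

In practice you would wrap whatever API your model exposes; the only requirements are randomness across runs and a way to parse out the final answer for voting.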

Tree of thought: Instead of one reasoning path, the model explores multiple branches. It might consider several approaches to a problem in parallel, evaluate each, and pick the best. This is more expensive (the model generates more tokens) but can solve harder problems that require creative approaches or backtracking.
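The branch-and-evaluate loop resembles a beam search. A minimal sketch under that framing, where `propose` and `score` are hypothetical callbacks (in a real system both would be model calls that generate and rate partial solutions):

```python
def tree_of_thought(propose, score, root, breadth: int = 3, depth: int = 2):
    """Breadth-limited search over reasoning branches (a beam-search sketch).

    `propose(node)` returns candidate next partial solutions;
    `score(node)` rates how promising a partial solution is.
    """
    frontier = [root]
    for _ in range(depth):
        # Expand every surviving branch into its candidate next steps.
        candidates = [child for node in frontier for child in propose(node)]
        if not candidates:
            break
        # Keep only the most promising branches; the rest are pruned.
        frontier = sorted(candidates, key=score, reverse=True)[:breadth]
    return max(frontier, key=score)
```

The extra cost is visible in the structure: every level multiplies the number of model calls by the branching factor, which is why tree of thought is reserved for problems where a single linear chain fails.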

Zero-shot chain of thought: Even without examples, simply adding “let’s think step by step” to a prompt often works. The model has learned from its training data that this phrase signals a multi-step reasoning task, and it adjusts its output accordingly. This is the simplest form of the technique.

Why it matters in real life

For users: If you use AI for any task involving logic, math, or multi-step reasoning, adding chain of thought to your prompts improves results. It costs nothing extra and takes seconds. The difference can be substantial, especially on complex problems.

For developers: Chain of thought is one of the simplest ways to improve AI performance without changing the underlying model. When building AI applications, you can design your prompts to encourage step-by-step reasoning, especially for critical tasks where errors are costly.

For AI researchers: Chain of thought revealed that reasoning capabilities in large models were often dormant, waiting for the right prompting strategy to surface them. This has become a fundamental insight in AI development, driving research into more sophisticated reasoning techniques.

Common misconceptions

“Chain of thought makes the AI actually think.”

No. The model is still predicting tokens. What changes is the structure of what it predicts. It predicts reasoning tokens before answer tokens, and those reasoning tokens make the answer tokens more accurate. There is no consciousness or understanding happening.

“You need special models for chain of thought.”

Most modern assistant models (GPT-4, Claude 3, Gemini) respond to chain of thought prompts out of the box. They were trained on enough examples of step-by-step reasoning that the pattern is embedded in their behavior.

“Chain of thought always improves results.”

For simple questions, it adds overhead without benefit. A question like “what is the capital of France?” does not need step-by-step reasoning. Using chain of thought there just makes the response longer. The technique matters most for complex, multi-step problems.

Key terms

Prompting: Giving the model instructions in natural language to guide its output. Chain of thought is a specific prompting technique.

Few-shot learning: Providing a few examples in the prompt to teach the model a pattern. Few-shot chain of thought means showing examples of reasoning steps.

Zero-shot: Using a prompt without any examples. Zero-shot chain of thought is just adding “think step by step” without examples.

Self-consistency: Running multiple reasoning paths and taking the majority vote. Improves accuracy by filtering out reasoning errors.

Tree of thought: An extension that explores multiple reasoning branches in parallel, not just one linear path.