How Recommendation Algorithms Work
A 6-minute read
Recommendation algorithms don't know what you'll like. They find people who behaved like you and surface what those people liked next, a surprisingly powerful trick that shapes what billions of people read, watch, and buy.
Netflix once ran a public competition offering $1 million to anyone who could improve its recommendation algorithm by 10%. In September 2009, a team called BellKor’s Pragmatic Chaos won; their algorithm improved predictions by 10.06%. Netflix never actually deployed the winning solution: by the time the contest ended, most users had moved from DVDs to streaming, and the whole problem had changed. That contest is a useful window into how recommendation systems really work: not magic, not omniscience, but an ongoing race against shifting human behavior.
The short answer
Most recommendation systems are built on a core idea called collaborative filtering: find users who have similar behavior to you, and recommend things those users liked that you haven’t encountered yet. The algorithm doesn’t need to understand what a movie is about or why someone liked it. It just needs to identify patterns in behavior across millions of users.
This is combined with content-based filtering (recommending items similar to ones you already liked) and increasingly, deep learning models that blend many signals at once. The result is a system that feels personalized but is essentially asking: “What have people like you done next?”
The full picture
The problem being solved
Imagine you run a library with ten million books. A visitor arrives. They don’t know what they want. They just want something good. How do you help them?
You could recommend the most popular books (the bestseller table). You could ask them what genres they like and search by category. Or you could notice that their reading history looks a lot like another regular patron, and recommend books that patron loved.
The third approach is collaborative filtering, and it scales enormously. Instead of categorizing content (which requires human judgment or expensive analysis), you let user behavior do the work.
Collaborative filtering: the core engine
The foundational version works like this. You have a matrix of users and items. Each cell contains a rating or an implicit signal of engagement (a watch, a click, a purchase). Most cells are empty: any given user has only interacted with a tiny fraction of all available items.
The algorithm’s job is to fill in the blanks. If user A and user B have both rated movies X, Y, and Z highly, and user A also rated movie W highly, the algorithm predicts user B will probably like W too. This is user-based collaborative filtering.
Alternatively, you can flip the perspective. If movies X and W are frequently liked by the same people, they’re similar in some meaningful way, even if you don’t know why. If you liked X, you might like W. This is item-based collaborative filtering.
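The user-based version can be sketched in a few lines. The toy ratings matrix and similarity threshold below are invented for illustration, not any production system's implementation; flipping the same logic onto the transposed matrix gives the item-based variant.

```python
import numpy as np

# Toy user-item matrix: rows are users, columns are movies X, Y, Z, W.
# 0 means "not rated yet". Hypothetical data for illustration only.
R = np.array([
    [5, 4, 5, 5],   # user A
    [5, 5, 4, 0],   # user B -- hasn't seen W
    [1, 2, 1, 5],   # user C -- different taste
], dtype=float)

def cosine(u, v):
    """Cosine similarity, computed only over items both users have rated."""
    mask = (u > 0) & (v > 0)
    u, v = u[mask], v[mask]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# User-based CF: find B's most similar user, then recommend that
# neighbor's highly rated items that B hasn't interacted with yet.
target = 1
sims = {i: cosine(R[target], R[i]) for i in range(len(R)) if i != target}
neighbor = max(sims, key=sims.get)        # user A wins on overlap with B
unseen = np.where(R[target] == 0)[0]
recs = [j for j in unseen if R[neighbor, j] >= 4]
print(neighbor, recs)                     # A's loved-but-unseen item: movie W
```

Real systems aggregate over many neighbors rather than one, but the shape of the computation is the same.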
Both approaches run into a problem at scale: with millions of users and millions of items, the matrix is enormous and mostly empty. Direct comparison becomes computationally prohibitive.
Matrix factorization: finding hidden structure
The standard solution is matrix factorization. The idea: compress the giant user-item matrix into two smaller matrices. One maps users onto a set of hidden features (latent factors). The other maps items onto the same features. The hidden features aren’t defined by anyone; they emerge from the data.
Multiply the two smaller matrices back together, and you get a filled-in prediction for every user-item pair. The latent factors might correspond to things like “action movies,” “foreign cinema,” or “cerebral thrillers,” but they’re not labeled. They’re just mathematical dimensions that capture variation in preferences.
Training means adjusting the values in both matrices so that the predictions for the known cells match the actual observed ratings as closely as possible. This is a machine learning optimization problem solved with gradient descent (the same process used to train neural networks).
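A minimal version of that optimization, using plain stochastic gradient descent on a toy matrix. The learning rate, regularization strength, and factor count here are arbitrary illustration choices, not tuned values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse toy ratings matrix; 0 marks an unobserved cell.
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 5, 4],
    [0, 1, 5, 5],
], dtype=float)

n_users, n_items, k = R.shape[0], R.shape[1], 2    # k latent factors
P = 0.1 * rng.standard_normal((n_users, k))        # user-factor matrix
Q = 0.1 * rng.standard_normal((n_items, k))        # item-factor matrix

lr, reg = 0.02, 0.01
observed = [(u, i) for u in range(n_users) for i in range(n_items) if R[u, i] > 0]

for epoch in range(1000):
    for u, i in observed:
        err = R[u, i] - P[u] @ Q[i]                # error on a known cell
        P[u] += lr * (err * Q[i] - reg * P[u])     # nudge user factors
        Q[i] += lr * (err * P[u] - reg * Q[i])     # nudge item factors

pred = P @ Q.T    # multiplying back fills in every user-item pair
print(np.round(pred, 1))
```

After training, `pred` contains a value for every cell, including the zeros the model never saw: those are the recommendations.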
Beyond ratings: implicit signals
Most modern recommendation systems don’t rely on star ratings, which are sparse and biased. Instead, they use implicit feedback: what you clicked, how long you watched, whether you finished, whether you came back to it, whether you shared it.
These signals are noisier than explicit ratings (watching something halfway through might mean you loved it or hated it) but far more abundant. A user might rate a handful of items but generate thousands of implicit signals.
The weight given to different signals varies. A full watch-through is a stronger positive signal than a click. Stopping 10% through is a weak negative signal. Rewatching is a very strong positive signal. Systems learn these weightings from large-scale experiments.
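A sketch of how raw events might collapse into a single training signal per item. The weights here are invented for illustration; as noted above, real systems learn them from experiments rather than hand-coding them:

```python
# Hypothetical signal weights -- real systems learn these from A/B tests.
WEIGHTS = {
    "click":        0.5,
    "watch_10pct": -0.25,   # stopped early: weak negative signal
    "watch_full":   2.0,    # strong positive signal
    "rewatch":      4.0,    # very strong positive signal
    "share":        3.0,
}

def engagement_score(events):
    """Collapse an item's raw event stream into one training label."""
    return sum(WEIGHTS.get(event, 0.0) for event in events)

print(engagement_score(["click", "watch_full", "rewatch"]))  # 6.5
print(engagement_score(["click", "watch_10pct"]))            # 0.25
```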
Context: when you ask matters
The same user wants different things in different contexts. You don’t watch the same things at 7am on a weekday and 10pm on a Saturday. You don’t shop for the same things on mobile while commuting versus on desktop at home.
Modern recommendation systems incorporate context signals: time of day, device type, current session behavior, location (if available), and even what you just watched or bought. The recommendation isn’t just “what does this user like” but “what does this user want right now.”
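One way to picture this is a context featurizer that runs at request time and gets fed into the model alongside the user's history. The feature names below are hypothetical, not any platform's actual schema:

```python
from datetime import datetime

def context_features(now: datetime, device: str, session_items: list) -> dict:
    """Illustrative request-time context features for a recommender."""
    return {
        "hour_of_day": now.hour,                 # 7am commute vs 10pm couch
        "is_weekend": now.weekday() >= 5,
        "device_mobile": device == "mobile",
        "last_item": session_items[-1] if session_items else None,
        "session_length": len(session_items),
    }

# Saturday, 10:30pm, on a TV, two episodes into a thriller:
feats = context_features(datetime(2024, 3, 16, 22, 30), "tv",
                         ["thriller_ep1", "thriller_ep2"])
print(feats)
```

The same user with a different context vector gets a different ranking, which is the whole point.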
The cold start problem
A new user has no history. A new item has no interactions. Collaborative filtering fails because there’s no data to collaborate on. This is called the cold start problem.
Systems address it by falling back to simpler strategies: ask new users about preferences in an onboarding flow, recommend popular items, or use demographic data as a proxy. For new items, content metadata (genre, tags, description) can bootstrap recommendations until behavioral data accumulates.
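A fallback chain like that might look as follows. This is a sketch: the five-interaction threshold and the list size of three are made-up numbers, and a real system would blend strategies rather than switch hard between them:

```python
def cold_start_recs(history, declared_genres, popular, catalog, k=3):
    """Fallback chain for users the collaborative model can't serve yet.
    (The 5-interaction threshold and k=3 default are illustrative.)"""
    if len(history) >= 5:
        raise ValueError("enough history -- use the collaborative model")
    if declared_genres:
        # Onboarding answers: fall back to content metadata.
        matches = [item for item in catalog if item["genre"] in declared_genres]
        if matches:
            return matches[:k]
    return popular[:k]    # nothing known at all: the bestseller table

catalog = [{"title": "Dune", "genre": "scifi"},
           {"title": "Heat", "genre": "crime"}]
print(cold_start_recs(history=[], declared_genres={"scifi"},
                      popular=["Hit 1", "Hit 2", "Hit 3"], catalog=catalog))
```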
The feedback loop problem
Recommendation systems create feedback loops. Items that get recommended get more engagement, which generates more data, which makes the algorithm recommend them more. Popular items become more popular. Niche content struggles to get initial exposure.
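You can watch this rich-get-richer dynamic emerge in a toy simulation (a Pólya-urn-style model, not any real platform's logic): every item is equally good, but exposure is proportional to current popularity, and each exposure feeds back as data.

```python
import random

random.seed(7)

n_items, n_steps = 100, 10_000
counts = [1] * n_items    # every item starts with one interaction
for _ in range(n_steps):
    # Exposure is proportional to current popularity...
    item = random.choices(range(n_items), weights=counts)[0]
    counts[item] += 1     # ...and each exposure generates more data

counts.sort(reverse=True)
top10_share = sum(counts[:10]) / sum(counts)
print(f"Top 10% of items captured {top10_share:.0%} of all engagement")
```

With no quality differences at all, the top tenth of items ends up with far more than a tenth of the engagement, purely from the feedback loop.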
More subtly, if the system learns that slightly provocative content generates more engagement (longer watch times, more reactions), it will recommend more of it. This is not a bug in the algorithm: it’s doing exactly what it was optimized to do. It’s a values problem about what to optimize for.
Reinforcement learning from human feedback: the new frontier
The next generation of recommendation systems isn’t just better collaborative filtering. It’s systems that learn from direct feedback in real time, using the same techniques that train large language models.
Reinforcement Learning from Human Feedback (RLHF) is a training approach where a model learns by optimizing for signals of human approval. In recommendation contexts, this means systems can learn from subtle signals beyond simple clicks: how long you hesitate before clicking, whether you scroll past without pausing, whether you share something or just consume it.
TikTok’s algorithm is widely regarded as a step-change in recommendation sophistication. Unlike older systems that needed a significant history to work well, it can personalize to a new user within the first few videos, using a combination of content features (what’s in the video) and behavioral signals. It also draws on a wider set of feedback signals than earlier platforms: completion rate, replays, shares, comments, and profile clicks, weighted differently depending on how much effort each action requires. For a detailed breakdown of how social media algorithms rank and distribute content, see our dedicated article.
This sophistication comes with a darker implication. When algorithms optimize purely for engagement, they can discover that emotional volatility drives more engagement than contentment. A video that makes you slightly angry, slightly envious, or slightly anxious keeps you watching longer than one that simply makes you happy. There’s growing evidence that engagement-maximizing algorithms preferentially surface emotionally arousing content, not out of malice, but because that’s what the optimization target leads to. Designing recommendation systems that optimize for user wellbeing rather than just engagement time is one of the open research problems that will matter enormously over the next decade.
Common misconceptions
Recommendation algorithms know what you want. They don’t. They find people who behaved like you and guess you’ll like what those people liked. The algorithm doesn’t understand your preferences; it spots patterns in aggregate behavior.
More recommendations mean more diversity. The opposite is often true. Algorithms optimize for engagement, so they tend to recommend more of what you’ve already shown you like. Over time, recommendations can narrow rather than broaden your exposure.
You can “train” an algorithm by clicking more. Clicking more just gives the algorithm more data about what grabs attention, not what you actually value. The algorithm optimizes for what keeps you engaged, which isn’t the same as what’s best for you.
Removing recommendations would fix the filter bubble problem. Not quite. The same filtering happens in human networks. Removing algorithmic recommendations doesn’t suddenly expose you to diverse perspectives; it just removes one layer of curation.
Why it matters
Recommendation algorithms now mediate a significant fraction of human attention. The cumulative decisions of these systems shape what music becomes mainstream, which news articles spread, which products dominate markets, and what ideas reach large audiences.
The same mechanism that helps you discover a show you love also creates filter bubbles; Spotify’s recommendation system is a particularly well-documented example of how collaborative filtering shapes discovery at scale. The personalization that makes recommendations feel magical is the same mechanism that can narrow what you’re exposed to.
Understanding how recommendations work is useful partly for using them better: diversifying your inputs, being aware that engagement-optimized systems aren’t optimizing for your long-term wellbeing. But it’s also a lens for understanding the information environment everyone now lives in.