Three engines, three retrieval architectures, three sets of weights. A data-driven look at the selection mechanics — and how to reverse-engineer each one.
TL;DR: THE SHORT VERSION
- There is no single “AI algorithm”. Two architectures split the field: ChatGPT leans parametric (trained knowledge first, then browses Bing); Perplexity and Gemini lean RAG (retrieve live sources, then synthesise). You optimise for memory or for real-time research, not both the same way.
- The variance is enormous: a Superlines analysis found citation volume for the same brand can differ 615× between platforms, and Perplexity references community sources in 90%+ of answers vs Gemini’s 7%.
- The factors that move the needle are now measured. Adding statistics lifts AI visibility ~33% (Aggarwal GEO study); pages with expert quotes average 4.1 citations vs 2.4; and 44.2% of all citations come from the first 30% of a page.
- Classic domain authority (DA/DR) correlates only weakly with AI citation. Evidence density, structure, entity consensus and freshness matter more.
- This article gives you the 4R model (Retrievable, Readable, Reputable, Recent) to diagnose why an engine isn’t recommending you, plus per-engine selection profiles.
Read time: ~22 minutes. Includes a ranking-factor data table, per-engine profiles and a measurement query template.
1. There is no “the algorithm” — there are architectures
The first mistake brands make is treating “AI search” as one system to be cracked. It is not. The hub article for this cluster, getting your products recommended by AI shopping agents, lays out the five-layer AISLE framework for being recommendable in general. This spoke goes one level deeper into the single hardest question: when a shopper asks for “the best wireless earbuds under £150”, what actually decides which three products the engine names — and why does the answer differ so wildly between ChatGPT, Perplexity and Gemini?
The honest starting point is humility. No engine publishes its selection logic, and anyone claiming a definitive ranking formula is selling something. But we are no longer guessing in the dark. Through 2025–26 several large studies — SE Ranking’s 129,000-domain analysis, the Aggarwal et al. Generative Engine Optimisation paper, cross-platform work from Superlines and The Digital Bloom, and Ahrefs’ keyword-scale research — have turned anecdote into measured signal. What follows synthesises that evidence into something you can act on.
Why invest the effort to understand the mechanics rather than just “write good content”? Because the stakes are concentrated and the window is open. AI search is roughly 30× more selective than traditional search, so the difference between being named and being ignored is binary in a way ten blue links never were. At the same time, the field is young enough that the brands which understand selection now become the defaults their category inherits — one analysis frames being the cited answer today as setting you up to be the default for the next several years. For an aged, authoritative site entering these queries early, that combination of high selectivity and low current competition is exactly the asymmetry worth exploiting.
WHY SELECTION MATTERS MORE THAN RANK NOW
- 23×: conversion premium of the #1 citation position in an AI answer (The Digital Bloom, 2026). Cited brands also earn ~35% more organic and ~91% more paid CTR.
- ~30× more selective: AI recommendation vs traditional search. SOCi found AI engines recommend a tiny fraction of the businesses Google’s local pack shows.
- 615×: how much citation volume for one brand can vary between platforms (Superlines). Winning one engine tells you almost nothing about the others.
A list gives a shopper ten options to weigh. An AI answer gives three, pre-justified. Being one of the three is the whole game.
2. The dual-path architecture: parametric vs RAG
Every selection difference traces back to one distinction: does the engine answer mostly from what it was trained on, or does it run to the live web first? This is the dual-path model, and it is the single most useful mental model you can hold.
Path one — parametric (memory-first)
ChatGPT (and Claude) lean on parametric knowledge: the vast library baked into the model during training. Even with browsing available, their first instinct is to synthesise what they already “know”. For a product query, that means your brand’s standing in the training data — how often and how favourably you were discussed across the web the model learned from — shapes whether you surface before any live search happens. This is why ChatGPT visibility is slow to earn and slow to lose: it reflects accumulated reputation, not a page you published last week. When ChatGPT does browse, it retrieves through Bing’s index and reranks; GPT-5’s search configuration even exposes explicit “rerank” flags, confirming a distinct ranking pass on retrieved results.
The strategic implication of the parametric path is uncomfortable but clarifying: you cannot “optimise a page” into ChatGPT’s memory overnight. What you can do is increase the volume and quality of the things said about you across the open web — the mentions, round-ups, comparisons and reviews that the next training run will absorb. In other words, the parametric path rewards exactly the earned-media work this site has always championed, just on a slower clock. Brands that invested in genuine authority in 2024–25 are the defaults ChatGPT names today; brands starting now are building the reputation the model will learn next. That lag is precisely why starting early on an aged, authoritative domain is such an advantage.
Path two — RAG (retrieval-first)
Perplexity and Gemini lean on retrieval-augmented generation: they behave like fast librarians, running to the web in real time to find the freshest, most credible sources, then summarising them with inline citations. Here, what is published and indexed right now matters enormously, and freshness is a first-class signal rather than an afterthought. The trade-off you optimise for is therefore explicit: for ChatGPT you are optimising a model’s memory; for Perplexity and Gemini you are optimising its real-time research. Most brands need both, but the tactics diverge.
The shared retrieval pipeline (and where you get dropped)
When any engine does retrieve, it runs a recognisable pipeline, and each stage is a place you can silently fail:
- Retrieve. It searches an underlying index — Bing for ChatGPT, Google for Gemini, Perplexity’s own Sonar index. Pages blocked in robots.txt or not indexed never enter this stage. No retrieval, no recommendation.
- Chunk. Retrieved pages are split into small chunks, typically ~200–500 words. The engine reasons over chunks, not whole pages, so a buried answer in a wall of text may never be the chunk that gets selected.
- Rank / rerank. Chunks are scored on semantic match to the query, source credibility and freshness. Gemini additionally runs query fan-out — breaking one prompt into multiple micro-intents and retrieving for each, so a product must satisfy several sub-questions at once.
- Synthesise + cite. The engine composes an answer and names the few sources it can most defensibly stand behind. This is where evidence density and entity consensus decide who gets named.
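The four stages above can be sketched as a toy pipeline. This is a minimal illustration, not how any engine is actually implemented: plain word overlap stands in for semantic matching, credibility and freshness are ignored, and the function names, the example pages (including the hypothetical "Acme Buds") and the tiny 8-word chunk size are all assumptions made for readability.

```python
import re


def chunk_words(text: str, size: int = 300) -> list[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def score_chunk(chunk: str, query: str) -> float:
    """Toy relevance: fraction of distinct query terms present in the chunk."""
    terms = set(re.findall(r"\w+", query.lower()))
    found = set(re.findall(r"\w+", chunk.lower()))
    return len(terms & found) / len(terms) if terms else 0.0


def top_chunks(pages: dict[str, str], query: str, k: int = 3,
               size: int = 300) -> list[tuple[float, str, int]]:
    """Chunk every retrieved page, score each chunk, return the k best."""
    scored = []
    for url, text in pages.items():
        for index, chunk in enumerate(chunk_words(text, size)):
            scored.append((score_chunk(chunk, query), url, index))
    # Highest score first; ties broken by URL then chunk index.
    scored.sort(key=lambda row: (-row[0], row[1], row[2]))
    return scored[:k]


# Hypothetical pages: one scatters the query terms, one states the answer up front.
pages = {
    "/wall-of-text": "our earbuds story began long ago and many years "
                     "later wireless models arrived",
    "/answer-first": "best wireless earbuds under 150 are the Acme Buds",
}
best = top_chunks(pages, "best wireless earbuds under 150", k=2, size=8)
print(best)
```

Because scoring happens per chunk rather than per page, the page whose answer sits in one self-contained passage comes out on top, while the page that spreads the same terms across its text scores poorly in every chunk.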
What all three share: the universal selection floor
Before the engines diverge, they agree on a floor — a set of conditions that, if unmet, removes you everywhere regardless of architecture. Getting these right is not a competitive edge; it is the price of being considered at all.
- Indexability. If the source index (Bing, Google or Sonar) can’t crawl and store your page, you never enter retrieval. Blocked crawlers, login walls and JavaScript-only rendering are silent disqualifiers.
- Extractable claims. Every engine reasons over chunks. A page whose key claim is never cleanly stated in a self-contained passage gives the model nothing to lift.
- Factual consistency. Contradictions between your page, your feed and third-party sources read as unreliability. Engines down-weight sources they can’t reconcile.
- A coherent entity. The model has to be sure which brand and product you are. Fragmented or inconsistent brand identity across the web blocks matching before any ranking begins.
Only once you clear this floor do the per-engine weights — community sourcing, freshness, schema, Google grounding — start to decide who wins. The 4R model below is built to check the floor first.
3. The 4R model: why an engine isn’t recommending you
Across all the studies, the factors that decide selection collapse into four buckets that map cleanly onto the pipeline above. Use them as a diagnostic: when an engine ignores you, the cause is almost always a failure in one specific R.
THE 4R MODEL OF AI PRODUCT SELECTION
- Retrievable. Are you indexed and crawlable on the engine’s source index (Bing / Google / Sonar)? Fail here and nothing else matters.
- Readable. Is the answer extractable as a clean chunk: answer-first, well-structured, front-loaded? 44.2% of citations come from the first 30% of a page.
- Reputable. Do multiple independent sources corroborate you (entity consensus), and do reviews/ratings clear the bar? This is the dominant lever for Perplexity and Gemini.
- Recent. Is the content fresh? Perplexity’s Sonar treats freshness as a top factor; ChatGPT favours content updated within ~90 days.
Diagnose in order: a Retrievable failure looks identical to a Reputable one from the outside, so check the cheap, foundational R first.
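The "diagnose in order" rule can be written down as a small helper. This is a sketch under stated assumptions: the pass/fail values are whatever your own audit produces, the function name is hypothetical, and a missing check is treated as a failure so that nothing is silently skipped.

```python
from typing import Optional

# The order matters: cheap, foundational checks come first.
ORDER = ("Retrievable", "Readable", "Reputable", "Recent")


def first_failing_r(checks: dict[str, bool]) -> Optional[str]:
    """Return the first R that fails (or was never checked), else None."""
    for name in ORDER:
        if not checks.get(name, False):
            return name
    return None


# Hypothetical audit result for one engine.
audit = {"Retrievable": True, "Readable": True, "Reputable": False, "Recent": False}
print(first_failing_r(audit))
```

Run it per engine rather than once per brand, since the same product can pass all four checks on one engine and fail on another.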
4. The ranking factors, measured
Here is the part most “AI SEO” advice skips: the factors that actually correlate with citation, with numbers attached. These come from the SE Ranking 129K-domain study, the Aggarwal GEO paper and the 2026 cross-platform analyses. Treat the magnitudes as directional, but the ranking of what matters is consistent across studies.
| Selection factor | Measured effect | Maps to |
| Statistics / data density | 19+ data points → 5.4 citations vs 2.8; adding stats lifts AI visibility ~33% | Readable + Reputable |
| Expert quotes / named sources | Pages with expert quotes avg 4.1 citations vs 2.4 without | Reputable |
| Answer-first placement | 44.2% of all LLM citations come from the first 30% of the page | Readable |
| Section length | 120–180-word sections between headings → 4.6 citations vs 2.7 for <50-word sections | Readable |
| Comprehensive depth | Articles 2,900+ words avg 5.1 citations vs 3.2 for <800 words | Readable + Reputable |
| FAQ schema | Can roughly double Perplexity citation frequency | Readable + Retrievable |
| Entity consensus | Mentions across multiple high-authority sites; dominant for Perplexity/Gemini | Reputable |
| Domain authority (DA/DR) | Weak correlation with AI citation rate — not the lever SEOs assume | (over-weighted) |
Read the last row twice. The metric most agencies still sell against — Moz DA, Ahrefs DR — shows only weak correlation with whether an AI cites you. That does not make authority worthless (it underpins entity consensus), but it does mean a high-DR page with thin evidence and buried answers loses to a lower-DR page that is statistic-dense, answer-first and corroborated. The practical reframing: AI engines reward demonstrated expertise and verifiable evidence over accumulated link equity. For the foundations of how that link equity is still built, see our guide to link building strategies — but apply it in service of consensus, not raw DR.
The answer-first structure that wins the chunk
Putting the evidence finding to work is concrete, not abstract. “Add statistics” means: replace “our shoe is very light” with “at 280g it is ~20% lighter than the category average of 350g”; replace “highly rated” with “4.7/5 across 1,284 reviews”; replace “popular” with a dated, sourced sales or usage figure. Each specific, verifiable number is a unit the model can lift and a claim it can defend, which is why pages dense with them out-cite vague ones roughly two-to-one. Named expert quotes work the same way — they give the engine an attributable voice to stand behind. The practical target from the data is unglamorous but clear: aim for well over a dozen specific data points on any page you want cited, and attribute at least one expert perspective.
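A rough way to audit evidence density is to count the numeric figures in a page's copy and to compute comparative claims rather than estimate them. The sketch below is an assumption-laden approximation: the regular expression only recognises simple figures (integers, decimals, thousands separators, ratings like 4.7/5), and the default target of 19 is taken from the "19+ data points" row in the table above, not from any engine's documented threshold.

```python
import re

# Matches figures such as 280, 1,284, 4.7 and 4.7/5 (a rating counts once).
FIGURE = re.compile(r"\d[\d,]*(?:\.\d+)?(?:/\d+)?")


def percent_lighter(weight_g: float, category_avg_g: float) -> float:
    """How much lighter a product is than the category average, in percent."""
    return 100 * (category_avg_g - weight_g) / category_avg_g


def count_data_points(text: str) -> int:
    """Count numeric figures in a block of copy."""
    return len(FIGURE.findall(text))


def meets_target(text: str, target: int = 19) -> bool:
    """True when the copy holds at least `target` numeric figures."""
    return count_data_points(text) >= target


snippet = ("At 280g it is ~20% lighter than the category average of 350g, "
           "rated 4.7/5 across 1,284 reviews.")
print(percent_lighter(280, 350), count_data_points(snippet), meets_target(snippet))
```

A count like this says nothing about whether the figures are sourced or accurate, so treat it as a prompt for editorial review rather than a score.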
Because engines reason over ~200–500-word chunks and pull disproportionately from the top of a page, structure is not cosmetic — it is a ranking input. Lead every section with the direct answer (BLUF: bottom line up front), then elaborate. Keep sections in the 120–180-word sweet spot. Use clear, question-shaped headings that match how shoppers phrase queries. This is the same discipline that wins featured snippets, and it is doing double duty now: a snippet-optimised page is a chunk-optimised page.
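Both structural checks, section length and answer placement, are easy to automate once a page is split into sections. The following is a minimal sketch: it assumes you already have headings mapped to body text, it measures position by characters (the studies cited here do not say how "the first 30%" was measured), and all names are hypothetical.

```python
from typing import Optional


def section_report(sections: dict[str, str], low: int = 120,
                   high: int = 180) -> dict[str, str]:
    """Label each section by word count against the 120-180-word range."""
    report = {}
    for heading, body in sections.items():
        words = len(body.split())
        if words < low:
            verdict = "too short"
        elif words > high:
            verdict = "too long"
        else:
            verdict = "ok"
        report[heading] = f"{words} words: {verdict}"
    return report


def claim_position(page_text: str, claim: str) -> Optional[float]:
    """Fraction of the page (by characters) that precedes the claim."""
    start = page_text.find(claim)
    if start == -1:
        return None  # the claim is not on the page at all
    return start / len(page_text)


def is_front_loaded(page_text: str, claim: str, cutoff: float = 0.30) -> bool:
    """True when the claim starts within the first `cutoff` of the page."""
    position = claim_position(page_text, claim)
    return position is not None and position <= cutoff
```

For example, `is_front_loaded(page, "4.7/5 across 1,284 reviews")` tells you whether that claim sits in the region the 44.2% figure refers to.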
5. Per-engine selection profiles
Same query, three different decision processes. Here is what each engine actually weights, and the move that matters most for each.
| | ChatGPT | Gemini | Perplexity |
| Architecture | Parametric-first; browses via Bing | RAG via Google index + Knowledge Graph; query fan-out | RAG via own Sonar index |
| Dominant signal | Reputation in training data + reviews + recency | Google rank + schema + Merchant Center + Maps grounding | Freshness (Sonar) + citable evidence + consensus |
| Community sourcing | High (Reddit / forums) | Low (~7% of answers) | Very high (90%+ of answers) |
| Links to brands | ~20% of the time | Varies by surface | Transparent, trackable referrals |
| Local selectivity | ~1.2% of locations | ~11% | ~7.4% |
| Avg rating of recommended | ~4.3 stars | ~3.9 stars | ~4.1 stars |
ChatGPT — win the training data, then the reviews
Because ChatGPT answers from memory first, your long-run reputation across the web it learned from is the foundation, and your Bing presence plus third-party reviews carry the live-browse case. It links to brands only around a fifth of the time, so being the named recommendation — not just a linked source — is the goal. It also has a clear recency bias, prioritising content updated within roughly 90 days. Highest-leverage move: earn presence in the editorial round-ups and community threads (Reddit especially) that ChatGPT leans on, and keep your key pages refreshed.
A practical nuance most miss: optimise your Bing presence specifically, not just Google. ChatGPT’s browse path runs on Bing’s index and OpenAI’s own crawler, so a page that ranks on Google but is weak on Bing can be invisible to ChatGPT’s live retrieval. Verify the page is in Bing Webmaster Tools, confirm OAI-SearchBot and GPTBot are allowed, and treat strong Bing rankings as a distinct objective. Pair that with the slower parametric work — consistent, favourable mentions across the open web — and you are feeding both the memory and the browse paths at once.
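The crawler check can be scripted with Python's standard `urllib.robotparser`. This sketch parses robots.txt text you supply and reports whether the two user agents named above may fetch a given URL. The robots.txt content and the `example.com` URL are placeholders, and the check covers robots.txt rules only: it says nothing about whether the page is actually indexed in Bing.

```python
from urllib.robotparser import RobotFileParser

# User agents named in the paragraph above.
AI_CRAWLERS = ("OAI-SearchBot", "GPTBot")


def crawler_access(robots_txt: str, url: str,
                   agents: tuple[str, ...] = AI_CRAWLERS) -> dict[str, bool]:
    """Report, per user agent, whether robots.txt allows fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Placeholder robots.txt that blocks GPTBot but allows everyone else.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
access = crawler_access(ROBOTS, "https://example.com/trailblazer-x2")
print(access)
```

To test a live site, fetch its robots.txt yourself and pass the text in, which keeps the function free of network calls and easy to test.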
Gemini — win Google, then satisfy the fan-out
Gemini is the most continuous with classic SEO: organic Google rank, valid Product/FAQ/HowTo schema, a healthy Merchant Center feed and an accurate Google Business Profile all translate directly into visibility. For local queries Gemini is grounded in Google Maps, which is why its data accuracy hits ~100% versus ~68% for the others and its local recommendation rate (~11% of locations) runs roughly 10× ChatGPT’s (~1.2%). The wrinkle is query fan-out: a single shopper prompt is decomposed into several micro-intents, so your content must answer the cluster of sub-questions (“best for X”, “under £Y”, “vs Z”), not just the headline one. Highest-leverage move: schema completeness plus content that anticipates the fan-out.
To exploit fan-out deliberately, map the decomposition before you write. For a head query like “best running shoe for beginners”, Gemini may silently retrieve for “most cushioned beginner shoe”, “best value beginner shoe”, “beginner shoe for flat feet” and several more. A page that addresses each as its own clearly-headed, self-contained section is eligible to be the cited chunk for many sub-intents at once; a page that answers only the headline question wins one and loses the rest. This is where comprehensive depth (the 2,900-plus-word pages that out-cite short ones) and disciplined structure compound — breadth of intent coverage and chunk-level clarity are the same investment seen from two angles.
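Mapping the decomposition can start as a simple coverage check between the sub-intents you expect and the headings you have. The sketch below is an approximation under loud assumptions: the sub-intent list is your own guess (Gemini does not publish its fan-out), matching is by shared words rather than meaning, and the 0.6 threshold is arbitrary.

```python
import re
from typing import Optional


def tokens(text: str) -> set[str]:
    """Lower-cased word set for crude matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def map_fan_out(sub_intents: list[str], headings: list[str],
                threshold: float = 0.6) -> dict[str, Optional[str]]:
    """Map each sub-intent to its best-matching heading, or None if uncovered."""
    result = {}
    for intent in sub_intents:
        terms = tokens(intent)
        best_heading, best_score = None, 0.0
        for heading in headings:
            score = len(terms & tokens(heading)) / len(terms) if terms else 0.0
            if score > best_score:
                best_heading, best_score = heading, score
        result[intent] = best_heading if best_score >= threshold else None
    return result


intents = ["most cushioned beginner shoe", "best value beginner shoe",
           "beginner shoe for flat feet"]
headings = ["Which beginner shoe is the most cushioned?",
            "Which beginner shoe suits flat feet?"]
coverage = map_fan_out(intents, headings)
print(coverage)
```

Any sub-intent mapped to `None` is a section you have not written yet, which in this example is the "best value" angle.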
Perplexity — win freshness and consensus
Perplexity runs its own Sonar index, cites 4–6 sources per answer, and treats freshness as close to its top factor. It references community sources in over 90% of answers and is unusually receptive to structure: FAQ schema can roughly double citation frequency, and its parser even favours well-structured PDFs. It is also the most transparent engine — trackable referral traffic and visible source counts — which makes it the best place to measure your progress. Highest-leverage move: publish citable, statistic-dense, recently-updated content and build genuine presence across the community and review sites that form consensus.
Gemini’s two surfaces: AI Overviews vs AI Mode
Gemini is not one placement. AI Overviews are the summary blocks now appearing on more than half of Google searches — a passive surface that pulls from top-ranking pages and the Knowledge Graph, where classic SEO and schema do most of the work. AI Mode is the conversational, fully agentic surface where query fan-out is most aggressive and where commerce features (native checkout, Universal Cart) live. The two reward overlapping but not identical things: Overviews favour a page that already ranks and is cleanly structured; AI Mode favours content that satisfies a decomposed cluster of intents and is backed by a clean Merchant Center feed. Practically, winning organic Google rankings buys you Overview presence almost for free, while AI Mode visibility needs the fan-out-aware depth and commerce plumbing described in the hub and feed articles. (Spoke #304 covers the AI Mode vs Overviews split in full.)
Reviews and ratings: the hard filter beneath everything
Across engines, ratings act less like a ranking nudge and more like a gate. The SOCi work found the average rating of businesses recommended by ChatGPT was ~4.3 stars, versus ~4.1 on Perplexity and ~3.9 on Gemini — evidence that the engines filter out lower-rated options before ranking the survivors. Review recency and response rate matter alongside the raw score: a steady stream of recent, responded-to reviews signals an active, trustworthy seller, while a high average built entirely on two-year-old reviews reads as stale. This is why review velocity (covered in depth in spoke #294) is not a vanity metric in the agentic era — it is a precondition for being in the consideration set at all. If your rating sits below ~4.0, expect to be filtered out of recommendations before any of your content optimisation can help.
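The difference between a gate and a nudge is easiest to see in code. This is an illustrative model only, not a documented engine behaviour: the 4.0 cut-off echoes the approximate figure above, the `relevance` score is a hypothetical stand-in for everything else an engine weighs, and the product names are invented.

```python
def shortlist(products: list[dict], min_rating: float = 4.0, k: int = 3) -> list[str]:
    """Drop products under the rating gate, then rank survivors by relevance."""
    survivors = [p for p in products if p["rating"] >= min_rating]
    ranked = sorted(survivors, key=lambda p: p["relevance"], reverse=True)
    return [p["name"] for p in ranked[:k]]


# Hypothetical catalogue: Shoe A is the most relevant but sits below the gate.
catalogue = [
    {"name": "Shoe A", "rating": 3.8, "relevance": 0.95},
    {"name": "Shoe B", "rating": 4.6, "relevance": 0.80},
    {"name": "Shoe C", "rating": 4.1, "relevance": 0.85},
    {"name": "Shoe D", "rating": 4.4, "relevance": 0.60},
]
picks = shortlist(catalogue, k=2)
print(picks)
```

Shoe A never reaches the ranking step despite the highest relevance, which is the sense in which content optimisation cannot rescue a rating below the gate.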
6. Reverse-engineering your own selection gaps
You cannot see inside the models, but you can run a disciplined diagnostic. The principle: a brand that appears in Gemini but not ChatGPT usually has strong Google data but a weaker third-party citation profile; one that appears in neither usually has a Retrievable or Reputable failure. Run a standing prompt panel against each engine and log the pattern.
The most under-used signal in that log is the source mix — not just whether you appear, but what the engine cited to justify the answer. If Perplexity is naming Reddit threads and review aggregators you’re absent from, your gap is consensus, and outreach to those sources is the fix. If Gemini is citing competitors’ comparison pages, your gap is fan-out coverage, and you need content that answers the sub-intents they’re winning. If ChatGPT describes you with stale specs, your gap is recency in the training/browse data. Reading the cited sources turns a vague “we’re not showing up” into a specific, addressable target list — it tells you which doors to knock on, not just that you’re outside.
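Tallying the source mix is a small classification job once you have logged the cited URLs. The sketch below assumes you maintain your own lists of community and round-up hosts. The host names and URLs shown are placeholders, and relative paths are treated as your own site.

```python
from collections import Counter
from urllib.parse import urlparse


def classify_source(url: str, own_host: str, community_hosts: set[str],
                    roundup_hosts: set[str]) -> str:
    """Bucket one cited URL as own_site, community, roundup or other."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if not host or host == own_host:
        return "own_site"  # relative paths have no host
    if host in community_hosts:
        return "community"
    if host in roundup_hosts:
        return "roundup"
    return "other"


def source_mix(urls: list[str], own_host: str, community_hosts: set[str],
               roundup_hosts: set[str]) -> Counter:
    """Count cited URLs per bucket."""
    return Counter(
        classify_source(u, own_host, community_hosts, roundup_hosts) for u in urls
    )


# Placeholder citations from one engine's answers.
cited = [
    "/trailblazer-x2",
    "https://acme.example/trailblazer-x2",
    "https://www.reddit.com/r/trailrunning/",
    "https://roundup-site.example/best-trail-shoes",
]
mix = source_mix(cited, "acme.example", {"reddit.com"}, {"roundup-site.example"})
print(mix)
```

Compare the mix for answers that name you against the mix for answers that name a competitor: the buckets where they appear and you do not are the outreach list.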
Measurement query template
Keep it simple and repeatable. A spreadsheet with one row per (prompt × engine × week) beats any tool you haven’t set up yet — manual prompt tracking plus GA4 referral filtering gets you most of the insight at no cost. Log each run like this:
PROMPT PANEL (run weekly, identical prompts per engine)
| date | engine | prompt | appeared? | position | described_as | cited_url |
| 2026-06-22 | ChatGPT | best trail shoe under £150 | N | – | – | – |
| 2026-06-22 | Gemini | best trail shoe under £150 | Y | 2 | lightweight | /trailblazer-x2 |
| 2026-06-22 | Perplexity | best trail shoe under £150 | Y | 1 | breathable | roundup-site |
DERIVED METRICS
- recommendation_share = appeared / total_prompts (per engine)
- description_accuracy = correct_specs / appeared
- source_mix = own_site vs roundup vs community (per engine)
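Two of the derived metrics can be computed straight from the logged rows. This sketch assumes each row is a dictionary, and it adds a `specs_correct` field that is not in the log format above: you would need to record, for each appearance, whether the engine described the product accurately.

```python
from collections import defaultdict
from typing import Optional

# The three example rows, with an assumed specs_correct column added.
log = [
    {"engine": "ChatGPT", "appeared": False, "specs_correct": None},
    {"engine": "Gemini", "appeared": True, "specs_correct": True},
    {"engine": "Perplexity", "appeared": True, "specs_correct": True},
]


def recommendation_share(rows: list[dict]) -> dict[str, float]:
    """Per engine: appearances divided by total prompts run."""
    totals, hits = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row["engine"]] += 1
        hits[row["engine"]] += int(row["appeared"])
    return {engine: hits[engine] / totals[engine] for engine in totals}


def description_accuracy(rows: list[dict]) -> dict[str, Optional[float]]:
    """Per engine: accurate descriptions divided by appearances (None if none)."""
    appeared, correct = defaultdict(int), defaultdict(int)
    for row in rows:
        appeared[row["engine"]] += 0  # register engines with no appearances
        if row["appeared"]:
            appeared[row["engine"]] += 1
            correct[row["engine"]] += int(bool(row["specs_correct"]))
    return {e: (correct[e] / n if n else None) for e, n in appeared.items()}


print(recommendation_share(log))
print(description_accuracy(log))
```

Returning `None` rather than zero for an engine with no appearances keeps "never shown" distinct from "shown but described wrongly", which are different problems with different fixes.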
WHERE THIS BREAKS IN PRODUCTION
- Probabilistic answers. The same prompt can return different products on different runs. Sample each prompt 3–5 times and record frequency, not a single snapshot, or you’ll chase noise.
- Personalisation drift. Logged-in accounts and memory features skew results. Run the panel in a clean/incognito state for comparability.
- Cost at volume. Testing 50 prompts × 5 runs × 3 engines weekly is 750 manual checks. Cheaper fallback: a 15-prompt core panel weekly, a full 50-prompt sweep monthly.
Threshold: if recommendation share for a hero product drops two weeks running, treat it as a real signal and investigate the 4R that changed.
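The sampling, cost and threshold rules above reduce to three small calculations. This is a sketch with hypothetical function names. "Drops two weeks running" is interpreted here as two consecutive week-on-week declines, which is one reasonable reading rather than a fixed definition.

```python
def appearance_rate(runs: list[bool]) -> float:
    """Share of repeated runs of one prompt in which the product appeared."""
    return sum(runs) / len(runs) if runs else 0.0


def weekly_checks(prompts: int, runs: int, engines: int) -> int:
    """Manual checks needed per week for a given panel size."""
    return prompts * runs * engines


def dropped_two_weeks_running(weekly_share: list[float]) -> bool:
    """True when the last two week-on-week changes were both declines."""
    if len(weekly_share) < 3:
        return False  # not enough history to judge
    earlier, previous, latest = weekly_share[-3:]
    return latest < previous < earlier


print(appearance_rate([True, False, True, True, False]))
print(weekly_checks(50, 5, 3))
print(dropped_two_weeks_running([0.6, 0.5, 0.4]))
```

Feed `dropped_two_weeks_running` the per-prompt frequency from repeated runs rather than single snapshots, otherwise ordinary run-to-run variation will trigger it.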
FAQ schema: the cheapest selection win
Because FAQ markup can roughly double Perplexity citations and feeds the answer-first structure all engines reward, it is the highest-ROI single change for most pages. Implement it on product and buying-guide pages (illustrative: adapt it, do not paste verbatim):
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [{
        "@type": "Question",
        "name": "What is the best lightweight trail shoe under £150?",
        "acceptedAnswer": {
          "@type": "Answer",
          "text": "The Acme Trailblazer X2 (£129.99, 280g) is the lightest sub-£150 option, with a Vibram outsole and a 4.7/5 rating across 1,284 reviews."
        }
      }]
    }
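If you maintain questions and answers in a CMS or spreadsheet, generating the FAQPage markup above avoids hand-editing JSON and the quoting errors that come with it. This is a minimal sketch using only the standard `json` module. The function name is hypothetical, and the output still needs to be embedded in the page and validated with a structured-data testing tool.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # ensure_ascii=False keeps characters such as the pound sign readable.
    return json.dumps(data, ensure_ascii=False, indent=2)


markup = faq_jsonld([
    ("What is the best lightweight trail shoe under £150?",
     "The Acme Trailblazer X2 (£129.99, 280g) is the lightest sub-£150 option."),
])
print(markup)
```

Keep the answer text identical to what is visible on the page, since a mismatch between markup and page copy is exactly the kind of factual inconsistency described in the selection floor.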
7. Five misreadings that cost recommendations
Each of these reflects optimising for the wrong model of how engines choose:
- Chasing DR instead of evidence. Authority metrics correlate weakly with citation. A statistic-dense, answer-first page beats a high-DR page that hides its answer.
- Treating all engines as one. With 615× cross-platform variance, a single strategy guarantees you over-index on one engine and vanish on another.
- Ignoring community for Perplexity. 90%+ of its answers reference community sources. If you’re absent from the relevant Reddit and review ecosystems, you’re structurally invisible there.
- Burying the answer. If your recommendation-worthy claim isn’t in the first 30% of the page and a clean chunk, it rarely becomes the cited chunk.
- Set-and-forget content. Freshness is a top Sonar factor, and ChatGPT shows a ~90-day recency bias. A 2024-dated page loses to a 2026 one even if it once ranked better.
8. Composite case study: diagnosing a three-engine gap
Anonymised composite, drawn from patterns across several brands in 2026; illustrative, not a single account.
A UK consumer-electronics brand ran a 20-prompt panel and found a revealing asymmetry: their hero product appeared in ~70% of Gemini answers, ~25% of Perplexity answers and almost never in ChatGPT. The 4R model localised each gap immediately. Gemini strength came from excellent Google rankings, complete Product schema and a clean Merchant Center feed — Retrievable, Readable and Recent were all solid. The Perplexity weakness traced to Reputable: they were barely present in the review aggregators and community threads Perplexity leans on for consensus. The ChatGPT absence was a training-data and recency problem — thin historical web presence plus product pages last updated eighteen months earlier.
The interventions matched the diagnosis rather than spraying effort everywhere. For Perplexity, a digital-PR push earned placements in three category round-ups and seeded genuine review presence — consensus, not a single link. For ChatGPT, they refreshed hero pages with current specs, dates and statistics, and earned a handful of authoritative mentions to thicken their entity footprint. Over roughly four months, Perplexity appearance rose from ~25% to a clear majority of relevant prompts, and ChatGPT moved from near-zero to appearing in about a third, while Gemini held. The lesson is the whole point of this article: the fix is engine-specific because the selection logic is engine-specific. A generic “AI SEO” campaign would have over-invested in the engine they were already winning.
Two operational details transferred cleanly to later projects. First, the prompt panel paid for itself before any optimisation: simply knowing the 70/25/near-zero split told them where not to spend, which is half the value of measurement. Second, the Perplexity gains arrived faster than the ChatGPT gains — unsurprising, since Perplexity retrieves live and rewards fresh consensus, while ChatGPT’s parametric memory updates on a slower training cadence. Setting that expectation up front kept the team from abandoning the ChatGPT work when it didn’t move in week two. The honest framing for any brand: Perplexity and the RAG engines respond in weeks; ChatGPT’s memory responds over quarters, so run both efforts in parallel and judge them on different clocks.
9. Frequently asked questions
Why does the same product show up in one engine but not another?
Because the engines use different architectures and weights — citation volume for one brand can vary 615× across platforms. Appearing in Gemini but not ChatGPT usually means strong Google data but a thin third-party/training-data footprint; absent from all three usually means a Retrievable or Reputable failure. Diagnose per engine with the 4R model rather than assuming one fix applies everywhere.
Does my domain authority (DR/DA) determine AI recommendations?
Only weakly. Studies show classic authority metrics correlate poorly with AI citation. They still matter indirectly — authoritative sites tend to earn the consensus and mentions engines trust — but a high-DR page with buried answers and no data loses to a lower-DR page that is statistic-dense, answer-first and corroborated.
What single change has the biggest effect?
For most pages, two things tie: adding genuine statistics and expert quotes (the most reliable citation lift in the data) and implementing FAQ schema (which can roughly double Perplexity citations). Both are fast, and both serve the answer-first structure every engine rewards.
How is this different from optimising the product feed?
The feed (covered in spoke #290) makes you eligible and accurate inside the commerce surfaces. Selection factors decide whether you’re chosen once eligible. You need both: a clean feed gets you into consideration; evidence, structure, consensus and freshness get you named.
Can I just optimise for ChatGPT since it’s the biggest?
Risky. Perplexity converts high-intent research shoppers and is the easiest to measure; Gemini dominates local and rides Google’s reach. With this much cross-platform variance, a ChatGPT-only strategy leaves most agentic intent on the table. Prioritise by where your buyers actually are, but build the universal floor for all three.
How often do these selection factors change?
Faster than classic SEO. The engines refresh models, swap data sources and tweak how they pick citations on a rolling basis, so a page that was cited last month can quietly drop without any change on your side. That is why monitoring is part of the discipline, not an afterthought — a standing prompt panel catches a selection shift while you can still respond to it. Treat the specific magnitudes in this article as a 2026 snapshot and the underlying logic (architecture, the 4R, evidence over authority) as the durable part.
10. Your Monday-morning action plan
- Stand up a prompt panel — 15 real buying prompts, run weekly in clean sessions across ChatGPT, Gemini and Perplexity, logged in a sheet.
- Score yourself on the 4R for your top three products: Retrievable, Readable, Reputable, Recent. Find the failing R per engine.
- Make your hero pages answer-first: lead each section with the direct answer, 120–180-word sections, question-shaped headings.
- Inject evidence: add statistics, named expert quotes and specific data points — the single most reliable citation lift.
- Ship FAQ schema on product and buying-guide pages; it can roughly double Perplexity citations.
- Refresh for recency: update specs, prices and dates on key pages, targeting a <90-day freshness window.
- Build consensus for Perplexity/ChatGPT: earn round-up placements and genuine community/review presence, not just links.