The science behind your
GEO score
Every score is the result of 40 curated prompts, run across 4 LLM providers over 3 days, aggregated into a single composite signal. Here's exactly how.
Three prompt categories, one composite picture
We don't ask LLMs "do you know Brand X?" — that's too direct and primes the model. Instead, we simulate the actual questions real consumers ask when researching, comparing, and buying.
Organic Recommendation
Open-ended queries that ask the LLM to recommend brands in your vertical — without naming any brand. Measures unprompted recall and category authority. The closest proxy to organic search share.
Head-to-Head Matchups
Prompts that set up direct brand comparisons within your category. Tests whether LLMs position your brand favorably, neutrally, or negatively relative to competitors. Critical for competitive intelligence.
Buy-Intent Queries
High-intent prompts that simulate a consumer ready to purchase — "where should I buy," "what's the best option for my budget," "what do experts recommend." These carry the most commercial weight.
Three runs over three days, not one snapshot
LLMs are probabilistic: ask the same question twice and you may get different answers. A single-run score is mostly noise. We run 3 independent scan cycles at 24-hour intervals, then average the results.
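The aggregation step can be sketched as a plain equal-weight average. This is an illustrative reconstruction, not our production code: the per-prompt scoring function is not shown, and the numbers below are made-up examples.

```python
from statistics import mean

def aggregate_runs(run_scores: list[list[float]]) -> float:
    """Average the composite scores of 3 independent scan cycles.

    run_scores: one list of per-prompt scores (0-100) per run.
    """
    per_run = [mean(scores) for scores in run_scores]  # collapse each run to one number
    return mean(per_run)                               # equal-weight average across runs

# Three runs of the same prompt set, each slightly different (illustrative data):
score = aggregate_runs([
    [72.0, 68.0, 75.0],
    [70.0, 71.0, 74.0],
    [69.0, 70.0, 73.0],
])
```

Averaging across days smooths out the sampling variance of any single session, so week-over-week movement reflects a real shift in the model's knowledge rather than one lucky or unlucky roll.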
Within-run consistency
Each run uses a clean session with no memory of prior queries. Prompts are sent in randomized order to prevent position bias — LLMs tend to favor brands mentioned earlier in a session.
Benchmark period
When we say "April 2026 benchmark," we mean the weighted average of 3 runs conducted in the last week of April. Scores represent the LLM knowledge state at that point in time — not a real-time signal.
One composite score, four signals
Your GEO score isn't just ChatGPT. It's a weighted composite across the 4 LLMs that European D2C consumers actually use — weighted by market share and D2C relevance.
Why these weights?
Weights reflect European consumer AI usage patterns as of Q1 2026 — not global averages. ChatGPT's outsized weight reflects both volume and the fact that its recommendations carry the highest purchase-intent conversion rate in our vertical research.
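The composite is a weighted average across providers. The weights and provider names below are illustrative placeholders only; the real Q1 2026 weights are set from European market-share data and are not published here.

```python
# HYPOTHETICAL weights for illustration -- not the actual quarterly values.
PROVIDER_WEIGHTS = {
    "chatgpt": 0.40,
    "provider_b": 0.25,
    "provider_c": 0.20,
    "provider_d": 0.15,
}

def composite_score(provider_scores: dict[str, float]) -> float:
    """Weighted composite of per-provider scores; weights must sum to 1."""
    assert abs(sum(PROVIDER_WEIGHTS.values()) - 1.0) < 1e-9
    return sum(PROVIDER_WEIGHTS[p] * s for p, s in provider_scores.items())

score = composite_score({
    "chatgpt": 80.0,
    "provider_b": 60.0,
    "provider_c": 70.0,
    "provider_d": 50.0,
})
```

Because the weights sum to 1, the composite stays on the same 0-100 scale as the per-provider scores, and a strong showing on the highest-weighted provider moves the needle most.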
Weights are reviewed quarterly
As AI market share shifts, so do the weights. We review provider weighting every quarter. Any change is announced in the digest before it takes effect — so you can compare scores on a consistent basis.
Monthly benchmarks, weekly digests
Two distinct scan types serve different purposes — one for cross-brand benchmarking, one for tracking your own brand week-over-week.
Prompts in the language your customers use
LLMs have language-specific knowledge. A French consumer asking about skincare in French gets different recommendations than an English query. We run prompts in each market's native language.
Curated prompts, not generic queries
Every vertical has its own prompt set, written by people who understand the category. Beauty prompts use ingredient-first language. Wine & Spirits prompts invoke sommelier expertise. Pet care prompts reflect vet-recommended framing.
Prompts are proprietary
We describe the structure and categories of prompts — not the text itself. The specific wording is our competitive moat. Knowing the questions is the product.
Custom verticals
For brands outside our predefined categories, we generate a bespoke prompt set using an AI pipeline trained on our vertical curation methodology. Each bespoke set is reviewed for quality before deployment.
See how your brand scores right now
Free weekly digest. 40 prompts. 4 providers. Delivered every Monday morning.