What Your AI Visibility Tracking Vendor Isn’t Telling You

Let's talk about AI visibility tracking this week. You, or someone at your company, has probably either approved an AI visibility tracking tool or is thinking about approving one.

AI visibility tracking is the fastest-growing category in marketing tech. Profound just raised $96 million. HubSpot has its own AEO grader. Peec, Otterly, Trakkr, Waikay, Gumshoe. New tools every month, all promising to show you exactly how visible your brand is inside ChatGPT and Google AI.

The pitch is compelling. AI is eating search traffic. If you can't see whether your brand appears in those answers, you can't measure the problem, let alone fix it.

But what are these tools actually measuring? And does the measurement hold up?

Rand Fishkin recently ran the most thorough study yet. 600 volunteers, 12 prompts, nearly 3,000 runs across ChatGPT, Claude, and Google AI. He set out to prove the entire category was a scam. His findings turned out to be more interesting than that.

Ask any AI the same brand-recommendation question 100 times. You get the same list fewer than 1 in 100 times. In the same order? Less than 1 in 1,000.

So when a tool tells you "your brand ranks #4 in ChatGPT for category X," what is that number really capturing? One snapshot from a probability machine designed to produce different answers every time. Refresh. You're #2. Refresh again. You're gone.

That part of the industry is selling certainty where none exists.

But Rand also found something else. Across hundreds of runs, certain brands appeared consistently. City of Hope hospital showed up in 69 of 71 ChatGPT responses for West Coast cancer care. Bose and Sony appeared in 55-77% of headphone responses across 994 different prompts.

The individual lists were chaos. The distributions were stable.

This is the metric most serious tools lead with now: Share of Voice. How often your brand appears across many prompt runs. It's statistically defensible. It deserves marketing budget.
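To make the idea concrete, here is a minimal sketch of that calculation in Python. It assumes Share of Voice means the appearance rate described above: the fraction of runs in which a brand shows up at all. The function name, the data layout, and the sample runs are all hypothetical, invented for illustration ("Brand C" and "Brand D" are placeholders). Nothing here comes from any vendor's actual method.

```python
def appearance_rate(runs, brand):
    """Fraction of runs whose answer mentions `brand` at least once."""
    if not runs:
        raise ValueError("need at least one run")
    hits = sum(1 for brands in runs if brand in brands)
    return hits / len(runs)


# Hypothetical data: each inner list is the brands one AI answer
# recommended, in the order it listed them.
runs = [
    ["Bose", "Sony", "Brand C"],
    ["Sony", "Brand D", "Bose"],
    ["Brand C", "Sony", "Brand D"],
    ["Bose", "Sony"],
]

sony_rate = appearance_rate(runs, "Sony")
bose_rate = appearance_rate(runs, "Bose")

# Every run produced a different ordered list, so any single "rank"
# depends entirely on which run you happened to capture.
distinct_orderings = len({tuple(r) for r in runs})

print(f"Sony appears in {sony_rate:.0%} of runs")
print(f"Bose appears in {bose_rate:.0%} of runs")
print(f"{distinct_orderings} distinct ordered lists across {len(runs)} runs")
```

Four runs is far too few for a real measurement. The point is only the shape of the calculation: positions are ignored and appearances are counted across many runs.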

So why am I still skeptical?

Because most dashboards put Share of Voice next to a ranking position number, with equal visual weight. The legitimate metric and the broken one, side by side. Buyers anchor on position rank because it looks like the SEO rank tracker they used for a decade.

There's also a quieter problem. Many tools ask you to define your competitor set upfront, then measure your share inside that pool. Your 34% isn't a measurement of your real presence in AI answers. It's how you stack up against a list you wrote. Add three more competitors, watch your number drop.
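A small sketch of why that happens, assuming the tool reports your mentions as a fraction of all mentions inside the pool you defined. The mention counts and brand names below are made up for illustration, and real tools may compute the number differently.

```python
def pool_share(mentions, brand, pool):
    """Brand's mentions as a fraction of all mentions within `pool`."""
    total = sum(mentions.get(b, 0) for b in pool)
    if total == 0:
        raise ValueError("no mentions recorded for this pool")
    return mentions.get(brand, 0) / total


# Hypothetical mention counts collected over the same set of runs.
mentions = {
    "You": 34, "Rival A": 30, "Rival B": 20, "Rival C": 16,
    "Rival D": 25, "Rival E": 25, "Rival F": 50,
}

small_pool = ["You", "Rival A", "Rival B", "Rival C"]
large_pool = small_pool + ["Rival D", "Rival E", "Rival F"]

small_share = pool_share(mentions, "You", small_pool)
large_share = pool_share(mentions, "You", large_pool)

print(f"Share in a 4-brand pool: {small_share:.0%}")
print(f"Share in a 7-brand pool: {large_share:.0%}")
```

Your 34 mentions never changed. Only the denominator did.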

And almost none of these tools publish their sample sizes, prompt diversity, or confidence intervals.
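For a sense of what a confidence interval on an appearance rate looks like, here is a sketch using the Wilson score interval, a standard choice for proportions. It treats each run as an independent draw, which is a simplifying assumption. The function name is hypothetical, and the 7-of-10 versus 70-of-100 comparison is invented to show the effect of sample size. The 69-of-71 figure is the City of Hope result quoted earlier.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= hits <= n:
        raise ValueError("need n > 0 and 0 <= hits <= n")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# 69 appearances in 71 runs, the City of Hope figure from the study.
low, high = wilson_interval(69, 71)
print(f"69/71 runs: {low:.1%} to {high:.1%}")

# Same 70% appearance rate, very different certainty.
low_10, high_10 = wilson_interval(7, 10)
low_100, high_100 = wilson_interval(70, 100)
print(f"7/10 runs:   {low_10:.1%} to {high_10:.1%}")
print(f"70/100 runs: {low_100:.1%} to {high_100:.1%}")
```

Under these assumptions, 69 of 71 works out to an interval of roughly 90% to 99%. A vendor reporting a single percentage from a handful of runs is hiding a much wider range.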

Does this mean AI visibility tracking is useless? No. Done properly, Share of Voice is real measurement. But that's not what most marketers are buying right now. They're buying dashboards built to look like Ahrefs in 2014, optimized to make CMOs feel like they're doing something about AI.

Three questions for any vendor before you sign:

How many prompts and runs produced this number?

Is the competitor pool open, or one I defined?

Where are the confidence intervals?

If they can't answer cleanly, you're not buying measurement. You're buying a dashboard that looks like measurement.


B2B Growth Lab