SEO Insight: ChatGPT can hallucinate URLs and send users to 404 pages. We checked 18,000+ landing pages that ChatGPT sent traffic to, and 3.35% returned a 404.

After reading Anastasia Kotsiubynska's post about discovering 404 pages from ChatGPT referrals, we wanted to understand how often this happens. We analyzed more than 18,000 landing page visits referred by ChatGPT and found that 3.35% landed on 404s: phantom URLs that do not actually exist. While that is a rather small figure, it can add up over time and create friction for visitors who may be ready to buy.

Here is how we identified these hallucinated pages:
- In Google Analytics 4, open the Landing Page report and add Source/Medium as a secondary dimension.
- Filter for “chatgpt.com” traffic and set your date range back to August 1 to capture the full dataset.
- Export your list of landing page URLs and add your domain if needed.
- Paste the URLs into Screaming Frog in List Mode and crawl them in bulk.
- Sort by 404 status code.

This gives you a clear list of every page ChatGPT linked to that is not actually live.

One thing we noticed is that some of these URLs used to exist, sometimes a LONG time ago (we found them in the Wayback Machine), which suggests that Common Crawl training data from years ago may be causing ChatGPT some confusion.

Why does this matter for SEOs and digital marketers? LLMs like ChatGPT are quickly becoming a meaningful discovery channel. If the LLMs make mistakes about which pages to send users to, those users end up hitting broken pages, creating friction that can reduce leads and sales. Reviewing these hallucinated URLs can identify opportunities to redirect or help the LLMs find the correct page for future suggestions.
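For anyone without Screaming Frog, the "add your domain, crawl in bulk, sort by 404" steps from the post can be roughly approximated with a short script. This is only a minimal sketch, not the method used in the post: the function names (`http_status`, `to_absolute`, `find_404s`) are made up for illustration, and it assumes your exported landing pages are a plain list of paths or full URLs and that your server answers HEAD requests.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Tuple


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a URL.

    Sends a HEAD request and follows redirects. Assumption: the server
    supports HEAD; some servers reject it, in which case a GET is needed.
    Network failures (DNS, timeouts) are not caught and will raise.
    """
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "landing-page-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; the code is what we want.
        return err.code


def to_absolute(path_or_url: str, domain: str) -> str:
    """Prefix the domain when the export contains only a path."""
    value = path_or_url.strip()
    if value.startswith(("http://", "https://")):
        return value
    return domain.rstrip("/") + "/" + value.lstrip("/")


def find_404s(
    landing_pages: Iterable[str],
    domain: str,
    status_of: Callable[[str], int] = http_status,
) -> Tuple[List[str], float]:
    """Return the URLs that respond with 404 and their share of all URLs checked."""
    urls = [to_absolute(page, domain) for page in landing_pages if page.strip()]
    missing = [url for url in urls if status_of(url) == 404]
    share = len(missing) / len(urls) if urls else 0.0
    return missing, share
```

To try it, pass the landing page column from your GA4 export plus your own domain to `find_404s`; it returns the list of 404 URLs and the fraction of checked URLs they represent. The `status_of` parameter exists only so the check can be swapped out or tested without touching the network.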
It makes up entire studies and research papers, including the URLs leading to them.
I would say it is way more than that, especially on eComm site referrals. This is just from experience, though.
I wondered why it was recommending so many broken links
Safe bet to just do the internal linking yourself. I keep seeing this when people blindly copy paste from chat responses. Great call out!
Thanks, Dan. This is a really useful insight!
We often worry about hallucinated content, but hallucinated URLs are just as risky, especially as LLMs become traffic drivers. At Hyperblog, we're working on ways to reduce these 404s through consistent URL structures, smart redirects, and AI-aware linking. LLM SEO isn’t just about visibility, it’s also about reliability. Thanks for sharing this, Dan Hinckley
Interesting research, Dan Hinckley. I keep seeing LLMs recommend things based on exact match domain names. Some are even fictional domain names that the AI first invents and then uses to imagine a business. In some results I've seen, these fake brands make up as many as half the recommendations. To test this, we vibe coded a simple React website to fit an exact match domain. We instantly started getting recommendations in Copilot's AI tool (not yet on GPT/Claude). My prediction is that there will be increased market interest in exact keyword match domains for extremely specific things.
Really interesting, this, Dan. 3.35% might sound small, but LLMs are a growing referral source, so those broken links could add up fast (especially at enterprise level!). Big problems. It's 100% worth monitoring this. Cheers, Dan.
Broken links used to just be about UX or link building. Now they can trace back to how AI imagined your site. That adds a whole new layer of maintenance we weren’t thinking about a few years ago. What a day and age we're living in lol. Although, thank you so much for this share Dan.
Great that you're continuing to investigate this issue, and the steps are very easy to follow to find these 404s! Also, while 3% might not seem like a lot now, it could become a big problem with the rise of ChatGPT traffic (and I can already see its steady growth on our website and in our recent research across different sites).