A recent study of 1.4 million ChatGPT prompts has revealed that the AI model cites only about 50% of the web pages it retrieves, favoring those from its general search index over sources like Reddit or YouTube.

What Happened

The research, conducted by Ahrefs and published on their SEO blog, analyzed data from February 2025 using the ChatGPT 5.2 desktop client. The study found that out of 46.8 million total URLs retrieved across those prompts, roughly half (49.98%) ended up as numbered citations in the response.
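As a rough check on what those headline figures imply, the arithmetic can be sketched as follows. The inputs are the rounded numbers reported above, so the results are approximations, not figures taken from the study itself.

```python
# Rounded figures as reported in this article.
total_retrieved = 46_800_000   # URLs retrieved across all prompts
cited_share = 0.4998           # fraction that ended up as numbered citations
prompts = 1_400_000            # prompts analyzed

cited = round(total_retrieved * cited_share)
not_cited = total_retrieved - cited
per_prompt = total_retrieved / prompts

print(f"cited: ~{cited / 1e6:.1f}M, not cited: ~{not_cited / 1e6:.1f}M")
# cited: ~23.4M, not cited: ~23.4M
print(f"URLs retrieved per prompt: ~{per_prompt:.1f}")
# URLs retrieved per prompt: ~33.4
```

In other words, the reported numbers work out to roughly 33 retrieved URLs per prompt on average, about half of which are cited.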

This means that while ChatGPT retrieves many pages to answer a single query, it cites only about half of them. The findings matter for content creators and platform operators in the adult industry, who rely on AI-driven search tools like ChatGPT to drive traffic to their websites.

Background and Context

ChatGPT tags each retrieved source with an internal field called ref_type, and these categories are then used to decide which pages are worth opening and reading in full. The study found that the general search index dominates citations, accounting for 88.46% of cited URLs. Reddit, by contrast, accounts for many retrieved URLs that are rarely cited (a citation rate of only 1.93%).
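The two percentages above measure different things: 88.46% is a share of all citations, while 1.93% is a citation rate within one source type. A minimal sketch of the difference follows. The ref_type values ("search", "reddit"), the counts, and the `summarize` helper are all made up for illustration; none of them are data or code from the study.

```python
from collections import Counter

def summarize(records):
    """Return {ref_type: (citation_rate, share_of_citations)}."""
    retrieved = Counter(ref for ref, _ in records)
    cited = Counter(ref for ref, was_cited in records if was_cited)
    total_cited = sum(cited.values()) or 1  # avoid dividing by zero when nothing is cited
    return {
        # rate: cited / retrieved within one type; share: this type's cited / all cited
        ref: (cited[ref] / retrieved[ref], cited[ref] / total_cited)
        for ref in retrieved
    }

# Toy records of (ref_type, was_cited). Hypothetical values, not study data.
records = (
    [("search", True)] * 9 + [("search", False)] * 3
    + [("reddit", True)] * 1 + [("reddit", False)] * 9
)

summary = summarize(records)
for ref, (rate, share) in summary.items():
    print(f"{ref}: rate={rate:.0%}, share={share:.0%}")
# search: rate=75%, share=90%
# reddit: rate=10%, share=10%
```

A source type can be retrieved often and still show a low citation rate, which is the pattern the study reports for Reddit.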

This suggests that content needs to rank well in the general search pool to be cited by ChatGPT. The study also found that non-cited pages have 3 times more retrieval data than cited pages, indicating that ChatGPT reviews far more material than it ends up citing.

Why It Matters

The findings are significant for the adult industry because they highlight the role of metadata in determining whether ChatGPT cites a page. A page with a clear, relevant title and a clean URL has a better chance of making the cut, while a vague title or messy URL can get even high-quality content filtered out.
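The article does not define what counts as a "clear" title or a "clean" URL, so any automated check has to rest on assumptions. The sketch below is one hypothetical heuristic a content team might use to flag pages for review. The `metadata_warnings` helper, the generic-title list, and every threshold are illustrative assumptions, not criteria from the study or from ChatGPT.

```python
from urllib.parse import urlparse, parse_qs

# Assumed list of uninformative titles; extend to suit your own site.
GENERIC_TITLES = {"home", "untitled", "index", "page", "new post"}

def metadata_warnings(title: str, url: str) -> list[str]:
    """Return human-readable warnings for a page's title and URL."""
    warnings = []
    clean_title = title.strip()
    if len(clean_title) < 10:  # assumed threshold
        warnings.append("title is very short")
    if clean_title.lower() in GENERIC_TITLES:
        warnings.append("title is generic")

    parsed = urlparse(url)
    if len(parse_qs(parsed.query)) > 2:  # assumed threshold
        warnings.append("URL carries many query parameters")
    # Last path segment, e.g. "chatgpt-citations" in /blog/chatgpt-citations
    last_segment = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    if last_segment and not any(c.isalpha() for c in last_segment):
        warnings.append("URL slug has no readable words")
    return warnings

print(metadata_warnings("Home", "https://example.com/p/12345?a=1&b=2&c=3"))
print(metadata_warnings("How ChatGPT chooses which pages to cite",
                        "https://example.com/blog/chatgpt-citations"))
```

The first call flags all four issues and the second returns an empty list. A check like this only surfaces candidates for a human to review; it cannot predict whether a page will actually be cited.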

The study also shows that AI visibility strategies must take into account not only content quality and ranking position but also metadata that many content teams barely think about. This means that adult industry platform operators and content creators must optimize their metadata to increase the chances of being cited by ChatGPT.

What Comes Next

The findings also have implications for how AI-driven search tools like ChatGPT develop. As the adult industry continues to rely on these platforms, platform operators and content creators will need to keep adapting their strategies, optimizing metadata to improve their chances of being cited.

Key Facts

  • ChatGPT cites only about 50% of the web pages it retrieves.
  • The general search index dominates citations, accounting for 88.46% of cited URLs.
  • Sources like Reddit account for many retrieved URLs but are rarely cited (only a 1.93% citation rate).
  • Non-cited pages have 3 times more retrieval data than cited pages.
  • Pages with a clear, relevant title and a clean URL have a better chance of being cited by ChatGPT.

The study's findings highlight the importance of metadata in determining whether ChatGPT cites a page. For adult industry platform operators and content creators who rely on AI-driven search, optimizing metadata is a practical way to improve the chances of being cited.