All the news that’s fit to scrape
I’ve written a lot about how AI is changing media strategy for consumer tech companies. Parts of what drives this change are intuitive, others less so.
When someone asks ChatGPT about your industry or Perplexity about your competitors, the sources these platforms cite determine what information people receive. This has created a whole new segment of work and relevance for PR.
New research from Profound analyzing 30 million citations across ChatGPT, Google AI Overviews, and Perplexity from August 2024 to June 2025 reveals exactly which publications and sites get scraped most often and used by AI engines when generating answers.
The results show both familiar names and surprising players that should be part of any consumer tech PR strategy and, in some cases, key components of your media lists.
Big Media Sites AI Is Citing
We know AI engines love big traditional media. These outlets have likely been on your media lists for years, but according to the data they are extra-relevant in the AI world.
Forbes
The granddaddy of them all, Forbes is a marquee outlet and sees strong performance across ChatGPT and Google AI, particularly for business, tech and finance topics. But not all Forbes coverage is equal. The data show that articles by Forbes staff writers get cited more than contributor posts. Council member content barely registers. That doesn’t mean these pieces are bad or worthless, but we are talking here about AI search optimization.
Assuming that is the goal, focus on earning mentions in Forbes news articles rather than chasing bylined contributor pieces. A quote in a Forbes industry trend story carries more weight than a full contributor article about your company.
TechRadar
We have always found this outlet to be undervalued by our clients. Profound found that it dominates consumer tech product citations in ChatGPT because it creates comprehensive buying guides and reviews that people reference repeatedly. A detailed TechRadar review becomes evergreen content that AI platforms cite for months.
These reviews matter because they answer the specific questions people ask AI: “what’s the best wireless earbuds under $200” or “which laptop should I buy for video editing.”
New York Post
I know, a bastion of great journalism, but still a place AI engines like to scrape. The Post appears in ChatGPT’s top 10 most-cited sources, proving that accessible consumer coverage still matters in the AI age.
NerdWallet
One could argue that this does not belong in this section, but because it covers breaking news I include it. It is on ChatGPT’s top 10 most-cited list.
NerdWallet shows how specialized financial and product comparison content gets cited when people ask AI about purchasing decisions. Their detailed product breakdowns and “best of” lists become reference material for AI responses.
Sites AI Loves to Cite That Are Not Traditional Media
Welcome to the curveball portion of the list. Most consumer-facing companies would not typically target the following outlets (though some occasionally do, of course).
Wikipedia (47.9% of ChatGPT citations)
This isn’t a publication you can pitch – and indeed you need to be super careful about how you behave on there as a brand or agent of a brand – but it’s the most important page about your company that you might be ignoring. When ChatGPT answers questions about your industry, it’s pulling from Wikipedia most often.
Ideally, your Wikipedia page needs accurate information about your latest products, recent funding, and current leadership.
However, the way to influence it isn’t by editing a relevant-to-you page directly. It’s by earning coverage in publications that Wikipedians cite when they update your page. Take a look at the footnote section, see what is referenced, and go from there. While there are no guarantees here, this is a better approach than a ham-handed direct edit by a corporate employee.
Reddit
It is really hard for brands to crush Reddit. It is just not that type of place. However, the site is scraped heavily in Perplexity results and increasingly by Google AI.
We are leery of offering specific counsel here because of the high potential for fuckups and backlash, but the data says to keep the site on your radar. That said, brand-neutral participation (e.g., AMAs with founders, product support) has worked for some companies on Reddit.
YouTube
It figures that this Alphabet-owned site is heavily scraped in Google AI Overviews, but not for the reasons most marketers think. AI platforms cite video descriptions, transcripts, and comments more than the actual video content because these are the structured text elements AI can crawl, unlike the video pixels themselves.
This means your product demo videos need detailed descriptions explaining features and use cases. Transcript quality matters a lot, which is more proof that the written word is powerful.
LinkedIn
Profound’s data shows the work-centric social network performs well in Google AI Overviews, particularly for B2B and professional content. The best content for AI search is weightier, full-length thought leadership, not company updates about the great picnic your team had last Friday. That is more reason for your leadership team to produce useful, research-backed posts and articles.
Quora
I love reading Quora, but was a bit surprised to see that it ranks in Google AI Overviews’ top 10 citations. At least for now, when someone asks Google AI about technical topics or product recommendations, it often pulls from detailed Quora responses. You have to be careful here as a brand, but keep it on the radar.
A Geographic Opportunity
Should you ignore the little guys? It seems to depend. Country-specific domains (.uk, .au, .ca, .br) represent 3.5% of citations across platforms. Not massive, but for companies with international operations or target markets, they can be worthwhile.
Regional coverage in major markets may deserve dedicated attention if AI search is the goal. I understand it can be hard to convince CEOs with egos to target these smaller spaces. However, the product launch story you pitch to Forbes can be tailored for local publications in the key markets that matter to you.
The Domain Authority Reality
The research shows .com domains dominate citations at 80.4% across platforms, with .org sites second at 11.29%. Country-specific domains (.uk, .au, .ca, .br) collectively represent 3.5% of citations, while newer domains like .ai and .io are gaining traction for tech-focused content.
This suggests domain authority scores matter less than content quality and citation patterns. A well-researched piece on a newer tech publication might outperform a brief mention in an established outlet, especially if the former aligns with formats AI tends to cite, such as lists, comparisons, or detailed FAQs.
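If you track which URLs AI engines cite for your own brand or category, you can run a similar breakdown yourself. Here is a minimal Python sketch of the arithmetic, assuming you already have a plain list of cited URLs. The function name `tld_shares` and the sample URLs are made up for illustration; they are not Profound's method or data.

```python
from collections import Counter
from urllib.parse import urlparse


def tld_shares(urls):
    """Return each top-level domain's share of the given URLs, in percent."""
    tlds = []
    for url in urls:
        host = urlparse(url).hostname
        # Skip anything without a recognizable hostname.
        if host and "." in host:
            # Keep only the last label, so "bbc.co.uk" counts as ".uk".
            tlds.append("." + host.rsplit(".", 1)[-1])
    counts = Counter(tlds)
    total = sum(counts.values())
    return {tld: round(100 * n / total, 1) for tld, n in counts.most_common()}


# Hypothetical sample list, not taken from the research.
sample_urls = [
    "https://www.forbes.com/some-article",
    "https://www.techradar.com/reviews/example",
    "https://en.wikipedia.org/wiki/Example",
    "https://www.bbc.co.uk/news/example",
]
shares = tld_shares(sample_urls)
```

Grouping by the final label is a deliberate simplification: it lumps every .uk site together, which matches how the country-specific figures above are reported, but it tells you nothing about the individual outlets.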
What To Make Of All This?
If there is a simple takeaway here, it is that traditional media targets still matter, but are not the complete picture. Wikipedia, Reddit and review platforms are big in the AI search game too.
Keep in mind that the best stories for AI search tend to create citation chains where one piece of coverage leads to another. AI platforms reward multi-source validation more than individual high-authority placements.
Your media-that-matter list should include where your audience gets answers, not just where they discover breaking news.