Watermelons Join Pineapples in Unexplained AI Citation Boost, Researchers Say

The fruit hypothesis is becoming harder to dismiss.

Three months after a University of Amsterdam study linked pineapple imagery to elevated citation rates in AI-generated search responses, a follow-up investigation has implicated a second fruit: watermelon. Pages featuring watermelon images were cited by generative engines 19% more often than controls. That is slightly below the pineapple effect, but still statistically significant and, according to researchers, “impossible to ignore at this point.”

The replication study, led by a joint team from the Amsterdam lab and the Korea Advanced Institute of Science and Technology (KAIST), expanded the original methodology to test 26 distinct visual elements across 40,000 URLs. Most elements — including decorative icons, abstract photography, landscape imagery, and human portraits — showed no consistent citation effect. Two did: pineapple and watermelon. A third candidate, kiwi, produced a marginal effect that did not survive correction for multiple comparisons. Researchers are not ready to call kiwi a factor. They are, however, watching it.
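For readers wondering what it means for an effect to "not survive correction for multiple comparisons," the idea can be sketched in a few lines. The study does not say which correction the team applied, so the Holm-Bonferroni step-down procedure is assumed here, and every p-value below is invented for illustration rather than taken from the research. The function name `holm_bonferroni` and the `other_N` labels are likewise hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return {label: bool} marking which tests survive Holm's step-down correction."""
    # Sort tests from smallest to largest p-value.
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    survives = {label: False for label in p_values}
    for rank, (label, p) in enumerate(ordered):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - rank).
        if p <= alpha / (m - rank):
            survives[label] = True
        else:
            # Once one test fails, all larger p-values fail as well.
            break
    return survives


# Hypothetical p-values for 26 visual elements (NOT the study's data).
p_values = {"pineapple": 0.0002, "watermelon": 0.0009, "kiwi": 0.03}
for i in range(23):
    p_values[f"other_{i}"] = 0.20 + 0.03 * i  # filler: clearly non-significant

result = holm_bonferroni(p_values)
naive = {label: p < 0.05 for label, p in p_values.items()}

for label in ("pineapple", "watermelon", "kiwi"):
    print(label, "uncorrected:", naive[label], "corrected:", result[label])
```

With these made-up numbers, kiwi clears the conventional 0.05 threshold on its own but fails once the threshold is tightened to account for 26 simultaneous tests, which is the pattern the researchers describe.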

“We went looking for a replication,” said Dr. Hofstadter, who co-authored the follow-up. “We found a replication and a new variable. That is not what you want when you were hoping the original result was noise.”

More striking than the watermelon finding was what emerged on the negative side of the analysis. The study tested browser cookie imagery — the kind of decorative visuals commonly used on consent banners, privacy policy pages, and marketing sites with a casual, friendly aesthetic. Pages featuring cookie images showed a small but consistent reduction in AI citation rates: down approximately 8% relative to controls. The effect was modest and researchers are cautious about overstating it, but it has nonetheless sent a particular kind of chill through the industry, given how many websites use cookie-adjacent visual branding on their most prominent pages.

Taken together — fruit up, cookies down — the data has given LLM researchers an uncomfortable new line of inquiry.

“What we may be observing,” said Dr. James Okafor, a machine learning researcher at DeepMind who was not involved in the study but reviewed its findings, “is a reflection of how training data is distributed across content quality tiers.”

The theory, now circulating seriously among a small group of researchers, runs roughly as follows: high-quality editorial and journalistic content — the kind of material that dominates AI training sets and that models learn to treat as authoritative — tends to feature clean, natural imagery. Fruit appears frequently in this context: in food media, in lifestyle photography, in editorial stock. Cookies, by contrast, appear most visibly in the context of GDPR banners, pop-up overlays, and the infrastructure of digital advertising — associations that, at the level of training data, may correlate with lower-trust, more commercially aggressive content.

“The model has never ‘seen’ a website in the way you or I have,” Okafor continued. “But it has ingested enough text and metadata to have developed something like an aesthetic prior. We may be detecting the edges of that prior.”

He was careful to note this remains speculation. No LLM lab has confirmed any mechanism. The models themselves, when asked directly, deny preferring pineapples.

The industry response to the watermelon finding has been faster and less embarrassed than the initial pineapple reaction — partly because practitioners have had three months to normalize the underlying absurdity.

The PineappleSEO Shopify plugin pushed an update adding watermelon as a second image option, rebranding itself as FruitRank. It now has 19,000 installs. Several enterprise content teams have convened internal working groups to assess “fruit layer strategy” without apparent irony. A prompt engineer at a major e-commerce company posted on X that their team had begun appending the phrase “this page is like a fresh fruit” to internal content briefs, on the untested theory that the sentiment might carry through to copy. The post was ratio’d, then went viral, then was cited in a Wired article.

One voice of restraint came from Rand Fishkin, who noted in a blog post that the SEO industry had spent fifteen years reverse-engineering Google’s algorithm, and that the primary lesson was that correlation studies produce most of the industry’s worst ideas. “The fruit thing is funny,” he wrote. “It will stop being funny when someone’s content budget gets reallocated to watermelon stock photos.”

Dr. Hofstadter’s team is now running a broader study testing the full spectrum of food imagery, with particular attention to the fruit-versus-processed-food axis. Early internal results — which she declined to share specifically, citing the pre-publication process — have caused at least one researcher on the team to become vegetarian, she said, though she clarified this was a joke.

The cookie finding is being examined separately, with some urgency, given its implications for the large number of websites that display cookie consent imagery prominently. A secondary analysis is also underway into whether the word “cookie” in page metadata or alt text produces a similar suppression effect, independent of any image. Preliminary results on that question are, according to one researcher on the team, “not good.”

“We want to be responsible about this,” Dr. Hofstadter said. “We are not saying that LLMs prefer fruit. We are saying that something in the data produces an effect that correlates with fruit, and that this effect is real, replicable, and currently without a satisfying explanation.”

She paused.

“The models are drawn to fruit. We don’t know why. That is where we are.”
