Why the Collapse of AI Models Is Falling Short of Our Expectations

I use AI all the time, but not for writing. I rely on it to find information, and for online search, tools like Perplexity often outshine Google. Lately, though, I've noticed that even AI search results seem to be slipping.

Finding accurate data these days is tricky. For instance, when I look for market-share stats or financial figures, the results often lead me to unreliable sources. Instead of pulling figures from official 10-K reports—which are mandatory documents filed by public companies—I end up with numbers from questionable websites. If I specifically ask for official 10-K results, I get what I need. But broader financial inquiries lead to muddled, distorted answers.

This issue isn't limited to one search engine. I've tested several AI bots, and they all return similarly dubious data. The phenomenon echoes the old principle of Garbage In/Garbage Out (GIGO); in the AI world, it's known as model collapse. It happens when AI systems rely too heavily on their own previous outputs, gradually losing accuracy and reliability. A recent study published in *Nature* found that models trained on their own recycled output become "poisoned" by their accumulated inaccuracies.

Model collapse stems from three main issues: error accumulation, loss of tail data, and feedback loops. With error accumulation, each new model absorbs and amplifies the mistakes of the versions whose output it was trained on. Loss of tail data means that rarer data points gradually vanish from the training distribution, giving the model a distorted view of reality. And feedback loops reinforce narrow patterns, producing repetitive or biased responses. The toy simulation below shows how quickly the first two effects compound.
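
To make error accumulation and tail loss concrete, here is a minimal sketch, assuming the simplest possible "model": a Gaussian fitted to whatever data it is given, where each generation is trained only on samples produced by the previous one. The dataset size and generation count are arbitrary choices of mine for illustration, not parameters from any real system or from the *Nature* study.

```python
# A toy, self-contained simulation (my own illustration, not code from the
# Nature study): the "model" is just a Gaussian fitted to its training data,
# and each generation is trained only on samples drawn from the previous
# generation's fit. Estimation error compounds and rare tail values tend to
# disappear, a cartoon version of error accumulation and loss of tail data.
import numpy as np

rng = np.random.default_rng(42)
N = 50                                   # deliberately small per-generation dataset
data = rng.standard_normal(N)            # generation 0: "real", human-produced data

for gen in range(1, 201):
    mu, sigma = data.mean(), data.std()  # "train" a model on the current data
    data = rng.normal(mu, sigma, N)      # next generation: purely synthetic data
    if gen % 50 == 0:
        print(f"generation {gen:3d}: fitted sigma = {sigma:.3f}, "
              f"largest |x| in data = {np.abs(data).max():.3f}")
```

With per-generation datasets this small, the fitted spread typically drifts downward over the generations and extreme values become rarer. Real language models are vastly more complex, but the underlying statistical pressure of repeatedly learning from your own samples is the same.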

Aquant, an AI company, describes the problem succinctly: "When AI trains on its own outputs, the results drift further from the truth." That drift is evident in a recent Bloomberg Research study that examined 11 leading AI models. It found that feeding them problematic prompts produced troubling results, which raises major concerns about the reliability of AI in real-world applications.

Retrieval-Augmented Generation (RAG) lets AI models pull answers from external sources, such as databases or document stores, instead of relying only on what they absorbed during pre-training. While RAG can reduce "hallucinations" (false or misleading information), it carries risks of its own, including leaking sensitive data and serving up biased advice.
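
For readers unfamiliar with the pattern, here is a minimal sketch of the RAG flow, assuming a tiny hypothetical in-memory corpus and a naive word-overlap retriever; the file names, figures, and scoring below are placeholders of my own, not any vendor's API. A production system would use a vector index and send the assembled prompt to an actual model.

```python
# Minimal RAG sketch (illustrative only): retrieve the most relevant
# external documents for a query, then build a prompt that grounds the
# model's answer in those documents rather than its pre-trained memory.
# The corpus, file names, and figures are hypothetical placeholders.
from collections import Counter

CORPUS = {
    "10-K_2023.txt": "Fiscal 2023 revenue was $4.2B, per the company's official 10-K filing.",
    "blog_post.txt": "A blog claims revenue was roughly $6B, without citing any source.",
    "press_release.txt": "The company announced a new product line in March 2023.",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (a stand-in for real vector search)."""
    q_words = Counter(query.lower().split())
    scores = {
        name: sum((Counter(text.lower().split()) & q_words).values())
        for name, text in CORPUS.items()
    }
    return [name for name, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:k]]

def build_prompt(query: str) -> str:
    """Assemble a grounded prompt that tells the model to answer only from retrieved context."""
    context = "\n".join(f"[{name}] {CORPUS[name]}" for name in retrieve(query))
    return (
        "Answer using ONLY the context below, and cite the source file.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

prompt = build_prompt("What was fiscal 2023 revenue according to the 10-K?")
print(prompt)  # in a real pipeline, this prompt would be sent to an LLM
```

The point of the exercise is simply that the model is asked to answer from retrieved sources, ideally the official 10-K rather than the unsourced blog post, instead of from whatever it memorized during training.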

Amanda Stent, Bloomberg's head of AI strategy, stresses that RAG must be used carefully: ordinary people interact with these systems every day, so responsible AI use has to be a priority if we want to prevent harmful outcomes.

Yet some would argue that "responsible AI user" is a contradiction in terms. Too often, people lean on AI to churn out fake content, whether a student's term paper or a fabricated scientific report, and that trend could hasten the decline of AI's credibility. For example, when I asked for details about a non-existent book, the system confidently provided an answer that was entirely made up.

Some researchers suggest blending synthetic data with fresh human-generated content to stave off collapse. It's a nice idea, but producing quality human content takes real effort, and many people will likely opt for quick, AI-generated output instead of investing that time.

As investment in AI grows, we may reach a tipping point where the decline in quality becomes undeniable. How soon? Some believe it's already underway. OpenAI's Sam Altman has said the company's models generate about 100 billion words every day, a flood of synthetic text that could soon bury the truth in noise.

The stakes are high. Finding reliable information is increasingly challenging in an age where AI shapes the sources we turn to. We must remain critical and discerning in our searches to navigate this rapidly changing landscape.


