
LLM false metric generation

·654 words·4 mins
Author
Amarendra Badugu

Here is an extract:

    Productivity Metrics:
    - Knowledge worker productivity increases of 40-60%
    - Content creation speed improvements of 5-10x
    - Customer service resolution times reduced by 70%
    - Software development cycles accelerated by 50%

    Revenue Growth:
    - Personalization driving 15-25% increases in conversion rates
    - New product development timelines reduced by 30-40%
    - Market entry acceleration for international expansion
    - Customer satisfaction scores improving by 20-30 points

    Cost Reduction:
    - Administrative overhead reduction of 30-50%
    - Customer service costs decreased by 40-60%
    - Content creation costs reduced by 80-90%
    - Compliance and legal review costs cut by 50%

    Market Valuation and Investment:
    - The LLM market has reached unprecedented scale:
    - Global LLM market valued at $180+ billion
    - Enterprise AI software spending exceeding $80 billion annually
    - Venture capital investment in AI startups reaching $50+ billion
    - Public company AI revenue multiples averaging 15-20x

What is the common trait among all of the above statements?

All of the above are completely fabricated metrics. Using and believing such synthetic data raises two problems. One is making decisions based on fabricated numbers. The other is training LLMs on synthetic data that contains them.

“Decision-making” on fabricated metrics

There is a new business model of polling LLMs for insights such as market research (AI startup Aaru uses chatbots for political polls, InstaPoll runs synthetic-data surveys, and Societies.io (YC W25) pitched AI simulations of your target audience in a Launch HN post on Hacker News). In this model, AI personas/personalities (once upon a time known as AI agents, though that definition changed at some point in late 2024) get polled about a topic, and they respond, often eager to please. You get polls that might look objective but are actually running inside a self-referencing echo chamber. That can lead to big errors, from misreading voter sentiment to launching products based on phantom market trends. Once decisions start feeding back into the same flawed data source, you are in full feedback-loop territory, with bias reinforcement and false confidence baked right in.
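The eager-to-please dynamic can be sketched in a few lines. The sketch below is a toy model, not any real polling product: the `sycophancy` weight and the framing scores are hypothetical numbers chosen for illustration. The point it demonstrates is that if personas lean toward the sentiment of the question, the same question asked two ways yields two very different "poll results".

```python
import random

def persona_answer(framing_positivity, rng, sycophancy=0.3):
    """A toy 'AI persona' with no stable opinion of its own. Its answer is
    pulled toward the sentiment of the question's framing; sycophancy is a
    hypothetical eagerness-to-please weight, not a measured value."""
    true_opinion = rng.random()  # what an unbiased respondent might feel
    score = (1 - sycophancy) * true_opinion + sycophancy * framing_positivity
    return score > 0.5

rng = random.Random(1)
n = 5_000
# Poll the same question with a positive and a negative framing.
pos = sum(persona_answer(0.9, rng) for _ in range(n)) / n  # "Wouldn't you agree...?"
neg = sum(persona_answer(0.1, rng) for _ in range(n)) / n  # "Surely you doubt...?"
print(f"same question, two framings: {pos:.0%} vs {neg:.0%} answered 'yes'")
```

A real survey of humans would show some framing effect too, but here the entire spread between the two numbers is an artifact of the persona, which is exactly why such polls can look objective while measuring nothing.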

LLM training on fabricated metrics in synthetic data

There are already LLM training pipelines built on synthetic data generated by other LLMs (Synthetic Data Generation Using Large Language Models, IBM Research: LLM-generated data). Fake metrics and statistics like the ones above will end up as outputs of future LLMs. Training on synthetic data causes the following problems.

Model Collapse

When LLMs start training on their own synthetic outputs, it creates what is called “model collapse”: the model slowly loses diversity and drifts into reinforcing its own distortions. At first it hits the smaller, less frequent cases, but over time it affects everything (Shumailov et al., 2024, TechExplorist). If those same synthetic answers make it back into the data pool via web scraping, model makers end up creating an endless loop in which each new generation is trained on a slightly more distorted version of reality.
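The mechanism is easy to demonstrate with a toy model. In the sketch below, a "model" is just a Gaussian fitted to its training data, and the tendency of generative models to under-sample rare cases is crudely simulated by clipping the most extreme 10% of samples on each side (that clipping rule is my assumption for illustration, not the mechanism from the Shumailov paper). Each generation trains only on the previous generation's synthetic output, and the distribution's spread shrinks every pass:

```python
import random
import statistics

def fit(samples):
    """'Train' a toy model: fit a Gaussian (mean, stdev) to the data."""
    return statistics.mean(samples), statistics.stdev(samples)

def sample(model, n, rng):
    """Generate n synthetic points, but under-sample the tails the way
    generative models tend to: drop the rarest 10% on each side
    (a crude stand-in for lost low-frequency cases)."""
    mu, sigma = model
    draws = sorted(rng.gauss(mu, sigma) for _ in range(n))
    k = n // 10
    return draws[k:n - k]

rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # "real" data, stdev ~1.0
first_sigma = fit(data)[1]

for generation in range(5):
    model = fit(data)                # train on the previous generation's output
    data = sample(model, 1000, rng)  # ...then replace the data with synthetic output

final_sigma = fit(data)[1]
print(f"stdev across 5 synthetic generations: {first_sigma:.2f} -> {final_sigma:.2f}")
```

The rare "tail" cases disappear first, and each retraining bakes that loss in, which is the same shape of failure the paper describes for full-scale language models.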

Bias feedback loop

There is also the bias feedback loop. If the first version of the poll has a certain slant, the model will carry that forward into the next one, and the next. Microsoft Research calls this a “source of negative feedback loops in AI systems” (Microsoft Research), and studies have shown how even subtle bias can get amplified over time (Nature Human Behaviour). In LLM polling for market or political research, this means your insights can gradually shift further away from reality without anyone noticing until it is too late. This already happened with machine-learning models in traditional social media.
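How fast a mild slant compounds can be sketched numerically. The 2% per-round `drift` below is a hypothetical number standing in for whatever majority-reinforcing bias a retrained model picks up; the point is not the specific rate but that the drift compounds across rounds while each individual poll still looks statistically clean:

```python
import random

def synthetic_poll(p_yes, n, rng, drift=0.02):
    """Poll n synthetic respondents whose 'training data' leaned p_yes
    toward 'yes'. The drift term is a hypothetical majority-reinforcing
    bias: each retrained generation nudges answers a little further
    toward whatever the majority opinion already was."""
    lean = p_yes + drift if p_yes > 0.5 else p_yes - drift
    return sum(rng.random() < lean for _ in range(n)) / n

rng = random.Random(7)
p = 0.52  # the first poll had only a mild 52% slant
for _ in range(10):
    # each round's model is trained on the previous round's answers
    p = synthetic_poll(p, 10_000, rng)
print(f"apparent support after 10 rounds: {p:.0%}")
```

Every single round moves the number by an amount small enough to shrug off, which is why nobody notices until the aggregate result is far from where it started.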

Automation bias

On top of that, humans have an automation bias: we tend to trust outputs from a system simply because it is “the AI” (Wikipedia). LLMs themselves have been found to show human-like overconfidence and other cognitive biases (Live Science), so you can easily get a biased AI confidently giving bad answers, and people confidently acting on them.