AI Model Collapse: Faults, Fixes, and Medical Risk

    Article type: Viewpoint

    Author: Robert S. M. Trower
    Affiliation: Trantor Standard Systems Inc., Brockville

    Conflicts of Interest

    None declared.

    Abstract

    “Model collapse” (often called Model Autophagy Disorder, MAD) is the degenerative feedback loop that arises when new AI models are trained on data generated by earlier models instead of on fresh human-created data. Over successive generations, the model’s learned data distribution shrinks, rare events vanish first, and outputs become homogenized, biased, and error-prone (Shumailov et al., 2023; IBM, 2024). In this article I (i) define model collapse in the MAD sense, (ii) summarize the core mechanisms and error sources, (iii) show why the risk is structurally worst in high-stakes medical diagnostics, and (iv) outline practical mitigations based on data provenance, human-anchored training, and human-in-the-loop oversight (Alemohammad et al., 2023; UCSD CSE, 2025; Humans-in-the-Loop, 2025).


    1. What I Mean by “Model Collapse”

    The phrase model collapse is now used loosely in the wild. Here I use it in the specific MAD / recursive-synthetic sense:

    Model collapse: a degenerative process where each new generation of a model is trained on data that increasingly consists of synthetic outputs from earlier models, causing the learned distribution to drift and shrink away from the original human data, with rare events (“tails”) disappearing first (Shumailov et al., 2023; IBM, 2024; FabledSky, 2025).

    This is more than ordinary model drift. Drift happens when real-world data changes. Collapse happens when the training data itself becomes an echo of the model, so errors, biases, and simplifications are recursively amplified (Shumailov et al., 2023; Mondo, 2025).


    2. How Model Collapse Works

    2.1 Data distribution shift and the loss of tails

    The core mechanism is simple and ugly:

    1. Generation 0 (Gen-0) is trained on a large, diverse, human-generated dataset. This dataset has:

      • A “center” (the common, average cases).

      • Tails: rare diseases, edge cases, atypical phrasing, minority demographics, etc.

    2. Gen-0 generates synthetic data. Because the model is an approximation, its outputs cluster around the center of the distribution and under-represent the tails (Shumailov et al., 2023; IBM, 2024).

    3. Generation 1 (Gen-1) is trained on this partially synthetic mix. Now:

      • The model “remembers” common cases reasonably well.

      • Tail information is already thinned out. This is often described as early model collapse (IBM, 2024).

    4. By Gen-2, Gen-3 and beyond, if training continues to replace real data with synthetic, the variance collapses further. Eventually the learned distribution no longer resembles the original; outputs become repetitive, generic, and blind to rare events (Alemohammad et al., 2023; UCSD CSE, 2025).

    This is the autophagous loop: the system eats its own output, forgets its tails, and then forgets that it ever knew them.
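
    To make the loop concrete, here is a minimal sketch in Python (my own toy illustration, not code from any of the cited papers). Each “generation” fits a Gaussian to samples generated by the previous generation’s fit; with a deliberately tiny sample per generation, the fitted spread drifts toward zero and the original tails vanish:

      # Toy autophagous loop: fit a Gaussian to the previous generation's output,
      # then sample the next "training set" from that fit. Tiny per-generation
      # samples make the collapse visible within ~100 generations.
      import math
      import numpy as np

      rng = np.random.default_rng(42)

      def mass_beyond_2(mu, sigma):
          # Probability the fitted N(mu, sigma) assigns beyond the ORIGINAL
          # distribution's +/-2 points, i.e. how much "tail" the model retains.
          upper = 0.5 * math.erfc((2.0 - mu) / (sigma * math.sqrt(2)))
          lower = 0.5 * math.erfc((mu + 2.0) / (sigma * math.sqrt(2)))
          return upper + lower

      N_PER_GEN = 20                                    # deliberately tiny
      data = rng.normal(0.0, 1.0, size=N_PER_GEN)       # Gen-0: "human" data

      for gen in range(101):
          mu, sigma = data.mean(), data.std()           # functional approximation error
          if gen % 20 == 0:
              print(f"gen {gen:3d}: sigma = {sigma:.3f}, "
                    f"tail mass = {mass_beyond_2(mu, sigma):.4f}")
          data = rng.normal(mu, sigma, size=N_PER_GEN)  # sampling error: finite draw

    The fixed seed makes the run reproducible; the point is that the fitted sigma and the retained tail mass both collapse toward zero even though nothing “went wrong” at any single step.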

    2.2 Error accumulation across generations

    Following Shumailov and related analyses, we can think of three compounding error sources (Shumailov et al., 2023; IBM, 2024):

    1. Sampling errors
      Synthetic data is a finite sample of the previous model’s output. Rare events are under-sampled or absent by default. Each generation resamples from a distribution where the tails are already thinner.

    2. Functional approximation errors
      The model is a finite function approximator. Even at Gen-0 it cannot perfectly represent the true data distribution. Its outputs encode that approximation error. When those outputs become training data, you are training an approximation of an approximation.

    3. Learning / optimization errors
      Optimizers, hyper-parameters, label noise, and weak objectives introduce additional errors at each generation. These “learning errors” further distort the synthetic data that will drive the next generation.

    Individually, each of these errors may be small. Recursively applied, they produce the systematic loss of variance and the erasure of rare patterns that characterize model collapse.
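
    The sampling-error component alone is enough to erase rare events. A small sketch (illustrative only; it assumes the model reproduces its training frequencies perfectly, so finite resampling is the only error source) shows that a category starting at 1% eventually hits zero and, once gone, never returns:

      # Sampling error in isolation: the "model" here copies its training
      # frequencies exactly, so finite resampling is the only error source.
      # A 1% category still goes extinct, and extinction is irreversible.
      import numpy as np

      rng = np.random.default_rng(7)

      def generations_until_rare_vanishes(n_samples, p_rare=0.01, max_gen=10_000):
          p = np.array([1.0 - p_rare, p_rare])        # Gen-0 "human" distribution
          for gen in range(1, max_gen + 1):
              counts = rng.multinomial(n_samples, p)  # finite synthetic sample
              p = counts / n_samples                  # next generation's distribution
              if counts[1] == 0:                      # rare category gone for good
                  return gen
          return None                                 # still present after max_gen

      for n in (100, 1_000, 10_000):
          runs = [generations_until_rare_vanishes(n) for _ in range(20)]
          gone = [g for g in runs if g is not None]
          print(f"dataset size {n:>6}: rare class vanished in {len(gone)}/20 runs, "
                f"after ~{np.mean(gone):.0f} generations on average")

    Bigger datasets delay the extinction; they do not prevent it.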


    3. How Prevalent and How Serious?

    3.1 Synthetic contamination and test leakage

    Even before the rise of synthetic slop, we already had a data contamination problem:

    • Large-scale analyses of LLM pretraining corpora have found 1–50% benchmark contamination, where evaluation test items leak into training sets (ACL Anthology, 2024; AAAI, 2023).

    • More recently, measurements of the open web suggest that a large fraction of new pages now contain some AI-generated text, and a non-trivial fraction of Wikipedia and major platforms is AI-authored (Winssolutions, 2025).

    On top of this, generative models have flooded the web with low-quality, AI-generated content (“AI slop”). Any foundation model trained on broadly scraped web data in 2025 will ingest this mixture by default (Rice News, 2024; IBM, 2024).
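
    As a side note on how contamination gets estimated: one simple approach, sketched below (my own illustration, not the exact methodology of the cited reports), is to count test items that share a long verbatim n-gram with the training corpus:

      # Toy contamination check: flag a test item as "contaminated" if any of
      # its word n-grams appears verbatim in the training corpus. (A 13-token
      # window is a choice used in some published contamination analyses; the
      # real reports use more sophisticated pipelines.)
      from typing import Iterable, Set

      def ngrams(text: str, n: int = 13) -> Set[str]:
          toks = text.lower().split()
          return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

      def contamination_rate(test_items: Iterable[str],
                             train_docs: Iterable[str], n: int = 13) -> float:
          train_grams: Set[str] = set()
          for doc in train_docs:                      # index the training corpus
              train_grams |= ngrams(doc, n)
          items = list(test_items)
          hits = sum(1 for item in items if ngrams(item, n) & train_grams)
          return hits / max(len(items), 1)            # fraction of leaked items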

    3.2 Risk level: serious, but not mystical

    There is broad agreement on one core point:
    if you repeatedly replace real data with unvetted synthetic outputs, without anchoring on human data, collapse is a real and serious risk (Alemohammad et al., 2023; UCSD CSE, 2025; Humans-in-the-Loop, 2025).

    However:

    • Research also shows that blending and accumulating synthetic data alongside a fixed slice of human data can keep models stable across multiple generations (UCSD CSE, 2025; Winssolutions, 2025).

    • Some experts argue the more apocalyptic scenarios are based on unrealistic training setups (Winssolutions, 2025; AICOMPetence, 2025).

    So collapse is not inevitable, but it is the natural outcome of naïve “train on whatever we scraped” pipelines in a world where an ever larger share of new text is authored by models.

    In high-stakes domains like medicine, that’s not just a technical curiosity; it’s a safety problem.


    4. Case Study: Model Collapse in Medical Diagnostics

    Medical AI is an almost perfect stress-test for model collapse:

    • The domain has heavy tails: rare diseases, atypical presentations, and subtle symptom combinations.

    • Training data is already biased and incomplete (e.g., under-representation of darker skin tones or minority populations).

    • The cost of error is direct harm to patients (IBM, 2024; NYU CDS, 2025).

    4.1 Forgetting rare diseases: the “tails disaster”

    Consider an imaging + EHR model used for diagnosis.

    1. Initial training
      The Gen-0 model is trained on a large, hospital-sourced dataset:

      • Tens of thousands of common cases (typical pneumonia, fractures, common tumors).

      • A small but crucial set of rare conditions (unusual tumors, atypical strokes, rare genetic syndromes). These live in the tails of the data distribution (Alemohammad et al., 2023; UCSD CSE, 2025).

    2. Synthetic data generation
      A few years later, Gen-1 is trained mostly on:

      • Publicly available de-identified data.

      • Synthetic cases generated by Gen-0, used to “augment” the data and sidestep privacy and licensing issues.

      But Gen-0 is best at reproducing common, high-density patterns. Its synthetic cases skew toward the average. Rare patterns are either noisy, distorted, or absent.

    3. The collapse
      Gen-1’s training distribution is now missing many tail examples. Its internal representation of the disease space shrinks (Shumailov et al., 2023; IBM, 2024).

      • On common cases, performance may even tick up (the model is over-optimized for the center).

      • On rare diseases, recall drops sharply. The model’s “memory” of those patterns has been pushed out of its representational budget.

    4. Clinical consequences
      The net effect is a false sense of improvement:

      • Headline accuracy looks fine or slightly better.

      • But in the tails—where life-threatening conditions hide—false negatives rise. These are exactly the cases where humans needed the model’s help the most (NYU CDS, 2025).

    This is model collapse as a silent patient-safety failure: the long tail is where the bodies are buried.
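
    One practical consequence: headline accuracy and tail recall have to be reported separately, or the failure stays invisible. A minimal sketch (the label names are hypothetical placeholders, not from any real dataset):

      # Report headline accuracy and rare-diagnosis recall side by side.
      from typing import Dict, Sequence

      RARE_LABELS = {"atypical_stroke", "rare_sarcoma", "genetic_syndrome_x"}

      def headline_and_tail_metrics(y_true: Sequence[str],
                                    y_pred: Sequence[str]) -> Dict[str, float]:
          total = len(y_true)
          correct = sum(t == p for t, p in zip(y_true, y_pred))
          rare_idx = [i for i, t in enumerate(y_true) if t in RARE_LABELS]
          rare_hits = sum(1 for i in rare_idx if y_true[i] == y_pred[i])
          return {
              "headline_accuracy": correct / total if total else 0.0,
              "rare_case_count": float(len(rare_idx)),
              "rare_recall": rare_hits / len(rare_idx) if rare_idx else float("nan"),
          }

    A model whose headline accuracy ticks up between generations while its rare-case recall falls is exhibiting exactly the failure described above.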

    4.2 Amplifying existing demographic bias

    Most medical datasets already encode demographic skew:

    • Skin cancer datasets overweight lighter skin tones.

    • Cardiovascular datasets overweight older white males.

    • Certain rare conditions in women and minorities are underrepresented.

    Under collapse, these biases are amplified:

    1. Initial bias
      Gen-0 underperforms on darker skin tones because the underlying dataset is skewed.

    2. Synthetic amplification
      When Gen-0 generates synthetic dermatology images, it overproduces the patterns it knows best—primarily lighter skin tones. These synthetic images feed Gen-1.

    3. Structural erosion of minority performance
      Gen-1 effectively doubles down on the bias:

      • Performance on majority groups may hold steady or improve.

      • Performance on under-represented groups erodes further (Humans-in-the-Loop, 2025).

    This is a textbook example of functional approximation error interacting with historical bias: the model’s approximation is most faithful in dense regions of the data, and the recursive pipeline keeps training it to be even better in those regions and even worse in the sparse ones.

    4.3 Clinical mitigations: HITL, anchors, and trials

    In a clinical setting, mitigating collapse is not optional; it’s part of basic safe-systems design. Three strategies emerge as non-negotiable:

    1. Human-in-the-Loop (HITL) annotation

      • Radiologists, pathologists, and clinicians explicitly label:

        • Cases where the model is uncertain.

        • All rare or atypical diagnoses.

      • These human-validated labels form a high-quality correction stream that feeds back into training (Humans-in-the-Loop, 2025).

    2. Fixed human anchor sets

      • The institution maintains a secure, immutable repository of verified, diverse, human-authored patient data that is never polluted by synthetic content (UCSD CSE, 2025).

      • Every new generation is trained with a fixed fraction (e.g., 20–30%) of this anchor data, plus additional accumulated human data, ensuring the tails never completely disappear.

    3. Verification and clinical trials

      • New model generations undergo external validation and, for high-risk uses, formal trials similar to drugs or devices.

      • These evaluations use fresh, real-world patient populations and specifically stress-test tail performance and demographic fairness (NYU CDS, 2025).

    In medicine, the cost of collapse is life and safety, not just degraded UX. That pushes provenance and human oversight from “engineering best practice” into an ethical and regulatory requirement.
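
    As a sketch of how point 1 above can be wired up (all names, thresholds, and labels here are hypothetical, not a prescribed implementation): predictions that are low-confidence or involve a rare label get routed to clinician review, and the verified labels go into a human-only correction stream.

      # Route low-confidence or rare-label predictions to clinician review and
      # keep the verified labels in a human-only correction stream.
      from dataclasses import dataclass, field
      from typing import List, Tuple

      CONFIDENCE_FLOOR = 0.85                             # hypothetical threshold
      RARE_LABELS = {"atypical_stroke", "rare_sarcoma"}   # hypothetical labels

      @dataclass
      class Prediction:
          case_id: str
          label: str
          confidence: float

      @dataclass
      class CorrectionStream:
          verified: List[Tuple[str, str]] = field(default_factory=list)

          def add(self, case_id: str, human_label: str) -> None:
              # Only clinician-verified labels land here; synthetic data never does.
              self.verified.append((case_id, human_label))

      def needs_human_review(pred: Prediction) -> bool:
          return pred.confidence < CONFIDENCE_FLOOR or pred.label in RARE_LABELS

    Everything flagged by needs_human_review goes to a radiologist or pathologist queue; the returned label is written to the correction stream and becomes part of the human-anchored training data.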


    5. Mitigations and Their Likely Effectiveness

    At a high level, the mitigations all revolve around one principle:

    Don’t let the models lose contact with real, diverse human data—and especially not with the tails.

    Here is a compact view of the main strategies, informed by the current literature:

    5.1 Data provenance and watermarking

    What it is.
    Use watermarking and provenance tagging to flag AI-generated content at creation time, track data lineage, and filter synthetic data at crawl / ingest time (Alemohammad et al., 2023; IBM, 2024; Humans-in-the-Loop, 2025).

    Why it helps.
    If you can reliably identify synthetic content, you can:

    • Exclude it from critical training sets.

    • Down-weight it where inclusion is acceptable.

    • Preserve clean, human-only corpora as strategic assets.

    Realism.
    Watermarking is an active research area; no scheme is fool-proof, and ecosystem-wide adoption is hard. But as part of a layered provenance stack, it is one of the most powerful levers we have.
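
    A minimal sketch of what provenance-aware ingestion can look like (the field names and tag values are my own assumptions, not a standard):

      # Weight records at ingest time based on a provenance tag. A weight of
      # 0.0 means "exclude"; a small positive weight down-weights but keeps it.
      from typing import Dict, Iterable, List, Tuple

      SYNTHETIC_TAGS = {"ai_generated", "model_output", "unknown_provenance"}

      def weight_by_provenance(records: Iterable[Dict],
                               synthetic_weight: float = 0.0) -> List[Tuple[Dict, float]]:
          weighted = []
          for rec in records:
              tag = rec.get("provenance", "unknown_provenance")
              w = synthetic_weight if tag in SYNTHETIC_TAGS else 1.0
              weighted.append((rec, w))
          return weighted

    Note the deliberate choice to treat missing provenance as synthetic: in a polluted ecosystem, “unknown” is not a safe default.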

    5.2 Fixed human anchor sets

    What it is.
    Maintain a non-replaceable core of human data in every training generation—e.g., 20–30% of the total data—and accumulate, rather than replace, previous generations’ real data (UCSD CSE, 2025; Winssolutions, 2025).

    Why it helps.

    • Theory and experiments suggest that when you never discard your original human data, and you blend synthetic with real instead of replacing, collapse does not emerge—even with high synthetic ratios.

    • The anchor set acts as a permanent record of the tails. Models may compress, but they are continually forced to re-engage with rare events.
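
    A sketch of the accumulate-and-anchor recipe (the 20–30% anchor share is the illustrative figure from above, not a validated constant; all arguments are plain lists of training records):

      # Build each generation's training set: a fixed share from an immutable
      # human anchor set, all accumulated real data, and only then synthetic.
      import random

      def build_training_mix(anchor_records, accumulated_real, synthetic_pool,
                             anchor_fraction=0.25, total_size=100_000, seed=0):
          rng = random.Random(seed)
          n_anchor = int(anchor_fraction * total_size)
          mix = rng.sample(anchor_records, min(n_anchor, len(anchor_records)))
          mix += list(accumulated_real)               # accumulate, never replace
          n_synth = max(total_size - len(mix), 0)     # synthetic only tops up
          mix += rng.sample(synthetic_pool, min(n_synth, len(synthetic_pool)))
          rng.shuffle(mix)
          return mix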

    5.3 Hybrid training and explicit tail up-weighting

    What it is.

    • Treat the real-to-synthetic ratio as a tunable parameter, not an accident.

    • Deliberately oversample tails and minority groups in training and evaluation (Winssolutions, 2025; UCLA Livescu Initiative, 2024).

    Why it helps.

    • You convert “protecting the tails” into an explicit training objective, rather than hoping the optimizer will keep them.

    • In domains like medicine and safety-critical systems, you can define tail-sensitive metrics and hold the system accountable to them.
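
    A compact sketch of one way to do the up-weighting (inverse-frequency sample weights; illustrative, not the only option):

      # Inverse-frequency sample weights: rare classes get proportionally
      # larger weights, so a weighted sampler sees them far more often.
      from collections import Counter
      from typing import List, Sequence

      def inverse_frequency_weights(labels: Sequence[str]) -> List[float]:
          counts = Counter(labels)
          n = len(labels)
          return [n / counts[lab] for lab in labels]

    The same class frequencies can also define the tail-sensitive metrics mentioned above, so training and evaluation pull in the same direction.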

    5.4 Algorithmic innovation

    What it is.

    • Develop architectures and training procedures that are more robust to approximation error and mode collapse (arXiv, 2024; UCSD CSE, 2025).

    Why it helps.

    • Better inductive biases and regularization can raise the bar: for any given amount of synthetic pollution, models might degrade more slowly.

    Limits.

    • Architecture alone does not fix the underlying problem of distribution drift when training on synthetic data.

    • Without data-level controls, you are still driving with your eyes slowly closing.


    6. Outlook

    Three points seem robust in 2025:

    1. Recursive unanchored training is dangerous.
      If you repeatedly train on synthetic data without preserving and up-weighting human tails, collapse is not a speculative edge case; it is a natural outcome.

    2. The winners will treat human data—and its provenance—as a core asset.
      Clean, rights-respecting pre-AI corpora, well-maintained anchor sets, and robust provenance infrastructure are becoming strategic resources, not afterthoughts (Alemohammad et al., 2023; Winssolutions, 2025).

    3. In medicine, collapse is a safety and ethics problem, not just a performance bug.
      When the tails are rare diseases and under-represented demographics, model collapse is a mechanism for silent harm. Regulatory regimes will, and should, treat data provenance, human anchors, and HITL processes as mandatory.

    Model collapse is not a mystical apocalypse. It is an expected failure mode of naïve pipelines in a synthetic-flooded world. The good news is that we already know the basic counter-measures. The question is whether we take them seriously soon enough.


    References

    Alemohammad, S., Baraniuk, R., Hobbhahn, M., Kulkarni, A., & Martínez, F. (2023). Breaking MAD: Generative AI could break the internet [Article, Rice News].
    Plain text URL: https://news.rice.edu/news/2024/breaking-mad-generative-ai-could-break-internet

    ACL Anthology. (2024). An open-source data contamination report for large language models.
    Plain text URL: https://aclanthology.org/2024.findings-emnlp.30.pdf

    AAAI Publications. (2023). LatestEval: Addressing data contamination in language model evaluation through dynamic and time-sensitive test construction.
    Plain text URL: https://ojs.aaai.org/index.php/AAAI/article/view/29822/31427

    AICOMPetence. (2025). Model collapse: AI training on AI is breaking models.
    Plain text URL: https://aicompetence.org/model-collapse-ai-training-on-ai-breaking-models/

    FabledSky. (2025). Model collapse.
    Plain text URL: https://fabledsky.com/knowledge-base/model-collapse/

    Humans-in-the-Loop. (2025). Preventing model collapse in 2025 with human-in-the-loop annotation.
    Plain text URL: https://humansintheloop.org/what-is-model-collapse-and-why-its-a-2025-concern/

    IBM. (2024). What is model collapse? [IBM Think].
    Plain text URL: https://www.ibm.com/think/topics/model-collapse

    Mondo. (2025). AI model collapse: What it is, why it matters, and how to prevent it.
    Plain text URL: https://mondo.com/insights/ai-model-collapse-what-it-is-why-it-matters-and-how-to-prevent-it/

    NYU Center for Data Science. (2025). Tiny changes to training data can make medical AI models unsafe [Medium].
    Plain text URL: https://nyudatascience.medium.com/tiny-changes-to-training-data-can-make-medical-ai-models-unsafe-52fa6f154525

    UCSD CSE. (2025). Preventing model collapse in the synthetic-data era [Lecture slides].
    Plain text URL: https://cseweb.ucsd.edu/~yuxiangw/classes/AIsafety-2025Fall/Lectures/preventing_model_collapse_suraj.pdf

    UCLA Livescu Initiative on Neuro, Narrative, and AI. (2024). Model Autophagy Disorder.
    Plain text URL: https://livescu.ucla.edu/model-autophagy-disorder/

    Winssolutions. (2025). The AI model collapse risk is not solved in 2025.
    Plain text URL: https://www.winssolutions.org/ai-model-collapse-2025-recursive-training/
