Training AI on “synthetic” data generated by other AIs sounds genius! Seems like a bulletproof way to make AI infinitely smarter just by recursively feeding it to itself! Great success is on the horizon!
It’s been proven that even small amounts of synthetic data injected into a training set quickly lead to a phenomenon termed “model collapse”, though I prefer the term “Hapsburg AI” (not mine).
Basically, this is the kind of thing you announce you’re doing because it will hopefully get you one more round of investment funding while Sam Altman finishes working out how to fake his death.
That’s not how synthetic data generation generally works. It uses AI to process existing data sources that aren’t very useful directly and turn them into well-formed training data. The AI isn’t generating the data entirely from its own imagination.
The comments assuming otherwise are ironic, because they’re repeating misinformation that people keep telling each other.
I like to call it “saving the red jpeg”. One more save will make it better, surely.
Like photocopying a copy over and over again.
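For anyone who wants to see the photocopy effect in numbers, here is a toy sketch. Everything in it is my own assumption (the function name `recursive_fit`, the Gaussian "model", the sample size), and it is not how any real training pipeline works. Each generation fits a mean and spread to a handful of samples drawn only from the previous generation's fit, with no fresh real data mixed in.

```python
import random
import statistics


def recursive_fit(generations=200, sample_size=10, seed=0):
    """Toy 'photocopy of a photocopy': refit a Gaussian to its own samples.

    Hypothetical illustration only; the "model" is just a mean and a spread.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for the "real" data
    history = [sigma]
    for _ in range(generations):
        # Draw a small dataset from the current model only (no real data added).
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # "Train" the next model: re-estimate mean and spread from those samples.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    h = recursive_fit()
    print(f"spread at generation 0: {h[0]:.3g}")
    print(f"spread at generation {len(h) - 1}: {h[-1]:.3g}")
```

In this setup the spread tends to shrink generation after generation, because each refit sees only a small sample of the previous copy. The toy leaves out what the reply above describes: pipelines that start from existing data rather than only the model's own output, so it says nothing about those.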