Misconceptions about AI
- Matthew
- Oct 21, 2025
- 3 min read
As mentioned in the title, there are a lot of misconceptions about AI and how it's used to generate images and video. An AI system isn't some massive hard drive full of files that the UI searches through like a search engine, then cuts and pastes pieces from to assemble the final output. Here's a fairly simplified explanation of how AI is trained. The training data is fed into the base program, which then analyzes it for patterns and common features.
Let's use an apple as an example. The AI is given 10,000 images of an apple. Some are photos, some are drawings, some are paintings, some are 3D models, and so on. It looks at all of those images, searching for patterns and commonalities associated with that particular keyword, "APPLE": shape, color, relative size, and so on. That process repeats millions of times. Once it's been trained on that data, its parameters (aka its memories) for what the word "apple" means are set. It has learned what an apple looks like. Once training is complete and those parameters are set, the training media (all of the source images) are discarded in order to save space. It already knows what an apple is; that knowledge is stored in its parameters. So if someone says "Make me a photograph of an apple", it will go to its parameters (essentially its memory of what an apple looks like) and create an entirely new image of an apple that fits the patterns it was trained on.
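To make that concrete, here's a deliberately tiny sketch in Python. It's nothing like a real image model (the "apples" here are just four made-up numbers), but it shows the key point: after training, only the learned parameters remain, and every generation is a fresh sample from those parameters rather than a copy of any source image.

```python
# Toy sketch (not how real image models work): "train" on fake apple
# feature vectors, keep only the learned parameters, discard the data,
# then generate new, unique samples from those parameters alone.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are 10,000 images of apples, each boiled down to 4 numbers
# (say: redness, roundness, size, stem length).
training_data = rng.normal(loc=[0.8, 0.9, 0.5, 0.2],
                           scale=[0.1, 0.05, 0.1, 0.05],
                           size=(10_000, 4))

# "Training": learn the patterns (here, just the average and the spread).
parameters = {
    "mean": training_data.mean(axis=0),
    "std": training_data.std(axis=0),
}

# The source data is no longer needed -- only the parameters remain.
del training_data

# "Generation": every call produces a new, unique apple-like result drawn
# from what was learned, not copied from any source image.
def generate_apple():
    return rng.normal(parameters["mean"], parameters["std"])

print(generate_apple())
print(generate_apple())  # different from the first, every time
```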

It's essentially no different from showing a person 10,000 images of apples and then eventually asking them to draw an apple. It works the same way; the AI system can just do it a lot faster than a human can. And if you ask it to do it again, it will create a different apple. Each time, the apple will be different unless the prompt becomes so complex and detailed that it narrows the look of the apple down to very specific criteria, and even then it will still have some unique elements.
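You can see this for yourself if you have the Hugging Face diffusers library, a GPU, and a Stable Diffusion checkpoint handy (the model name and settings below are just one common setup, not the only way to do it). The only thing that changes between the two runs is the random seed, and you'll get two different apples.

```python
# Same prompt, different random seeds -> two different, unique apples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a photograph of an apple"

img_a = pipe(prompt, generator=torch.Generator("cuda").manual_seed(1)).images[0]
img_b = pipe(prompt, generator=torch.Generator("cuda").manual_seed(2)).images[0]

img_a.save("apple_1.png")
img_b.save("apple_2.png")  # a different apple, from the same prompt
```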

That's actually an issue developers are still working on for things like video. AI creates video by generating individual frames, and without the proper training for consistency, each frame will come out different. The first generated videos had that problem. Just look at the history of "Will Smith eating spaghetti" (for some reason it's become the odd benchmark for each successive generation of AI video models).
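Here's a loose toy illustration of why that's hard (again, made-up numbers, not a real video model): frames sampled independently jump around from one to the next, while frames conditioned on the previous frame barely change.

```python
# Toy illustration: independently generated "frames" flicker, while frames
# conditioned on the previous frame stay consistent.
import numpy as np

rng = np.random.default_rng(42)
apple_params = np.array([0.8, 0.9, 0.5, 0.2])  # "what an apple looks like"

# Independent frames: each one is a fresh, unrelated sample.
independent = [rng.normal(apple_params, 0.1) for _ in range(5)]

# Conditioned frames: each one starts from the previous frame and only
# changes a little, so the apple stays recognizably the same apple.
conditioned = [rng.normal(apple_params, 0.1)]
for _ in range(4):
    conditioned.append(conditioned[-1] + rng.normal(0, 0.01, size=4))

def frame_to_frame_change(frames):
    return np.mean([np.abs(b - a).mean() for a, b in zip(frames, frames[1:])])

print("independent frames change by ~", frame_to_frame_change(independent))
print("conditioned frames change by ~", frame_to_frame_change(conditioned))
```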
So, contrary to popular belief, the AI is creating something entirely unique. The output is based on the detail and subject of the prompt, along with the data the model was trained on. If you don't train it on what Iron Man looks like and a prompt asks it to make an image of Iron Man, you'll get gibberish or the closest thing it can come up with (like an image of a man who looks like he's made out of iron).


AI is improving at an astronomical rate because of constant tweaks to its base code and ever-larger training data sets (we're talking petabytes of data; a petabyte is a million gigabytes). But that source data doesn't stay with the AI once it's deployed, so it literally can't copy and paste from its source data. It can only use the complex mathematical parameters (aka its memory) of what a keyword means to create something new.
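To put rough numbers on that (these figures are illustrative assumptions, not any specific model's specs), compare the size of a training set to the size of the finished model's weights:

```python
# Back-of-the-envelope arithmetic with assumed, illustrative numbers:
# even a huge model's weights are far too small to contain its training data.
training_data_gb = 2 * 1_000_000      # assume ~2 petabytes of training images
parameters = 10_000_000_000           # assume a 10-billion-parameter model
bytes_per_parameter = 2               # 16-bit weights
model_size_gb = parameters * bytes_per_parameter / 1e9

print(f"training data: ~{training_data_gb:,} GB")
print(f"model weights: ~{model_size_gb:,.0f} GB")
print(f"the weights are ~{training_data_gb / model_size_gb:,.0f}x smaller than the data")
```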
Again, it's a bit simplified, but that's basically how generative AI actually works.
