It’s an interesting idea, and kind of amazing that it actually works (and works well). Generative Adversarial Networks (GANs) are, as the name implies, a form of machine learning that generates new content, such as images, from nothing more than lots of labeled examples of that kind of data.
With GANs, two models evolve in competition with each other. The discriminator network is trained on real images (from a source dataset) and fake images (from a generator network), and is scored on how well it can distinguish real from fake. The generator network, meanwhile, starts out knowing nothing at all about what it should be producing. It learns when its output is fed into the discriminator, which scores how likely that output is to be real or fake (or, for a conditional model, how likely it is to be each of a family of possible objects the discriminator has been trained on).
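To make the two-player dynamic concrete, here is a minimal sketch of one adversarial training step in PyTorch. The layer sizes, optimizer settings, and random stand-in data are illustrative assumptions, not the GPT-4-generated code the post describes, and the digit-label conditioning is omitted for brevity.

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28  # illustrative sizes for MNIST-like images

# Generator: noise vector in, image out.
generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, img_dim), nn.Tanh(),
)
# Discriminator: image in, probability-of-real out.
discriminator = nn.Sequential(
    nn.Linear(img_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=0.002)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=0.002)
bce = nn.BCELoss()

def train_step(real_images):
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Discriminator update: score real images as real, generated ones as fake.
    noise = torch.randn(batch, latent_dim)
    fake_images = generator(noise).detach()  # don't backprop into the generator here
    d_loss = (bce(discriminator(real_images), real_labels)
              + bce(discriminator(fake_images), fake_labels))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: try to make the discriminator call its output real.
    noise = torch.randn(batch, latent_dim)
    g_loss = bce(discriminator(generator(noise)), real_labels)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Stand-in "real" data in [-1, 1], so the sketch runs without a dataset download.
d_loss, g_loss = train_step(torch.rand(16, img_dim) * 2 - 1)
```

In a real run, `train_step` would loop over MNIST batches; the key point is the alternation, where each network's loss is defined by how well it fools, or catches, the other.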
With the PyTorch training loop set to record the generator’s output after each epoch (including the static-like output of the initial, completely untrained generator), the model can be watched learning the images, epoch by epoch. Each frame after the first represents the output of the updated generator network when prompted to produce the digits 0 through 9.
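The snapshot trick can be sketched like this: feed the same fixed noise batch through the generator after every epoch, so each frame answers the same "prompt" and the differences show only what the network learned. The stub generator and epoch count here are placeholders for the real trained model and loop.

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28

# Stand-in generator; in the real experiment this is the trained network.
generator = nn.Sequential(nn.Linear(latent_dim, img_dim), nn.Tanh())

# Reusing the SAME noise every epoch makes the frames directly comparable.
fixed_noise = torch.randn(10, latent_dim)

snapshots = []
for epoch in range(3):  # stand-in for the real training loop
    # ... one epoch of GAN training would go here ...
    with torch.no_grad():  # snapshots shouldn't affect gradients
        frames = generator(fixed_noise).view(10, 28, 28)
    snapshots.append(frames)

# snapshots[0] is the untrained "static"; later entries show the learning.
```

Stacking the saved frames into a movie, one image per epoch, gives exactly the epoch-by-epoch animation described above.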
To show the learning process clearly, the learning rate has been decreased by a factor of 20 (to 0.0001) from the original, quite effective, setting of 0.002.
Each epoch (each frame of the movie) takes about 15 seconds to train on a single GeForce RTX 2060 GPU (the host PC is a Core i9-9900 with 128 GB of RAM, running Windows 10). If you view the individual movie frames at one image per second, you’re essentially watching them at the same rate that the network could train, if the learning rate were not artificially slowed.
What does this mean, in practice? Machines can now learn from data rather than being told specifically what is what — sometimes, even if that data isn’t labeled. They can learn categories on their own, too.
And oh, by the way, GPT-4 wrote 95% of this code, from just a few prompts. It took a little back-and-forth to get the images right, but my role was essentially just that of test engineer: copying code that GPT-4 provided, running it, and reporting back with the results. That’s easy to automate! No doubt, we will soon see coding AIs that do just that. (Some of them can already execute Python code.)
So my computer has taught itself (marginally) better handwriting than mine.
That was bound to happen at some time, but 2024 looks like it will be a fun year.
I love living in the future.