What we know intuitively, and what the software shows, is that the pixelated image can be expanded into a cone of multiple probable outcomes. The same pixelated face can yield millions of various, uncanny permutations. These mathematical permutations of our human flesh exit in an area which is called “latent space”. The way to pick one out of the many is called “gradient descent”.
Imagine you are standing in an open field, and see many beautiful hills nearby. Or alternately, imagine you are standing on a hill, looking across the rolling valleys. You decide to pick one of these valleys, based on how popular or how close it is. This is gradient descent, and the valley is the generated face. Which way would you go?