Deep Dream, a captivating tool developed by Google, offers a unique way to transform images into surreal, dream-like creations. The outcomes can range from amusing to eerie, psychedelic to peculiar, often featuring an abundance of eyes and dogs. This photo of Budapest bridges past and future, evoking the history of cafes while envisioning a verdant city powered by wind turbines.
Let's delve briefly into how Deep Dream operates. It falls within the realm of deep learning, an increasingly popular field known for its remarkable ability to learn, particularly in image, video, audio, and natural language processing domains. This capability enables machines to identify objects and individuals in photos. For example, Clarifai's smart object recognizer can analyze uploaded photos and provide descriptive labels. Similarly, Hound's speech recognition demo showcases the potential of deep learning. At the heart of Deep Dream lies a deep neural network (DNN), inspired by the intricate structure of the human brain. With billions of neurons organized into layers, a human brain processes information and learns. The connections between neurons play a crucial role in how signals propagate through the network.
In an artificial neural network like a DNN, neurons are also arranged in layers, interconnected as depicted in the image above. Neurons in the input layer process the initial input, such as the image fed into Deep Dream. They either activate or remain dormant based on the visual input, and this information is transmitted to subsequent layers. Ultimately, neurons in the output layer make a determination about the input, such as whether a car is present in the image. DNNs offer a significant advantage over conventional image classification methods: they can autonomously learn which features to focus on when identifying an object in an image.
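The layered forward pass described above can be sketched in a few lines of Python. This is a toy illustration, not Deep Dream's actual network: the layer sizes and random weights are made up for the example, and a real DNN has many more layers, learned weights, and convolutions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # A neuron either activates (positive output) or stays dormant (zero).
    return np.maximum(x, 0.0)

# Toy network: 4 input "pixels" -> 5 hidden neurons -> 2 output neurons.
W1 = rng.normal(size=(5, 4))  # connections: input layer -> hidden layer
W2 = rng.normal(size=(2, 5))  # connections: hidden layer -> output layer

def forward(pixels):
    hidden = relu(W1 @ pixels)  # signals propagate to the hidden layer
    return relu(W2 @ hidden)    # output layer makes the final determination

scores = forward(rng.normal(size=4))  # e.g. scores[0] could mean "car present"
print(scores.shape)
```

Training adjusts the weight matrices so the output scores match known labels; that is the part Deep Dream inherits from a pre-trained model rather than doing itself.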
Deep Dream employs a deep neural network that has been pre-trained on a vast dataset of labeled images, so it can identify objects in a picture. However, Google's engineers added a unique twist. After recognizing certain concepts and objects, Deep Dream also alters the image to enhance its resemblance to the identified objects. This modified image is then fed back into the network, initiating the recognition-transformation cycle once again.
So, if Deep Dream interprets something in a photo as resembling a bird, it tweaks small details to make it more bird-like. Deep Dream employs a flexible interpretation of "resemblance." Therefore, even if your nose is petite and elegant, Deep Dream might draw an unexpected parallel, likening your face to a bird's and subtly adjusting your nose to resemble a beak. As the process repeats, your bird-like resemblance intensifies, potentially leading to a gradual transformation into a bird over multiple iterations.
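The recognition-transformation cycle described in the last two paragraphs amounts to gradient ascent on the input image: measure how strongly a layer responds, then nudge the pixels in the direction that makes that response stronger, and repeat. Here is a minimal sketch under stated assumptions: a single random linear-plus-ReLU layer stands in for a trained DNN, and a flat vector stands in for the photo. The real Deep Dream code instead backpropagates through a full Caffe model, with tricks like processing the image at multiple scales.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(size=(8, 16))   # stand-in for one trained layer of the DNN
image = rng.normal(size=16)    # stand-in for the photo's pixels

def score(img):
    # How strongly the layer "recognizes" its features in the image.
    return np.maximum(W @ img, 0.0).sum()

def dream_step(img, step_size=0.05):
    acts = np.maximum(W @ img, 0.0)
    # Gradient of score(img) with respect to the pixels: moving along it
    # makes whatever the layer saw look even more like what it saw.
    grad = W.T @ (acts > 0).astype(float)
    return img + step_size * grad

before = score(image)
for _ in range(10):            # feed the transformed image back in
    image = dream_step(image)
print(before, score(image))    # the resemblance intensifies with each pass
```

Because each step climbs the layer's own response, features the network was merely guessing at (a beak-like nose, say) get reinforced on every pass, which is exactly the runaway bird effect described above.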
This explains why typical Deep Dream images often feature eyes, dogs, and peculiar creatures. Remember that the DNN has been trained on a specific dataset, ImageNet, which includes a plethora of animal images, particularly dogs. Consequently, the DNN excels at recognizing dogs and tends to perceive them in various contexts. To a certain extent, when all you have is a hammer, everything looks like a nail.
If you're interested in creating Deep Dream images, Google has made their code open-source. Several Dreaming-as-a-Service (DAAS) platforms allow users to upload their photos and receive the dream-like results. These include dreamscopeapp.com, deepdream.in, and deepdreamit.com.
For those with some Python knowledge, Google provides an IPython notebook to run Deep Dream on a local computer. This requires certain Python packages and the Caffe deep learning framework. However, for a hassle-free experience, a pre-configured virtual environment is available, courtesy of the community.
Exploring Deep Dream with the notebook provides the opportunity to experiment with other DNNs trained on different datasets, which can be found in the Model Zoo. This is particularly valuable if you've had your fill of dogs. For instance, the image of Budapest, along with others, was generated using a model trained on the MIT Places image set, designed for recognizing scenes and buildings.
In summary, Deep Dream offers a fascinating glimpse into the potential of robust image-generation tools for artists. It has even found applications in anomaly detection, as demonstrated in a humorous use-case involving androids from the future infiltrating a company's team. This lighthearted example underscores the adaptability of Deep Dream for unconventional purposes.