SberAI has presented ruDALL-E, a Russian-language version of the DALL-E text-to-image generator. Trends looked into what could come of it
In early 2021, OpenAI, a company co-founded by Elon Musk, introduced DALL-E, a neural network that its developers trained to create images from short text captions. The program, whose name plainly references the surrealist artist Salvador Dalí and the robot WALL-E from the animated film of the same name, is based on the GPT-3 (Generative Pre-trained Transformer) text generator, which the company introduced in 2020.
Since the release of DALL-E, developers in other countries have taken up the idea: in China, for example, a similar generator appeared under the name CogView. Generating images from Russian-language text has now become possible as well: in November 2021, teams from SberAI, SberDevices, Samara University, AIRI and SberCloud presented the ruDALL-E project.
Training the ruDALL-E neural network is reported to have been the largest computational task carried out in Russia. Several of the models are already available as open source: ruDALL-E Malevich (XL), Sber VQ-GAN, ruCLIP Small and Super Resolution (Real ESRGAN).
According to the developers, image generation solves two important problems that a search engine cannot: first, it takes into account the exact description of what you want, and second, it creates unique images that did not exist before. Such images can be used to illustrate articles, in copywriting and in advertising. The Trends team tested the neural network by feeding it several descriptions: