Here comes Dream Machine, which promises realistic AI videos from text. Our test

The release of the new text-to-video artificial intelligence is part of a burgeoning landscape in the field of generative artificial intelligence

14 June 2024

4 min read

Californian start-up Luma AI has unveiled Dream Machine, a visual storytelling tool that uses artificial intelligence to create videos from text prompts, which define the content and style of the video. In short, it produces realistic videos from simple text descriptions (text-to-video), as more renowned competitors such as OpenAI's Sora, Stable Diffusion and Sintesi already do.

As with its competitors, generating content is straightforward and can start from any description, for example 'show me a young girl walking in the rain in Times Square'. Working from the instructions entered in the prompt, the system takes a few minutes to generate a five-second video that faithfully reproduces the described scene. Dream Machine also lets you upload images to be animated, always accompanied by a prompt, which allows for extensive customisation.

The big news about Luma AI's tool, and what sets it apart from its competitors, is that it can be used for free, simply by registering on the Luma AI website. Of course, without a subscription there are limits: currently a cap of 30 videos per month and a watermark on the output. Those who want more freedom can sign up for one of the paid plans, starting with the $29.99 Standard plan, which allows up to 120 videos per month. Videos can be downloaded to a PC or to a smartphone.

Our test

It was easy to put Dream Machine to the test. We made the first attempt starting with a photo, specifically one of Silvia Gottardi of Cicliste Per Caso cycling along a road in Cilento.

COURTESY Oscar del Cicloturismo

The result generated by Dream Machine is smooth: the shot has good camera movement, and the countryside in the background looks realistic. Unfortunately, the cyclist's movements are rather unnatural, and the bike is missing some frame details. The result is not the best, and the video can certainly be improved.

Silvia Gottardi of Cicliste Per Caso cycling along a road in Cilento.

Not content with a first attempt, we wanted to give Dream Machine a second chance, starting with this black and white photo of two crows on a branch:

Specifically, we asked it: 'animate the image by making the crow below fly onto the branch of the crow above, which flies away in the meantime'. Here is the result:

'Animate the image by making the crow below fly onto the branch of the crow above, which flies away in the meantime'

At first glance, the result looks more convincing than the first video, but the scene turns out to differ from the one described in the prompt (the crow below should fly onto the branch of the crow above, which meanwhile flies away). Moreover, a third crow that bursts onto the scene has a decidedly disproportionate wingspan.

Finally, for the sake of completeness, we asked Dream Machine to generate a video from scratch from a text description. Specifically, we asked for a video 'of an Asian girl with short hair, walking in the rain in Times Square New York'. We added that she should be wearing a blue dress and headphones. 'The scene is nocturnal and the girl is NOT carrying an umbrella'. Processing took less time than for the videos above, though still on the order of a couple of minutes.

Asian girl with short hair, walking in the rain in Times Square New York

The camera work in the scene is tasteful. Starting from a generic description, the AI chose to film the girl in three-quarter view from behind, turning her head to show us her profile. Unfortunately, the result is decidedly inconsistent with the prompt and rather contrived. She is carrying an umbrella (even though we specified that she should not), her hand appears abnormally twisted, and the scene contains some conspicuous hallucinations, such as the car passing by with an umbrella on it. As with photos, hands remain the biggest challenge for video-generating artificial intelligence, as their rendering is not always in line with expectations.

The end result appears decidedly disappointing. These are obviously first attempts. By refining the prompt with more detailed descriptions, Dream Machine will certainly be able to generate more realistic videos. It is a pity, however, that the system does not give the option of refining videos that have already been generated. If you do not like a video, you have to start from scratch with a new prompt.

Sector in turmoil

There is no doubt that the launch of Dream Machine adds another brick to a market in which start-ups and tech giants compete to develop increasingly sophisticated tools for synthesising realistic images, audio and video from simple text inputs. They are aware that this technology is destined to revolutionise more than just the audio-video entertainment sector. Also on the table are important ethical and legal issues related to the creation of AI-generated video content, as well as potential misuse and abuse, such as the creation of deepfakes.