Brad Pitt versus Tom Cruise: the duel of the century never happened. But it fooled us all
The (fake) video that has Hollywood trembling was created by a Chinese artificial intelligence from two lines of text
Fifteen seconds. That's enough to rewrite film history. Brad Pitt and Tom Cruise, for the first time, in hand-to-hand combat on a rooftop in a ruined city. It is a photorealistic action scene, complete with background noise and a provocative, surreal exchange about the Jeffrey Epstein case. "You killed Jeffrey Epstein, you animal! He was a good man!" says Pitt. "He knew too much about our Russia operations. He had to die, and now you die too," Cruise replies.
The video was posted by Irish director Ruairi Robinson, who was nominated for an Oscar in 2002 for a short film. It is, of course, blatantly fake: it was generated from a two-line prompt entered into Seedance 2.0, a new generative AI model. "This was a 2 line prompt in Seedance 2," Robinson clarified on X.
The video is strikingly photorealistic. The scene is shot with the stylistic mastery of a seasoned director: dynamic angles, cinematic lighting, fluid camera movements. Rhett Reese, screenwriter of Deadpool & Wolverine, admitted he was "shocked" by the quality of the Pitt-Cruise video. And the point is not only technical or stylistic: it is the sense that the line between professional production and algorithmic simulation is rapidly blurring, with obvious implications for the entire film industry.
What is Seedance 2.0 and why is it scary
Seedance 2.0 is the new video generation model from ByteDance, the Chinese company that owns TikTok. The company described it as a 'substantial leap in generative quality' over the previous version. The model is currently available to Chinese users through the Jimeng AI app, but will soon be integrated into CapCut, the popular video editor used by TikTok users worldwide.
Seedance 2.0 is built on a dual-branch diffusion transformer architecture. Earlier models often showed "drift" between shots, with perceptible variations in face geometry or lighting. Seedance 2.0 is more temporally consistent, which also makes frame-by-frame comparison, one of the traditional techniques for detecting deepfakes, less effective. The model also generates contextually appropriate sound effects and ambient audio, not just dialogue.

