Google Gemini launches Photo-to-Video: enthusiastic reviews and early reflections on creativity and deepfake risks

Graziano Stefanelli
Jul 10

Google’s official announcement: animated photo-to-video arrives for all Pro and Ultra users

On July 10, 2025, Google announced on its official blog a new Photo-to-Video feature in Gemini that turns a static image into a short, eight-second animated video with AI-generated audio. The feature, one of the more ambitious steps so far in consumer AI video generation, is available immediately on the web (gemini.google.com) for Pro and Ultra subscribers (Gemini Advanced), and will roll out “within the week” to the Android and iOS apps in the more than 150 countries where the service is already active.

How the new feature works: upload, text prompt, and 720p video in seconds

Using the feature is straightforward: upload a photo, write a brief description of the desired animation or movement, and within seconds Gemini generates an 8-second 720p MP4 video with AI-generated ambient audio, background music, or even dialogue. Each video carries a visible watermark and an invisible SynthID tag so that it can be identified automatically as AI-generated. The underlying engine is the Veo 3 model, already known for its controllable video generation. The feature comes at no extra cost for current Pro or Ultra subscribers.
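
For readers who think in code, the workflow above can be summarized as a small Python sketch. It is purely illustrative: it calls no Google service, and the names PhotoToVideoRequest, GeneratedClip, and describe_expected_clip are hypothetical, invented here only to restate the inputs (a photo and a prompt) and the output properties reported in the announcement (8 seconds, 720p, MP4, audio, visible watermark, SynthID tag).

```python
from dataclasses import dataclass

# Output properties as reported in the announcement described above.
CLIP_SECONDS = 8
CLIP_HEIGHT = 720
CLIP_CONTAINER = "mp4"


@dataclass(frozen=True)
class PhotoToVideoRequest:
    """Hypothetical request: one photo plus a short text prompt."""
    photo_path: str
    prompt: str

    def validate(self) -> None:
        # Both inputs are required by the workflow: a photo and a description.
        if not self.photo_path.strip():
            raise ValueError("a photo is required")
        if not self.prompt.strip():
            raise ValueError("the prompt must describe the desired motion")


@dataclass(frozen=True)
class GeneratedClip:
    """Hypothetical summary of the clip the feature is said to produce."""
    duration_seconds: int = CLIP_SECONDS
    height: int = CLIP_HEIGHT
    container: str = CLIP_CONTAINER
    has_audio: bool = True
    visible_watermark: bool = True
    synthid_tagged: bool = True


def describe_expected_clip(request: PhotoToVideoRequest) -> GeneratedClip:
    """Check the inputs and return the expected output properties.

    No video is generated here; this only restates the article's description.
    """
    request.validate()
    return GeneratedClip()


if __name__ == "__main__":
    req = PhotoToVideoRequest("grandparents.jpg", "They turn and smile at the camera")
    print(describe_expected_clip(req))
```

The sketch makes one point explicit: the user controls only the photo and the prompt, while duration, resolution, format, and provenance marking are fixed by the service.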

The first reviews: limitless creativity but a need for deepfake safeguards

Within hours of the launch, reviews poured in from international tech outlets and digital influencers. Axios, Lifewire, and The Verge praised the speed and quality of the generated clips, emphasizing how the feature opens new possibilities for animating historical photos, archival material, and even fantasy scenes. The Washington Post and TechRadar focused on ethics and safety: because generation is fast and the results are highly realistic, concerns arise over copyright, misuse, and deepfakes, especially given the ability to place realistic faces or voices in entirely invented contexts.

Historical precedents: how photo-to-video was tackled before Gemini and what makes Google’s approach unique

Before Google's Gemini announcement, the idea of turning static photographs into short video animations had already inspired several AI projects, although none offered the same immediacy, quality, and large-scale integration. Among the first tools was Deep Nostalgia by MyHeritage, which animated faces in old family photos, generating short sequences of eye movement, smiles, and subtle head turns; it went viral in 2021, but was limited to faces and to short, predefined animations. Other platforms such as D-ID and Avatarify focused on animated avatars and facial synthesis, mostly for social uses or meme creation, without offering deep customization or dynamic audio. Some Stable Diffusion-based tools and smaller startups experimented with generating short motion loops from images, but they came with technical barriers, no watermarks, and limited text control. Google's approach with Gemini stands out for its immediacy, video and audio quality, cloud integration, and use of visible watermarks and SynthID. It overcomes many limitations of these earlier tools and paves the way for mainstream adoption of the technology.

Social impact and creative potential: the AI frontier between innovation, risk, and new visual languages

The availability of a feature like Photo-to-Video marks a significant shift in how the public perceives the creative capabilities of artificial intelligence. On one hand, it opens up possibilities that were previously out of reach for people working in digital communication, archiving, personal storytelling, or viral content creation. Millions of users can now animate historical or family photos, recreate imagined scenes, illustrate complex concepts for teaching, or produce short promotional videos in seconds. On the other hand, putting such a powerful technology in the hands of non-experts increases the responsibility of everyone involved, from Google to end users, for managing the risks of fake news, visual disinformation, and privacy violations. The challenge in the coming months will therefore be to balance creativity and control, so that Gemini's Photo-to-Video becomes a tool for innovation rather than another risk factor in the global digital sphere.

______

DATA STUDIOS

datastudios.org