New Microsoft AI makes pictures speak

A new artificial intelligence tool from Microsoft is blurring the line between what’s real and what’s not. The technology, known as VASA-1, can take a still photo of someone’s face and turn it into a moving video of them talking or singing.

Microsoft says the lip movements are “precisely synchronized” with the audio, making it seem like the person is actually speaking or singing. For example, Leonardo da Vinci’s 16th-century painting the “Mona Lisa” can now recite lines in an American accent.

However, Microsoft is keeping this technology under wraps, acknowledging the risk of misuse, such as impersonating real people. VASA-1 takes a single image of a face, whether a photograph of a real person or a character from a painting or other artwork, and pairs it with speech from any individual, creating the illusion that the image is alive and talking.

The AI was trained on a database of facial expressions, allowing it to animate the face while it’s talking. In a blog post, Microsoft researchers describe VASA as a “framework for generating lifelike talking faces of virtual characters.” They claim it opens the door for real-time interactions with virtual avatars that mimic human conversation.

The researchers explain that their method not only synchronizes lip movements with audio but also captures a wide range of emotions, facial expressions, and natural head movements, adding to the sense of realism and liveliness.

VASA-1 could enable digital AI avatars to interact with people in a way that feels as natural as talking to a real human. However, there’s also a risk of fraud, where people could be fooled by fake messages that seem to come from someone they know or trust.

Security expert Jake Moore from ESET emphasizes the need for caution, saying, “Seeing is most definitely not believing anymore.” He suggests that as this technology advances, it’s critical to ensure that people understand its capabilities and think twice before trusting what they see or hear.

Microsoft maintains that VASA-1 is not intended to create deceptive or misleading content, but acknowledges that it could still be misused for impersonation. The company is exploring ways to apply the technology to forgery detection and opposes any use that produces misleading or harmful content involving real people.

While acknowledging that current AI-generated videos still have noticeable flaws, Microsoft believes AI is rapidly advancing and is committed to ensuring these technologies are used responsibly.
