The world of filmmaking is no stranger to technological innovation, which continues to deliver some of the most breathtaking, enchanting, exciting and (sometimes) not-so-thrilling experiences for movie lovers. Recently, however, generative AI's entry into filmmaking has been making the rounds in the news, both for its novelty and for the leaps and bounds by which its underlying technology keeps improving.
Much of the credit for this advance goes to OpenAI's new text-to-video model, Sora, which can generate highly detailed video scenes from simple text instructions. We have witnessed plenty of technological wonders over the years - from Elon Musk's Neuralink and drones to SpaceShipOne, WiFi and even GPS navigation. With continued progress, Sora could arguably soon join that list.
Its current capabilities are best illustrated by the multiple videos the team behind Sora has shared so far. In pure tribute to the Internet's long love of cute animal videos, one shows an adorable Dalmatian in the colorful streets of Burano, Italy. Another shows a stylish woman walking through a district in Tokyo. There are many more, hinting at the endless possibilities of generative AI in filmmaking.
Having said that, it is important to note that while the model is improving at a rapid pace, it is not yet safe for mass use worldwide. It can struggle to accurately simulate the physics of a complex scene and may not understand specific instances of cause and effect. For example, OpenAI notes that a person might take a bite out of a cookie, yet the cookie may show no bite mark afterward. The model may also confuse spatial details in a prompt, such as mixing up left and right, and may struggle with precise descriptions of events that unfold over time, like following a specific camera trajectory.
There are also concerns about how generative AI represents diverse characters, since such models can always produce something offensive. The team has promised to engage policymakers, activists and social experts around the world to understand their concerns about diversity, misinformation, hateful content, bias, the illegality of certain types of content and more.
It is also worth reflecting on the World Economic Forum's report from earlier this year, published ahead of its Annual Meeting in Davos, which listed disinformation as a top global risk. Given widespread distrust in governments and news media, manipulated content threatens to make existing tensions significantly worse. The report found that adverse outcomes of AI technologies rank among the most severe risks in the short term.
Looking ahead, AI in filmmaking has promising prospects. From streamlining operations and reducing production costs to creating imagery with stunning visual effects, it brings exciting opportunities that go beyond visuals and may open fresh avenues for employment. Guarding against the spread of misinformation, hateful content and unreliable model output remains essential to ensure a smooth transition for everyone interested in generative AI-powered filmmaking.