A recent article on the Hollywood trade site the Wrap posed a provocative question in its headline: “AI and the Rise of the Machines: Is Hollywood about to be Overrun by Robots?” A Washington Post article a couple of weeks later carried a headline with a similar theme: “AI can make movies, edit actors, fake voices. Hollywood isn’t ready.”
Indeed, concern, confusion, speculation, and fear over what recent advances in generative AI technology might do to the Hollywood workforce, established workflows, and the content creation sphere’s typical way of doing things have been common refrains lately in media coverage and industry conversations. Even the teaser headline for SMPTE’s webcast on generative AI in media, scheduled for late April (“Who will win, who will lose?”), reflected the uncertainty permeating the industry’s interest in this technology. And understandably so: things are changing, at a faster pace and in ways unanticipated when the technology was first developed. The frenzy has been accelerated by the recent success, headlines, and interest generated by ChatGPT, the generative AI model created by the OpenAI research and development company and now available for use in many industries, including the media world.
Yves Bergquist, however, prefers to take a longer, more sober view of the technology and its potential to help, hurt, influence, or automate portions of the creative process. Bergquist is the Program Director for the AI & Blockchain in Media project at the USC Entertainment Technology Center (ETC), and also co-chair of SMPTE’s and the ETC’s Joint Task Force on AI Standards. He was quoted in the Wrap article and was the speaker for the SMPTE AI webcast. He says the ETC has been examining the AI paradigm for about seven years and agrees that the technology’s sudden leaps forward “will be disruptive” to the media industry, but suggests that this is not remarkably different from the rise of other new technologies, such as the internet in its early days, the cloud, and so forth.
“There is a need for education and awareness, which is something the ETC and SMPTE and others are working on, especially since the technology is moving so fast right now,” Bergquist says. “There is an exciting amount of innovation going on in that space at the moment. But, until recently, it was more in non-creative areas like sorting and tagging metadata, content recommendation, and so forth. People have to understand what is happening now, how it is going further, and how it can help them develop content. But even with legitimate concerns, the idea that ‘the machines’ will be running everything is a kind of unfortunate science fiction, and we have to be really careful with taking that approach when we do commentary on AI in media. Commentators generally tend to buy into the hype because their business model is growing their audience. But then you have the builders and users of the technology. They want to use the technology to build stuff. And those people tend to be a lot more conservative, because they know how immensely difficult, expensive, and complex it will be to build the technology into products that will actually be used by a lot of people. And right now, despite the hype, we are nowhere near what you might call a ‘general intelligence capability’ of an AI application being able to do a wide variety of things. There isn’t even a consensus yet on what that could look like or how to build it.”
Also important is the need to eventually build “a regulatory framework” for new AI models. He says this is important to help mitigate the possibility of human biases being woven into such algorithms, and to ensure that AI tools will be properly developed and utilized to improve efficiency and creativity in the media space, rather than for misleading or nefarious purposes by those Bergquist calls “bad actors,” particularly in the social media realm.
“There is legitimate concern around encoding biases into artificial intelligence models, where inherent biases of our society might become part of [machine learning] training data and encoded and perpetuated through these models,” he relates. “There has been a lot of awareness and concern and debate around this issue. I expect to see a strict regulatory framework, probably coming out of the EU [European Union] first and then possibly here in the United States. So, to my mind, the next stage in this whole revolution is not necessarily a huge jump in the technology, but rather, a huge jump in the regulatory framework around the technology.”
Bergquist emphasizes that this framework will need to emphasize how to use AI tools ethically in media production. He points out that the ETC, in partnership with SMPTE, published an AI ethics in media paper last year. Now, he fully expects “to publish [an updated ethics in media paper] this year, and probably we will have to review our ethics paper every six months or so, because all of this is going so fast right now.”
Indeed, in the long run, he believes the industry’s need to coalesce around a set of best practices and/or standards regarding ethical design, development, and deployment of AI tools in media will be just as important as more typical areas of standardization—interoperability, and so forth. These areas are among the reasons that the Joint Task Force on AI Standards was created in the first place. Bergquist stresses that “at a minimum, our industry needs to develop a joint mindset about rules for this new technology and having a central location or entity like our Task Force is very useful.”
“We need ethical standards or standards of disclosure and transparency around the use of artificial intelligence in media content,” he elaborates. “That is why we need to develop thought leadership in the industry and the ability to not only understand the technology, but also to spread education about it through the media community. Right now, for example, we are preparing two classes with SMPTE on artificial intelligence—one for media executives and one for creatives.”
Bergquist emphasizes that a lot of the ethical and legal questions surrounding the use of generative AI in content creation revolve around issues of copyright, contracts, consent to re-create someone’s likeness or voice, and so forth. Limited examples of bringing actors back to life on screen, making them younger, or giving them back their original voice or characteristics already populate the entertainment landscape. Some of these things might seem “creepy” to some people, he adds, but nevertheless, the train has left the station. So, he suggests, what matters now is that such capabilities are developed under an ethical lens.
“From a technical standpoint, a lot of those things are possible today with some degree of accuracy, and that capability will only get better, and quickly,” he says. “So, there are legal issues and ethical issues associated with that.
“There are people who will think it is creepy and we shouldn’t bring actors back to life or change their age, and so on, using AI tools, and then, there are those who think it is cool and that it opens up a lot of new creative avenues. Where we land is we want to raise awareness in the industry of the fact that, if something like that is done, it should be done ethically and with transparency and fairness to those actors, their heirs, or estates. We should make sure all stakeholders are in alignment, in other words, in terms of what is being done, how to do it, and how to compensate the stakeholders fairly.”
Bergquist adds that, for now, ChatGPT is the best example of generative AI being widely used in a creative fashion, albeit in the text realm.
“ChatGPT does with text what other models do with visual data,” he explains. “Essentially, it is a large language model, which means it is a statistical machine learning model that was trained on an enormous corpus of English language text. What it is doing really well that is new is it is predicting which word is coming after a specific word. It keeps a lot of text in memory in order to generate future text. In other words, it is a lot more contextual than previous models. We’ve been able to generate text for years, but what ChatGPT does much better than other models is to be textually relevant and grammatically coherent—I wouldn’t say in the narrative sense yet, but certainly coherent across a fairly long string of text.”
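The next-word prediction Bergquist describes can be illustrated with a deliberately simplified sketch. The toy model below just counts which word follows which in a tiny sample corpus and predicts the most frequent successor; a real large language model like ChatGPT instead uses a neural network trained on an enormous corpus and conditions on long stretches of prior text, which is what gives it the contextual coherence he describes. The corpus and function names here are illustrative inventions, not anything from an actual system.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows each word in a tiny
# corpus, then predict the most frequent successor. Real LLMs condition
# on long contexts with neural networks; this only looks one word back.
corpus = "the cat sat on the mat and the cat slept".split()

successors = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    successors[current][nxt] += 1

def predict_next(word):
    """Return the most frequently observed word after `word`, or None."""
    counts = successors[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" (follows "the" twice, "mat" once)
```

The gap between this sketch and a model like ChatGPT is essentially the gap Bergquist points to: keeping far more prior text “in memory” so that each prediction is contextually, not just locally, plausible.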
He adds that, in the entertainment space, many creatives are already using ChatGPT to do things like generate dialogue for particular scenes, and to explore and experiment with dialogue and on-screen text. But, he adds, the industry is “very bullish” about looming visual models that would be somewhat similar conceptually in terms of architecture and be able to be generative in terms of outputting sophisticated visuals and actual video clips.
As many fear, Bergquist has little doubt that, eventually, some assistant or prep jobs will be impacted, as a lot of that work will probably become automated sooner or later with AI taking over many “micro decisions that it can accelerate and automate.”
“That will take some of the manual labor out of the process, but I don’t think most creatives would make the case that this type of manual labor [where assisting or prepping is concerned] is particularly crucial to the overall creative part of the process, so automating some of it might make sense,” he adds. “Will that impact jobs in the assistant world? Probably, and that is definitely something that we should think about as we make sure there is still a track for people coming up within the ranks to gain hands-on experience. But the point is, the craft element is sort of getting eaten by software in some of these cases, while the creative element will get more focus and need more involvement.”
Also, he says the technology is already infiltrating things like casting, “where it can help you understand who the best cast member would be for a specific project, depending on the audience segment you are targeting,” and for marketing and distributing content.
In particular, he points to companies like OpenAI, Google, Microsoft, and others that are currently “pouring hundreds of millions of dollars into new AI models that can help automate certain aspects of a media workflow. So, what we are seeing now is really the output of years of billions of dollars in investment from those kinds of companies. This has already had a substantial impact on the media industry, and not only because of the tools they are creating for the industry. It’s also impactful for the potential diffusion of all these models into the general public, and the ability for the creator economy as a whole to upscale itself into high production value content that can compete with studio content.”
Other major beneficiaries of this industry change will likely be software companies and production or post-production facilities of all stripes, he says, which are a significant part of the aforementioned “creator economy.”
“Having the technology is one thing but being able to create products that are meaningful around it is another,” he says. “We are already seeing a lot of software providers like Adobe being really aggressive, and a lot of post houses also. We’ll see a lot of activity in and around the overall media and technical ecosystems, even if we do not yet know exactly how that activity will play out. This technology can lead to a tremendous amount of product innovation in time. Also, the visual effects community will likely be able to accelerate and optimize a lot of its workflows to bring down costs—being able to generate 3D models from 2D images more efficiently, for instance. So, it’s almost certain that the media industry will have to develop more software capabilities. Whether it is virtual production or artificial intelligence, these technologies require a lot more software knowledge than is traditionally available in the media industry. I expect we will need our own machine-learning experts [at media companies].
“However, it’s not entirely clear to me how far down the machine-learning rabbit hole Hollywood needs to go in the sense that anything meaningful they develop would require hundreds of millions of dollars and a specific set of experts and a specific culture. That isn’t really the media culture right now. So, the media industry needs to laser focus on adding this technology to its core competencies. And that is one part of what we are trying to do at the ETC, whether our members are service providers or software providers or from studios trying to understand what their role will be in this revolution. They need to figure out where to direct their focus with this technology.”
At least for the time being, he elaborates, that focus will likely be primarily on “using data and artificial intelligence to augment certain decisions or workflows. One of the things that AI will start allowing in the development process is the ability to take more data-driven risks. At ETC, we’ve been working on models showing where innovation in media content generates outsized returns. So, we hope to see the building of a set of technologies that are going to enhance creative intuitions of development executives to allow them to back new and different things.”
For additional insight: earlier this year, Bergquist authored a comprehensive report on behalf of the ETC on the state of generative AI in media, emphasizing that the industry has now entered a clear inflection point with this technology.