Rising Compression Options
Hot Button Discussion
by Michael Goldman
While compression might not be the sexiest topic, the entire world is growing increasingly dependent on the process daily now that video over IP concepts are surging to the front of the broadcast mindset. And that’s why, for those who pay attention to such things, it’s great news that “we are currently seeing more movement out of MPEG [the Moving Picture Experts Group] than we have in a very long time,” according to Russell Trafford-Jones, manager of support and services for UK-based video-over-IP specialist Techex and editor of the educational industry website called The Broadcast Knowledge. “In fact, we have many new compression standards just about ready to come forward. It’s a vibrant area. I’m seeing from my writing at The Broadcast Knowledge that posts about compression are among the most popular. That’s because, the truth is, everybody needs to know about compression.”
To that end, Trafford-Jones highlights a major recent step forward: the ongoing work to standardize the new codec JPEG-XS, originally developed as TICO-XS. JPEG-XS is a lightweight, low-latency video compression algorithm that offers visually lossless quality to users. Its arrival dangles significant latency improvements in front of those engaging in remote production, e-sports, and live events.
Trafford-Jones says JPEG-XS essentially builds on the advantages of the JPEG-2000 standard, which he describes as having “fit into a really nice niche where high quality was needed. It takes up quite a lot of bandwidth, but is quite manageable for medium to big broadcasters.”
JPEG-XS, however, “ups the game,” he suggests, particularly because of its low-latency potential.
“Compression is like one of those triangles where if you increase one thing, another thing decreases,” he says. “So typically, quality has to be played off against bandwidth, which also affects latency. If you want a great two-way, you need low latency—otherwise, the conversation is stilted. If you try pushing video down a small internet pipe, the quality or the latency is going to be affected. Now, a new codec still has to balance all these issues, but by using different technology, you have a one-off opportunity to improve one or more aspects without impacting the others. This is exactly what JPEG-XS does.
“That means it appears to be visually lossless, with a manageable bandwidth, like 150 Mb/s. It’s really lightweight, and we are talking in the order of microseconds of latency, instead of milliseconds.
“That’s a big deal because, while JPEG-2000 was great for two-ways, it wasn’t ideal for remote production. If you are making switching decisions, vision mixing a show, JPEG-XS has brought latency down to something a production can easily manage.”
Trafford-Jones expects this development, in turn, “to change the meaning of remote production,” moving beyond its usual role where it reduces the need for mobile production trucks and large on-site crews for live sporting events and concerts.
“Times are starting to change commercially in terms of how companies work,” he relates. “If you look at radio, the faders were separated long ago from the kit, which could be thousands of miles away. Some of the bigger players in television are realizing it’s now practical to push all their infrastructure into a data center.”
“Since [the industry] is embracing [ST 2110, SMPTE’s professional media over IP networks protocol] and massive workflow-changing projects are being planned to take advantage of these new technologies, SMPTE is already working on an extension to ST 2110—ST 2110-22—which will, for the first time, allow compressed video essences into the so-far uncompressed ST 2110 suite as an alternative. This creates the opportunity needed to make your whole broadcast plant operate using remote production—not just your mobile trucks. With low-latency compression, even if the video has to be encoded and re-encoded many times, it will still be almost immediate. Because of all this, I think JPEG-XS will be a great enabler for companies to reduce costs and improve infrastructure management.”
Of course, JPEG-XS isn’t the only new development on the compression landscape. As Trafford-Jones referenced, MPEG is quite busy these days. The 16-year-old AVC (Advanced Video Coding or H.264) codec remains important for compressing and distributing video content, but more menu options for various applications will be popping up within the next year, he emphasizes.
These include the upcoming publication of codecs VVC/H.266 (Versatile Video Coding), the successor to the High-Efficiency Video Coding (HEVC)/H.265 standard; EVC (MPEG-5 Essential Video Coding), which has been designed with a more basic, royalty-free mode; and LC-EVC (MPEG-5 Low Complexity Enhancement Video Coding), which can be implemented on lower-end devices and uses so-called “enhancement layers” on top of existing bitstreams, meaning a single stream could be decoded as both UHD and HD in some cases. Trafford-Jones expects all three of them to be published before the end of 2020.
And then, where online streaming is concerned, he emphasizes that in addition to the existing MPEG-DASH protocol for streaming, the industry now has Apple’s Low-Latency HLS (HTTP Live Streaming) methodology and, soon, will add the royalty-free AV1 codec, being developed by the Alliance for Open Media. “We will slowly see growing adoption of AV1 as encoders and decoders are optimized, and from late 2020, hardware implementations start to become available,” Trafford-Jones adds.
“[A wide variety of] new codecs are needed for a number of reasons,” he elaborates regarding this trend. “Fundamentally, the use of video for all kinds of applications is growing and growing in our society. Most of us are captured on video almost every day—think about security cameras and body cameras on people like police officers. So compression is not just about broadcasting. Codecs like AV1 were born to service media specifically, while others, like LC-EVC, were born to bring improved compression to simpler hardware. From the industry’s perspective, though, if you look at operating expenses for sending video at a Netflix, Twitch, or Facebook scale—Netflix spends over $35 million a month on streaming costs. So deploying a codec that could save them just 10 percent on bandwidth would be worth more than $40 million every year.
“The point is when you stream lots of videos generally, or a number of videos over and over again, it really pays to reduce file size. So big video companies are quite happy to throw massive amounts of Cloud computing power at the problem and develop [new codecs] to make files even a few percent smaller if they can.”
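The arithmetic behind the Netflix example can be checked with a short sketch. Only the $35 million monthly figure and the 10 percent saving come from the quote above; the function name and the assumption that cost scales linearly with bandwidth are illustrative.

```python
# Back-of-the-envelope check of the streaming-cost figures quoted above.
# Only the $35 million monthly cost and the 10 percent saving come from the
# article; the function name and the linear-cost model are assumptions.

def annual_savings(monthly_cost_usd: float, bandwidth_saving: float) -> float:
    """Yearly saving, assuming cost scales linearly with bandwidth."""
    return monthly_cost_usd * 12 * bandwidth_saving


if __name__ == "__main__":
    saving = annual_savings(35_000_000, 0.10)
    print(f"Estimated annual saving: ${saving:,.0f}")
```

Under these assumptions the result is roughly $42 million a year, consistent with the "more than $40 million" figure in the quote.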
The biggest change to streaming may well be the loss of ubiquitous interoperability, he adds.
“We are moving into a world that is going to be more of a 20-20-20-20 split in terms of what the [major compression codecs] are, and the rest made up by a few others. We live in a world now where nearly everything supports AVC/MPEG-4. This level of interoperability won’t be seen again. At a certain level, AVC will certainly stick around; HEVC will still be around, as well, but then, AV1 will come in and take up room, and VVC will make an impact also. So we won’t be relying on one or two codecs anymore, particularly in the broadcast space.”
Still, all these rapid changes and the need for multiple options have the potential to create chaos in the landscape. Chief among potential complications is whether streaming services and broadcasters can afford to use the various options available to them, and whether they can find out what royalties they need to pay, how much, when, where, and to whom. Patent pools were created to address such matters, of course. When successful, a group of companies cross-licensing their patents for their portions of the same technology can provide users with a single place to license technology and provide predictable costs.
After all, as Trafford-Jones explains, “every codec is a toolkit. There might be 40, 50, or 60 tools inside it, bits of code you can turn on or off depending on how much complexity your hardware can deal with. But broadcasters need the ability to maximize complexity so that they can get every single benefit out of every byte that they send. What happens is that different companies come up with different tools based around different ideas; they offer these to be included, and the standard then becomes a document that says ‘these are all the different things we can do.’ The problem is, the broadcaster won’t necessarily know which bits of the standard are licensable techniques, and which aren’t. Companies are averse to risk, meaning this isn’t a comfortable place to be. So when a patent pool works well, it effectively says, ‘if you license with us, then you won’t have any legal or financial surprises.’ ”
However, the concept of the patent pool doesn’t always succeed efficiently on this new terrain, Trafford-Jones says. Indeed, complexities regarding licensing issues were a big reason why alternatives to HEVC, for example, were initially pursued in the first place.
“Only about one-third of HEVC-related patents are in a pool, and there are also three pools just for HEVC right now,” he explains. “So even if you have licensed your use of HEVC with the three pools, you have to worry that there are plenty of opportunities for other companies to approach you and stake their claim on your profiting from their great ideas. For HEVC, there are 993 relevant patent families. It’s an almost intractable problem to expect each company that wants to use HEVC at scale to deal with. Apple does use HEVC extensively, but companies like ESPN, for example, have said no to the codec.”
Thus, he adds, EVC was developed to include a simplified toolkit that is royalty-free. The baseline profile contains technologies over 20 years old, Trafford-Jones says, submitted with a royalty-free declaration.
“And then, there is the main profile, which has better compression and licensable patents,” he elaborates. “This is MPEG directly and purposefully responding to these problems. The separate codec—LC-EVC—is understood to have only one patent holder, the company V-Nova.”
Even as these issues are being sorted out, however, the technological development of new and better approaches to compressing, encoding, and decoding video data marches forward, Trafford-Jones says. In particular, he points to industry work to bring machine learning and artificial intelligence technologies into the fray.
“There is a start-up company in the UK called Deep Render, whose codec is completely based on neural networks,” he says. “They are using [artificial intelligence] to try and emulate the way that the eye sees things. That is potentially important because typically, what codecs don’t do is look at the entire image. They split it up and make their best guess at how to encode a particular part of an image, and then they move on, rather than analyzing the image as a whole. You can derive a much better image if you look at it as a whole to understand where you do, or do not, need as much detail. People have been working on this with other codecs with so-called ‘region of interest’ features, but this company’s approach makes this a central aspect of the technology. So that is a pretty new thing, bringing AI to the forefront.
“Eventually, I expect we will see AI and machine learning seeping into all codecs. But if this start-up can succeed in creating a new, full codec using this approach, it would make data rates drastically lower. And really, what you need to get people interested in adopting a new codec is jaw-dropping savings in terms of bandwidth or blasting away at latency, as JPEG-XS is doing. That’s when the magic happens.”
A recent report in the New York Times details efforts currently under way by major internet companies to create technologies to help them identify so-called “deepfake” videos. Such videos are created using artificial intelligence (AI) tools to attempt to digitally alter real people and doctor images for nefarious purposes, and industry experts are increasingly worried they will be used to spread disinformation in advance of the 2020 election. Google, Facebook, and others are researching methods for finding and identifying such videos on their platforms. The article says AI tools are “streamlining” the ability, time, and cost necessary to doctor videos, and so tech companies are responding by using AI systems of their own to track down such fakes. A key to their approach involves using AI technology to build video libraries of deepfake videos, and then index the methods used to alter the images, allowing systems to examine videos on their platforms for such methods and alterations.
5G Plan vs. Weather Forecasting
A recent Washington Post report raises concerns that an international deal on how to roll out 5G technology globally could pose risks to weather forecast accuracy. The report says the deal, announced in Egypt in late November and covering 5G operation on specific radio frequency bands, was of concern to multiple federal agencies and the World Meteorological Organization. Their concern is that 5G equipment operating in the 24-gigahertz frequency band could interfere with signals from polar-orbiting satellites that gather weather data, making the data less reliable, because 5G signals could “bleed into” frequencies used by the National Oceanic and Atmospheric Administration (NOAA) and NASA. The debate centers on so-called “out-of-band emission limits,” which cap how much signal may leak into adjacent frequencies. The article adds that various agencies are looking into potential solutions, such as the use of artificial intelligence to recover lost or corrupted data, and points out the FCC has not yet decided how it will incorporate the Egypt agreement into its new requirements for recent spectrum buyers.