Frame Interpolation Using Both Classical Tools and Deep Neural Networks

Frame interpolation, the synthesis of new frames in between existing frames in an image sequence, has been a key tool in motion picture effects since its large-scale use in the 1999 movie “The Matrix.”

The film helped popularize an entire class of frame interpolation effects, such as BulletTime, Timeslice and SloMotion, that collectively fall into the category of retiming. Retiming is also a key element in television standards conversion, both to upsample signal frame rates for display and to convert between worldwide TV standards. In their paper, “Moving Image Frame Interpolation: Neural Networks and Classical Toolsets Compared” (https://ieeexplore.ieee.org/document/9424053), authors Kokaram, Singh and Robinson provide a detailed discussion of both paradigms, concluding that a hybrid of the two methodologies is the future of moving image frame interpolation.

Classical Motion-Based retiming takes a two-step approach. First, bilateral motion information from the existing frames is used, along with weighted frame averaging, to estimate the motion of the in-between frame. Then, a mathematical calculation uses that motion to generate the picture by interpolating pixels from the existing frames. A forward prediction model is also required to account for objects that may be occluded or uncovered during the motion.
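As a rough illustration of the second step, the sketch below builds an in-between frame from two existing frames when the motion is already known. It is a simplified example under stated assumptions, not the method of any retimer in the paper: frames are one-dimensional lists of pixel values, `motion` is a hypothetical per-pixel displacement (in pixels per frame interval) defined on the in-between frame, and occlusion handling is left out entirely.

```python
def sample(frame, x):
    """Linearly interpolate a 1-D frame at fractional position x, clamping at the edges."""
    x = min(max(x, 0.0), len(frame) - 1.0)
    i = int(x)
    j = min(i + 1, len(frame) - 1)
    frac = x - i
    return (1.0 - frac) * frame[i] + frac * frame[j]


def interpolate_frame(prev, nxt, motion, t=0.5):
    """Synthesize the frame at time t (0 < t < 1) between prev and nxt.

    motion[x] is the assumed displacement, in pixels per frame interval,
    of whatever is visible at position x in the in-between frame.
    """
    out = []
    for x in range(len(prev)):
        v = motion[x]
        # Follow the motion backwards into the previous frame...
        from_prev = sample(prev, x - t * v)
        # ...and forwards into the next frame.
        from_next = sample(nxt, x + (1.0 - t) * v)
        # Weight each source by its temporal closeness to the new frame.
        out.append((1.0 - t) * from_prev + t * from_next)
    return out


# A bright pixel moves two positions to the right between the two frames.
prev_frame = [0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0]
next_frame = [0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0]

with_motion = interpolate_frame(prev_frame, next_frame, [2.0] * 7)
plain_average = interpolate_frame(prev_frame, next_frame, [0.0] * 7)
```

With the correct motion, the bright pixel lands halfway along its path at full brightness. With zero motion the function reduces to plain frame averaging, which leaves two half-brightness ghosts at the start and end positions instead.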

Deep Neural Networks (DNNs) use a two-step, motion-then-picture approach as well. However, this newer class of algorithms is based on the concept that optimal motion and interpolation can be learned from sufficiently large databases of video sequences. Then, depending on the algorithm used, the DNN may act as a post-processor to clean up artifacts, generate auxiliary motion or compensate for any loss of spatial or temporal smoothness.

The authors devised an experiment to compare eight state-of-the-art retimers, including toolsets using classical Motion-Based algorithms, various Neural Network algorithms, and hybrid models combining Motion-Based interpolation with Neural Network post-processing. A dataset of 140,000 frames of material served as their baseline, incorporating more than 100 separate clips of varying duration, all at 720p resolution and 240 frames per second. After downsampling these high frame rate signals to sequences of 30, 60 and 120 fps, they tested each retimer by upsampling the sequences back to the original frame rate. The peak signal-to-noise ratio (PSNR) between the retimed and the original frames served as a measure of each retimer's ability to generate good picture quality.
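The test protocol described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: frames are assumed to be flat lists of pixel values with a peak of 255, `evaluate` and `blend_retimer` are hypothetical names, and the stand-in retimer simply cross-fades between the two kept frames.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def blend_retimer(prev, nxt, t):
    """Stand-in retimer: a weighted average of the two frames, no motion used."""
    return [(1.0 - t) * p + t * n for p, n in zip(prev, nxt)]


def evaluate(frames, factor, retimer):
    """Keep every `factor`-th frame, re-synthesize the dropped ones,
    and return the mean PSNR of the synthesized frames against the originals."""
    kept = frames[::factor]
    scores = []
    for k in range(len(kept) - 1):
        for j in range(1, factor):
            estimate = retimer(kept[k], kept[k + 1], j / factor)
            truth = frames[k * factor + j]
            scores.append(psnr(truth, estimate))
    return sum(scores) / len(scores)


# Three tiny 4-pixel frames; the middle one is dropped and then re-estimated.
clip = [[0.0] * 4, [30.0] * 4, [20.0] * 4]
score = evaluate(clip, factor=2, retimer=blend_retimer)
```

For 240 fps source material, a `factor` of 8, 4 or 2 corresponds to the 30, 60 and 120 fps sequences in the experiment. Any function with the same three arguments could be dropped in as the retimer.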

Unsurprisingly, they found that upsampling becomes much easier as the original frame rate increases, since at high frame rates the motion between frames is much smaller. They were surprised, however, that all the Motion-Based and DNN techniques performed about the same at 30 fps. The Motion-Based retimers outperformed most of the Neural Network algorithms in cases where the camera motion was large and there were textural regions, such as in cinema applications. The Neural Network methods excelled at coping with brightness fluctuations and fine details in motion, but at the expense of requiring massive computational resources. All techniques struggled with reconstructing frame edges, though the Motion-Based techniques came out ahead. Bottom line, the tests found that on average there is no statistical difference between the best Motion-Based retimers and the Neural Network techniques.
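The link between frame rate and difficulty comes down to simple arithmetic, sketched below. The speed used is a hypothetical example (an object crossing the 1280-pixel width of a 720p frame in one second), and the function names are illustrative only.

```python
def displacement_per_frame(speed_px_per_s, fps):
    """Pixels an object moves between two consecutive frames."""
    return speed_px_per_s / fps


def missing_frames_per_gap(target_fps, source_fps):
    """In-between frames a retimer must synthesize for each pair of source frames."""
    return target_fps // source_fps - 1


for fps in (30, 60, 120):
    shift = displacement_per_frame(1280, fps)
    gap = missing_frames_per_gap(240, fps)
    print(f"{fps} fps: {shift:.1f} px between frames, {gap} frames to synthesize")
```

Starting from 30 fps, the retimer has to bridge roughly four times the displacement it faces at 120 fps and fill in seven frames per gap rather than one.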

In predicting that hybrid schemes are the future of moving image frame interpolation, the authors note, “We find that techniques relying principally on Deep Neural Networks do not clearly outperform the classical ideas. It is only with the emergence of hybrid approaches since 2019 that we see DNNs adding significantly to the performance in this space. Despite the hype surrounding DNNs, we find that there is still something left to do.”

Read the complete article in the May issue of the SMPTE Motion Imaging Journal: https://www.smpte.org/motion-imaging-journal
