A recent issue of the SMPTE Motion Imaging Journal presents a paper entitled "Toward Generalized Psychovisual Preprocessing for Video Encoding" by Yiannis Andreopoulos, Mohammad Ashraful Anam, Aaron Chadha, Ilya Fadeev, and Matthias Treder.
The research explores the use of deep perceptual preprocessing to enable bitrate savings across several generations of video encoders, without breaking standards or requiring any changes to client devices. It lays the foundation for a generalized psychovisual preprocessing framework for video encoding.
iSIZE has shown promising results with state-of-the-art AVC, HEVC, and VVC encoders, delivering average bitrate (BD-rate) savings of 11% to 17% with respect to three state-of-the-art reference-based quality metrics (Netflix VMAF, SSIM, and Apple AVQT), as well as the recently proposed no-reference ITU-T Rec. P.1204 metric.
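For readers unfamiliar with the BD-rate figure cited above: the Bjøntegaard delta rate compares two rate-quality curves and reports the average bitrate change at equal quality. A minimal sketch of the standard computation (cubic fit of log-rate versus quality, integrated over the overlapping quality range) is shown below; the function name and inputs are illustrative, not from the paper.

```python
import numpy as np

def bd_rate(rates_ref, quality_ref, rates_test, quality_test):
    """Approximate Bjontegaard delta rate (percent) between two RD curves.

    Negative values mean the test encoder needs less bitrate than the
    reference at the same quality level.
    """
    # Fit log10(bitrate) as a cubic polynomial of the quality score
    p_ref = np.polyfit(quality_ref, np.log10(rates_ref), 3)
    p_test = np.polyfit(quality_test, np.log10(rates_test), 3)

    # Integrate both fits over the overlapping quality interval
    lo = max(min(quality_ref), min(quality_test))
    hi = min(max(quality_ref), max(quality_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    # Average log-rate difference, converted back to a percentage
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (10 ** avg_diff - 1) * 100
```

For example, a test curve whose bitrates are uniformly 10% lower than the reference at every quality point yields a BD-rate of about -10%.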
For a deep dive on the topic you can read the complete paper here.
Additionally, Yiannis presented these findings at the SMPTE Annual Technical Conference (ATC), which you can learn more about here.