by Sergio Grce, iSIZE Technologies
Data Economy, 26 June, 2020
Around the world, the advice surrounding the Covid-19 pandemic is that people should, wherever possible, work from home. Thanks to the ubiquity of high-speed broadband, that seems on the face of it a reasonable request for most people.
It seems reasonable because our use of the internet has grown far beyond emails and messaging. Even before the current crisis, video conferencing was rapidly replacing face-to-face meetings as a way of saving time and reducing carbon footprint.
In today’s unusual circumstances, online meetings are very popular. But it is not just workers who are confined to the home. Schools are also closed, so children are turning to the internet for educational material. Online fitness classes are booming. And, of course, people of all ages confined to the home are turning to streaming video services and online gaming for entertainment.
Before the crisis erupted, Cisco predicted that global internet traffic would reach 4.8 zettabytes a year (in bytes, that is 48 followed by 20 zeros). The significant point, though, is that video, in all its forms, represents at least 80% of that total.
And if we are consuming more video over the internet, whether conferencing or binge watching, then the impact on the network will increase. Openreach, which provides most of the broadband infrastructure in the UK, has seen traffic increases of between 35 and 60% over equivalent times in “normal” weeks. Vodafone says mobile broadband demand has increased by 50%.
EU Commissioner Thierry Breton commented recently that he is concerned that the digital infrastructure could collapse at any time. “Streaming platforms, telecom operators and users, we all have a joint responsibility to take steps to ensure the smooth functioning of the internet during the battle against Coronavirus,” he said.
The nature of digital video is that it is fiercely demanding of bandwidth. The native data rate for HD television is 1.5 gigabits a second; Ultra HD is at least four times that. It is also deterministic: if your television display does not get a new picture every 40 milliseconds you see it all too clearly. Drops and freezes may be acceptable in complex video conferences, but not when you’re watching Narcos or Stranger Things.
Video streams are compressed heavily before passing through either the broadcast or broadband pipe to get to you: a premium HD channel might be four or five megabits a second to your home. That is extremely effective, but the sheer mass of traffic still makes it a challenge for the internet infrastructure.
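To put those figures side by side, here is a rough sketch of the arithmetic. It assumes that the "1.5 gigabits a second" quoted above refers to the HD-SDI serial interface rate (SMPTE 292M), which counts blanking intervals as well as the visible picture; the function names are purely illustrative.

```python
# Back-of-envelope check of the HD figures quoted in the article.
# Assumption: the native rate is the HD-SDI serial rate for 1080-line
# video at 25 frames per second, including blanking.

SAMPLES_PER_LINE = 2640   # total samples per line, including blanking
LINES_PER_FRAME = 1125    # total lines per frame, including blanking
FRAMES_PER_SECOND = 25    # one new picture every 40 milliseconds
BITS_PER_SAMPLE = 20      # 10-bit luma plus 10-bit multiplexed chroma


def native_hd_bitrate() -> int:
    """Uncompressed HD-SDI bit rate in bits per second."""
    return (SAMPLES_PER_LINE * LINES_PER_FRAME
            * FRAMES_PER_SECOND * BITS_PER_SAMPLE)


def frame_interval_ms(fps: int) -> float:
    """Time budget for delivering each picture, in milliseconds."""
    return 1000 / fps


def compression_ratio(native_bps: int, delivered_bps: int) -> float:
    """How many times smaller the delivered stream is than the source."""
    return native_bps / delivered_bps


if __name__ == "__main__":
    native = native_hd_bitrate()
    print(f"Native HD rate: {native / 1e9:.3f} Gb/s")
    print(f"Frame interval: {frame_interval_ms(FRAMES_PER_SECOND):.0f} ms")
    # The article's delivered range for a premium HD channel: 4 to 5 Mb/s.
    for delivered in (4_000_000, 5_000_000):
        ratio = compression_ratio(native, delivered)
        print(f"{delivered / 1e6:.0f} Mb/s stream is about {ratio:.0f}x smaller")
```

On these assumptions the encoder is already discarding roughly 299 bits out of every 300, which is why further savings have to come from somewhere other than simply turning the compression up.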
The codecs used to encode the video signals are, of course, tightly standardized – they have to be for the whole thing to work. Any changes to these standards take years to develop and ratify. Updating the codecs to reduce bandwidth requirements is not an option.
There is the suggestion that users might accept standard definition video streams rather than HD or Ultra HD. Netflix chairman and CEO Reed Hastings tweeted “To secure internet access for all, let’s #SwitchtoStandard definition when HD is not necessary”.
That, though, must be seen as a huge commercial risk. First, consumers have got used to seeing HD quality on the large screens now found in every living room. SD will be seen as very inferior. Longer term, it deeply undermines the push for Ultra HD which the streaming businesses have advocated as giving them a clear advantage over broadcasters.
It also does nothing to limit the boom in video conferencing. Most users have no idea of the resolution of the camera built into their computers, let alone how to modify its parameters. Video conferencing will continue to grab all the bandwidth available because there is no practical means of throttling it.
With a need to maintain perceived quality and no significant reductions in bandwidth from codec developments in sight, the only solution is to pre-process the video before it reaches the encoder. Given perceptually optimized video as its input, the same standard encoder should produce a smaller stream.
For a moment, let us look back 25 years. The challenge then was to stream audio, which required a sustained data rate of 1 to 2 Mb/s. The engineers and mathematicians developing the first MPEG standard applied psychoacoustics to the understanding of human perception of sound. Although purists remain critical, the layer 3 audio coding within MPEG-1, known as MP3 for short, has become universally accepted and used.
What MP3 does is eliminate those parts of the audio signal which most listeners would not miss. It allows the data rate to be slashed to 64 kb/s.
Our work at iSIZE Technologies has found that the same principles can be applied to video. If you determine what people actually see, then you can remove from the video stream the picture detail that is not important. This is a pre-processing stage, prior to encoding into one of the video delivery standards: with less data going into the encoder, less data comes out.
There is extensive academic work on measuring the effectiveness of such video pre-processing. The most widely used metric is VMAF, short for video multi-method assessment fusion. Driven by Netflix, which has a big interest in efficient video streaming, VMAF was developed with researchers at the University of Southern California and the University of Texas.
Through VMAF we have reliable metrics for human visual perception, and therefore a solid foundation on which to develop machine-learning processes to identify the less important parts of the image and to reduce their significance in the video flow. We are already seeing bitrate reductions of between 20 and 40% with no compromise to the visual quality. In fact, in some instances we even improve visual quality as measured by VMAF and other high-level perceptual metrics.
In the long term, cutting 30% from the 80% of internet traffic that is video would reduce total traffic by about 24%, close to a quarter. In the short term, proven video pre-processing algorithms are ready to roll, and could keep the internet alive during a period of unprecedented threat.
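The arithmetic behind that estimate is easy to check. The sketch below uses only figures quoted in this article (an 80% video share, bitrate reductions of 20 to 40%, and Cisco's 4.8 zettabytes a year); the function names are illustrative, and it assumes the reduction applies uniformly to all video traffic, which is an idealization.

```python
# Rough estimate of the overall traffic saving from video pre-processing.
# Assumption: the bitrate reduction applies evenly to every video stream.

VIDEO_SHARE_PCT = 80      # video as a share of all internet traffic
ANNUAL_TRAFFIC_ZB = 4.8   # Cisco's forecast, in zettabytes per year


def overall_saving_pct(video_share_pct: float, video_reduction_pct: float) -> float:
    """Percentage of total traffic saved when video shrinks by the given amount."""
    return video_share_pct * video_reduction_pct / 100


def saved_zettabytes(total_zb: float, saving_pct: float) -> float:
    """Annual data volume saved, in zettabytes."""
    return total_zb * saving_pct / 100


if __name__ == "__main__":
    # Low, middle and high ends of the reduction range quoted above.
    for reduction in (20, 30, 40):
        saving = overall_saving_pct(VIDEO_SHARE_PCT, reduction)
        volume = saved_zettabytes(ANNUAL_TRAFFIC_ZB, saving)
        print(f"{reduction}% smaller video: {saving:.0f}% less traffic, "
              f"about {volume:.2f} ZB a year")
```

Across the quoted range, the saving works out at between 16% and 32% of all internet traffic, with the 30% case giving the 24% figure.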