After seeing the word ‘precoding’ emblazoned on the website of video compression start-up iSize Technologies, Faultline dialed into this week’s briefing with immense skepticism.
An hour later, we left having been mostly convinced that video precoding is much more than a marketing construct – albeit one with opportunities of indeterminate size given the complexities of the video compression landscape.
iSize is coming at the AI-based video encoding market from a different angle, one that does not directly compete with the established encoding vendors or even smaller video encoding firms specialized in machine learning. But that does not necessarily spare iSize’s approach from hostility in these camps, much like our own initial skepticism.
As with anything potentially disruptive, iSize has been met with strong resistance.
The London-based vendor pitches itself as the first company to offer proprietary machine learning technologies for substantial bitrate or quality gains in video compression. iSize sets itself apart on the promise of compatibility with any existing video coding infrastructure – meaning it can boost the compression efficiency of any codec and run on client devices with minimal overheads, while offering additional computational and energy efficiency improvements.
iSize CTO Yiannis Andreopoulos, who doubles up as a professor of data and signal processing systems at UCL, explained to Faultline that there is an opportunity in supplying high quality content to viewers with a lower environmental footprint.
The result is a reduction in streaming bandwidth of an impressive 40% – even before the content hits the encoder.
The iSize BitSave product is based on proprietary deep perceptual optimization and precoding technologies as a preprocessing stage of a standard codec pipeline. As shown in the image below, iSize’s server/encoder-side technology preprocesses the input content with a custom-designed deep neural network – in a way that does not abandon the existing codec pipeline.
Substantial bitrate savings, along with adequate quality improvements (such as a 6 to 10-point increase in VMAF or similar high-level perceptual quality metrics), can therefore be achieved without having to make changes to encoding, packaging, or decoding.
While iSize has coined the term precoding to describe using deep perceptual optimization techniques at the pre-encoding stage, this could quite easily be described as a preprocessing technique. However, Andreopoulos insists this proprietary technology deserves its own descriptive and marketable term. The difference between precoding and preprocessing, he claims, is that iSize’s technology is state-aware – making content look better and making content easier to encode.
“This is a game of enhancement that makes the job of the encoder easier,” he explained.
So what’s stopping these enhancements being achieved inside the encoder? According to Andreopoulos, it is impossible to do what iSize does inside the encoding loop, so neural networks must be trained outside of traditional encoders. Some encoders can achieve something close, he admitted, but not at a deep level.
iSize has identified an opportunity from groups like AOMedia, with substantial financial backing, which are also asking the same question – how do you get deep neural networks to interact with established (AVC, HEVC, AV1, VP9) and emerging (VVC, EVC, AV2) video codecs?
Part of the answer lies in moving away from PSNR to perceptual quality – using open metrics like SSIM (structural similarity index measure) and VMAF (Video Multi-Method Assessment Fusion – developed by Netflix).
iSize is therefore much closer competitively to a company like SSIMWave than to any software or hardware-based encoding provider. While Andreopoulos described SSIM as an amazing metric, he criticized SSIMWave for defining its own bespoke metric and optimizing for this metric – using the analogy that the company is essentially cooking its own food.
SSIMWave is not taking on a healthy, balanced diet, while iSize prefers to use an array of flavors, mixing open metrics in the public domain that help with inspecting third-party data.
That said, SSIMWave recently launched its new Video Quality Dial technology, which – like iSize’s BitSave – also works outside of the encoder. It is positioned as a “smart layer” around a video provider’s existing cloud encoder rather than as a precoder, automatically selecting the lowest possible bitrate that delivers the desired Viewer Score on the 0-100 SSIMPlus scale.
The result is bitrate reduction as high as 50% for VoD content, which SSIMWave says equates to millions of dollars in savings every year for services with north of 5 million subscribers.
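The selection step described above, picking the lowest bitrate that still meets a target quality score, can be sketched in a few lines of Python. This is only an illustration of the general idea, not SSIMWave’s implementation: the function name, the bitrate ladder and the scores are all hypothetical, and a real system would have to measure a score for each candidate encode first.

```python
from typing import Mapping, Optional


def lowest_bitrate_for_score(
    scores_by_bitrate: Mapping[int, float], target_score: float
) -> Optional[int]:
    """Return the lowest bitrate (kbps) whose score meets the target.

    Returns None when no candidate reaches the target score.
    """
    qualifying = [
        bitrate
        for bitrate, score in scores_by_bitrate.items()
        if score >= target_score
    ]
    return min(qualifying) if qualifying else None


# Hypothetical ladder: bitrate in kbps -> quality score on a 0-100 scale.
# These numbers are invented for illustration only.
ladder = {1500: 71.0, 2500: 80.5, 4000: 88.0, 6000: 93.5}

chosen = lowest_bitrate_for_score(ladder, target_score=80.0)
print(chosen)  # 2500
```

In this made-up ladder, a target of 80 is first met at 2,500 kbps, so the higher rungs are never needed for that title.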
SSIMWave and iSize share more than a measurement metric. Neither company can talk about customers yet, which speaks volumes about the difficulty in succeeding in this market. Yes, the big boys are all interested and sniffing around, but you need the backing of these big guns from the get-go. You cannot start with small fry customers and work your way up.
We spoke about the possibility that SSIMWave has already hoovered up much of the addressable market, after SSIMWave boasted to Faultline only a few weeks ago about its tier 1 operator deployment base. Andreopoulos accepted SSIMWave might have cornered a sizable share today, but he sees this as the tip of the iceberg. “We like complementary offerings, and we are using a combination of tools. We are on the server-side only so there is no customization required,” he added.
iSize is targeting cost savings of 3% to 5% every quarter for customers, which is potentially huge when scaled out across all encoders for a major video streaming provider.
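To get a feel for what a quarterly target of that size could add up to, the short Python sketch below compounds the saving over four quarters. The compounding is our assumption, not something iSize stated: it treats each quarter’s saving as applying to the previous quarter’s already-reduced cost, and the function name is hypothetical.

```python
def cumulative_saving(quarterly_rate: float, quarters: int = 4) -> float:
    """Fraction of the original cost saved after compounding a
    per-quarter saving over the given number of quarters.

    Assumes each quarter's saving applies to the already-reduced cost.
    """
    remaining = (1.0 - quarterly_rate) ** quarters
    return 1.0 - remaining


low = cumulative_saving(0.03)   # 3% per quarter
high = cumulative_saving(0.05)  # 5% per quarter

print(f"{low:.1%} to {high:.1%} over one year")
```

Under that assumption, 3% to 5% per quarter works out to roughly 11% to 19% of the original cost over a year.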
While customers are under NDA, iSize can discuss a sizable partnership. With iSize capable of reducing encoding complexity, thereby saving datacenter processing power and energy consumption, none other than Intel recently brought in iSize’s BitSave precoding technology. Results showed up to five times faster performance on Intel’s AI-enabled CPUs – saving $176 per hour for every 5,000 streams.
iSize has worked closely with Intel to optimize its AI models for Xeon Scalable processors with Deep Learning Boost and the Intel Distribution of OpenVINO toolkit, powered by oneAPI.
iSize’s codec-agnostic approach led to LCEVC coming up in conversation – the recently completed MPEG-5 Part 2 standard for reducing computational complexity for any codec. Andreopoulos dismissed any potential competitive overlap with LCEVC, because that compression technology focuses on more traditional PSNR techniques rather than on perceptual quality, and is not a precoding method.
There is no reason why LCEVC and iSize could not exist together in a workflow, applied at different stages where the latter could even make life easier for the former.
iSize is also looking at making its proprietary technology adjustable so it switches itself off, which Andreopoulos compared to what encoding companies call content-adaptive or context-aware encoding. Training your system in this way makes it more robust and comes with very little overhead, especially for live channels.
As for iSize’s machine learning algorithms, the company boasts of being the only outfit on the market with proprietary technology. Andreopoulos explained that the proprietary part lies in how iSize trains the algorithms, in addition to what the encoder will do to the content.
“We see perceptual optimization as an extra dimension when you make it learnable. We want to be deployed across the board and reverse engineer perceptual quality,” he said.
For VoD encoders, he described solving an issue of maniacal encoding, where the constraints of current systems often mean content is encoded on a per-title basis over and over, which churns out great business for VoD encoding companies, but is highly inefficient.
The company, now an 8-person team, was founded in 2016, while Andreopoulos got involved around two years ago, dipping his toes into upscaling before getting his hands dirty in the CTO position as iSize shifted focus by zooming in on perceptual quality about a year and a half ago.
Getting its first big customer over the line is a huge barrier for a startup like iSize, which suffers from the same teething problems that stalled the likes of V-Nova and SSIMWave. The recent successes of these two alone are enough to prevent us writing iSize off.