Creating Perceptually Optimized Videos in the Cloud

iSize Technical Articles

By Russell Anam, iSize Technologies


Encoding and delivery of high-quality video content is a defining trait of successful Video-on-Demand (VoD) platforms.

How do we define quality in video?

Traditionally, the metric for measuring video quality in the broadcast world has been the Peak Signal-to-Noise Ratio (PSNR), which expresses, on a logarithmic scale, the ratio between the peak signal power and the mean squared error introduced in each video frame during encoding. However, it is now widely accepted that PSNR does not match human perception well.

For example, a video with grain noise may have the same PSNR as a blurry video, but the former will score higher in perceptual quality tests than the latter. Also, a given PSNR value, say 35 dB, may mean high quality for one video and very low quality for another. On the other hand, high-level metrics like the Video Multi-Method Assessment Fusion (VMAF) are specifically designed with human perception in mind. In addition, VMAF is a “self-interpretable” metric, i.e. it ranges from 0 to 100 and represents a quality score against the source video.
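To make the limitation concrete, here is a minimal PHP sketch of the conventional PSNR formula for 8-bit samples, 10·log10(255² / MSE). It assumes a frame is given as a flat array of sample values; the function name psnr_db and the tiny four-sample "frames" are purely illustrative and are not part of any BitSave or AWS API.

```php
<?php
// Illustrative helper: PSNR in dB between two equally sized arrays of samples.
function psnr_db(array $reference, array $distorted, int $peak = 255): float
{
    $n = count($reference);
    if ($n === 0 || $n !== count($distorted)) {
        throw new InvalidArgumentException('Frames must be non-empty and equally sized.');
    }
    $sumSquaredError = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $diff = $reference[$i] - $distorted[$i];
        $sumSquaredError += $diff * $diff;
    }
    $mse = $sumSquaredError / $n;
    if ($mse == 0.0) {
        return INF; // identical frames: no error at all
    }
    return 10.0 * log10(($peak * $peak) / $mse);
}

$reference  = [100, 100, 100, 100];
$noiseLike  = [101, 99, 101, 99];   // error alternates in sign
$offsetLike = [101, 101, 101, 101]; // error is a constant shift

// Both distortions have a mean squared error of 1, so PSNR cannot tell them apart.
printf("%.2f dB vs %.2f dB\n", psnr_db($reference, $noiseLike), psnr_db($reference, $offsetLike));
// prints "48.13 dB vs 48.13 dB"
```

The two distorted frames look different to a viewer, yet they receive the same score, which is the kind of mismatch that perceptual metrics such as VMAF try to avoid.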

In this article, we show how to “precode” a video into a pixel representation that, when encoded by a standard cloud-based encoder under typical encoding conditions, yields a higher-quality video than encoding the original video directly with the same encoder and settings.

To complete the process described in this article you will need:

  1. A high-quality source video (if none is available, you can use examples from https://sample-videos.com/)
  2. A BitSave account (which comes free with £2 credit at BitSave.tech)
  3. A commercial cloud encoder (any such encoder can be used, but we’ll be using AWS MediaConvert in this article)

Cloud video encoding: The case of AWS MediaConvert

AWS Elemental MediaConvert is a video encoding service that allows on-demand transcoding of video files for broadcast or video processing. MediaConvert supports a variety of input video formats, Adaptive BitRate (ABR) encoding, Digital Rights Management (DRM) and ad insertion. By integrating with other AWS services through the AWS SDKs, MediaConvert can be part of complex video pipelines for video processing and delivery.


Figure 1: Basic MediaConvert workflow, with optional integrations with the CloudFront and Rekognition AWS services.

Figure 1 shows a basic MediaConvert workflow. The service ingests input video source files either through S3, or through a non-authenticated public HTTP/S URL. The transcoding can be triggered either on demand through the MediaConvert API, or automatically through a Lambda function when a video file is uploaded to the input S3 folder. After the transcoding is completed, the output files are written to an S3 folder and the result is logged in CloudWatch.

Optionally, the output bucket can be configured as an origin for the CloudFront content distribution network to provide video streaming services to end users. Other optional integrations can use Lambda functions to automatically trigger other AWS services, such as the AWS Rekognition service to perform object recognition.


Figure 2: Integration of the iSize BitSave precoding with the MediaConvert ecosystem

iSize BitSave offers seamless and transparent integration with the AWS MediaConvert workflow by operating as an overlay precoding layer between the video producer and the AWS platform, as shown in Figure 2. Instead of feeding the input video to MediaConvert, the video is fed to the iSize BitSave platform via a public HTTP/S or Dropbox URL, and BitSave produces an intermediate video file. By providing the appropriate IAM (Identity & Access Management) permissions to the BitSave API, the precoded video can be uploaded directly to the MediaConvert S3 input folder.

After this point, the rest of the MediaConvert workflow remains exactly the same. The same Lambda functions or the same API scripts will trigger the MediaConvert transcoding and post-processing services. Additionally, the iSize precoded video is compatible with all the MediaConvert presets and supported input formats and codecs.

Therefore, integrating the iSize precoder into a MediaConvert workflow only requires supplying the source video to the BitSave API. Alternatively, iSize BitSave can push the precoded video file back to the MediaConvert client, which then follows the same uploading and encoding process as before. In this scenario, the video producer does not need to provide IAM permissions to the BitSave platform.

In this article, we show how you can supply a public URL and IAM credentials to the BitSave precoding engine to generate the precoded video in your AWS bucket. You will then call the AWS MediaConvert API to encode this precoded video to your desired format. This will give you the basic skills to automate the entire process. You can then customize it to your needs, including converting it to a Lambda function for even further automation.

High quality source video

The goal of encoding a video is to create a compressed representation that, with the smallest number of bits, can be decoded to produce a video that is as close as possible in perceptual quality to the original high-quality (source) video. Since encoding cannot improve perceptual quality, it is usually best to start with the best-quality video available, so we recommend starting with a source video of as high a bitrate and quality as possible. We will use the publicly available Big Buck Bunny video in this tutorial.

Precoding

To use the iSize BitSave precoding engine, you first need to register and create a free BitSave account.

Just register with an email address and password and you should receive a verification email within a few seconds. Click the link in the email to open the BitSave dashboard and you are ready to proceed with the free £2 credit. The dashboard allows you to upload, precode and even encode videos using the web interface. To automate things, however, we will use the BitSave API to push and retrieve video files from the BitSave SaaS site. To use the API, you need the access and API keys and the example source code. Both are easily obtained by clicking the API button on the UI.

We will grab the API example files from the link as well as generate a new access and API key with our current password.

Please save the keys in a secure location. You will need to provide these keys with every API request sent to BitSave.

Normally BitSave precodes and saves the file internally. But in this example, you want the precoded output to be copied to your own AWS S3 bucket. For this to be possible we will need to create an AWS IAM user with Programmatic access as discussed in this article and use the credentials (access key and secret key) of that account so that BitSave has the permission to push the intermediate output to your bucket.

The first step in setting up a new job via the BitSave API is setting the key values in your scripts. PHP sample code for this can look like the following:

$accessKeyId = 'YOUR_BITSAVE_ACCESS_KEY'; 
$secretKey = 'YOUR_BITSAVE_API_KEY';
$bitsave_api_endpoint="https://api.bitsave.tech/encode/1.01/";


You also set the API endpoint that has the format of https://api.BitSave.tech/encode/API_VERSION/ and at the time of writing the API_VERSION is 1.01. The next phase is setting up the job details in an array:

$jobSetting = [
    "OutputGroups" => [
        [
            "OutputGroupSettings" => [      
                "FileGroupSettings" => [
                    "Destination_Type" => "s3", 
                    "Destination" => "s3://YOUR_AWS_BUCKET/"
                ]
            ],
            "AWSSettings" => [
                "AccessKey" => "YOUR_AWS_ACCESS_KEY",
                "SecretKey" => "YOUR_AWS_SECRET_KEY",
                "Region" => "YOUR_AWS_BUCKET_REGION",
            ],
            "Outputs" => [
                [
                    "VideoDescription" => [
                        "ColorSpace" => "8", 
                        "CodecSettings" => [
                            "Codec" => "R&D", 
                            "Settings" => [
                                "GopSize" => 90, 
                                "GopSizeUnits" => "FRAMES", 
                                "FramerateControl" => "INITIALIZE_FROM_SOURCE",      
                            ]
                        ],
                    ],
                    "AudioDescriptions" => [
                        [
                            "AudioTypeControl" => "FOLLOW_INPUT", 
                            "CodecSettings" => [
                                "Codec" => "AAC", 
                                "AacSettings" => [
                                    "Specification" => "MPEG4", 
                                    "Bitrate" => 128000
                                ]
                            ],
                            "AudioSourceName" => "Audio Selector 1" 
                        ]
                    ],
                    "ContainerSettings" => [
                        "Container" => "MP4", 
                    ],
                    "Extension" => "mp4", 
                ]
            ]
        ]
    ],
    "Inputs" => [
        [
            "FileInput" => "http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_30fps_normal.mp4"
        ]
    ],
];

Replace the AWS key and bucket details with your own values. You then simply enclose this array as POST data, merge it with the authentication tokens and send it to the BitSave endpoint. BitSave supports the following codecs: H_264, H_264_M4V, MPEG2VIDEO, H_265, VP9, PRORES, PRECODER and R&D.

The R&D codec provides the highest quality (nearly lossless) but is limited to a maximum video length of 15 minutes. When using the BitSave SaaS purely as a precoder, we suggest R&D for videos shorter than 15 minutes and PRECODER for longer videos. The enclosed BitSave_gen_auth.php generates the authentication tokens automatically, so you don’t need to do anything. However, if you are implementing this in another programming language, you will have to write a similar authentication mechanism for your scripts.

$post_data=[
    "Settings" => $jobSetting, 
];

$post_data=array_merge($post_data, $auth_fields);
$response=http_post_url($bitsave_api_endpoint, $post_data);

The response is JSON containing either the FileInput, JobID and FileOutput link, or an error message. You will also get an HTTP 401 Unauthorized response if you use invalid keys.

Precoding is the most computationally intensive part of the process, so it takes some time to complete. Your file is not available until the whole process is complete. You can check the status of your job using the job ID returned in the previous step with the following code:
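A response handler along these lines can make the success and failure cases explicit. This is only a sketch: it assumes you have the raw response body and the HTTP status code at hand (how http_post_url exposes them is not shown in this article), and the "Error" key is a hypothetical name for the error message field. The function name parse_bitsave_response and the sample JSON values are also illustrative.

```php
<?php
// Sketch: interpret a BitSave job-creation response.
// FileInput, JobID and FileOutput are the fields named in the article;
// the "Error" key is an ASSUMED name for the error message field.
function parse_bitsave_response(string $body, int $httpStatus): array
{
    if ($httpStatus === 401) {
        throw new RuntimeException('HTTP 401: check your BitSave access and API keys.');
    }
    $data = json_decode($body, true);
    if (!is_array($data)) {
        throw new RuntimeException('Response body is not a JSON object.');
    }
    if (!isset($data['JobID'])) {
        $message = $data['Error'] ?? 'unknown error';
        if (!is_string($message)) {
            $message = json_encode($message);
        }
        throw new RuntimeException('BitSave reported an error: ' . $message);
    }
    return $data;
}

// Usage with a made-up response body:
$sample = '{"FileInput":"http://example.com/in.mp4","JobID":"abc123","FileOutput":"s3://YOUR_AWS_BUCKET/out.mp4"}';
$job = parse_bitsave_response($sample, 200);
echo "Queued BitSave job {$job['JobID']}\n";
```

Throwing an exception on every failure path means the calling script cannot accidentally continue with a missing job ID.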

$function="jobs";
$job_id="YOUR_BITSAVE_JOB_ID";
$endpoint_url=$bitsave_api_endpoint.$function.'/'.$job_id;
$post_data=$auth_fields;
$response=http_post_url($endpoint_url, $post_data);

The API endpoint can take on additional functions when data is passed in the format BITSAVE_API_ENDPOINT/FUNCTION/DATA. We modify the endpoint value so it looks like BITSAVE_API_ENDPOINT/jobs/JOB_ID, so that it returns the relevant data about that particular job. Again, by merging it with the authentication tokens and posting it, we get a JSON response about that job.

The returned fields are FileInput, JobID, FileOutput, Status, Phase, Start_Time and End_Time. When the “Status” field returns “COMPLETE”, the precoding is complete and the precoded file has been uploaded to your S3 bucket. By default, BitSave returns a one-time download link to your file in internal BitSave storage, but since you started the job specifying your own S3 location, the returned link will point to the file in your S3 bucket (s3://YOUR_BUCKET/precoded_file_name).
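The URL pattern can be wrapped in a small helper so that every script builds it the same way. This is a sketch; the function name bitsave_endpoint is illustrative and is not part of the BitSave sample code.

```php
<?php
// Sketch: build BITSAVE_API_ENDPOINT/FUNCTION/DATA from its parts.
function bitsave_endpoint(string $base, string $function = '', string $data = ''): string
{
    // Normalise to exactly one trailing slash on the base URL.
    $url = rtrim($base, '/') . '/';
    if ($function !== '') {
        $url .= rawurlencode($function);
        if ($data !== '') {
            $url .= '/' . rawurlencode($data);
        }
    }
    return $url;
}

echo bitsave_endpoint('https://api.bitsave.tech/encode/1.01/', 'jobs', 'YOUR_BITSAVE_JOB_ID'), "\n";
// prints "https://api.bitsave.tech/encode/1.01/jobs/YOUR_BITSAVE_JOB_ID"
```

Encoding each path segment with rawurlencode guards against a job ID that happens to contain characters with a special meaning in URLs.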
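Checking the status by hand quickly becomes tedious, so a polling loop is a natural next step. The sketch below assumes you wrap the status request in a callable that returns the decoded job fields; the name wait_for_job, the attempt limit and the failure status values are assumptions (only “COMPLETE” is documented in this article), and the default interval follows the 10 minutes recommended later in this article.

```php
<?php
// Sketch: poll a job until its Status field is "COMPLETE".
// $fetchStatus is any callable returning the decoded job fields as an array.
// $failureStatuses are ASSUMED values; adjust them to what the API really returns.
function wait_for_job(
    callable $fetchStatus,
    int $intervalSeconds = 600,
    int $maxAttempts = 36,
    array $failureStatuses = ['ERROR', 'CANCELED']
): array {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $job = $fetchStatus();
        $status = $job['Status'] ?? null;
        if ($status === 'COMPLETE') {
            return $job;
        }
        if (in_array($status, $failureStatuses, true)) {
            throw new RuntimeException("Job failed with status {$status}.");
        }
        // Wait before the next check, but not after the final attempt.
        if ($attempt < $maxAttempts && $intervalSeconds > 0) {
            sleep($intervalSeconds);
        }
    }
    throw new RuntimeException("Job did not complete after {$maxAttempts} checks.");
}

// Usage with a stand-in fetcher that completes on the third check.
$checks = 0;
$fakeFetcher = function () use (&$checks): array {
    $checks++;
    return ['Status' => $checks < 3 ? 'PROGRESSING' : 'COMPLETE', 'FileOutput' => 's3://YOUR_BUCKET/precoded_file_name'];
};
$finished = wait_for_job($fakeFetcher, 0); // interval 0 only for this demonstration
echo "Precoded file: {$finished['FileOutput']}\n";
```

In a real script the callable would post the authentication fields to the jobs endpoint and decode the JSON response.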

Now that you have your perceptually higher-quality video, you need to encode it using a cloud video encoder. We will use AWS MediaConvert in this article, but you are of course encouraged to use any cloud encoder, including BitSave itself.

To encode video using AWS MediaConvert you will need the AWS IAM user credentials you generated earlier, and you will also need to create an IAM “Role” for MediaConvert. Details about creating the IAM role can be found in this article. When you have all these, create a new credentials variable and a MediaConvert client, supplying it with the credentials and your AWS region:

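// Load the AWS SDK for PHP (installed with Composer) and import the classes used below.
require 'vendor/autoload.php';
use Aws\Credentials\Credentials;
use Aws\Exception\AwsException;
use Aws\MediaConvert\MediaConvertClient;

$credentials = new Credentials('YOUR_AWS_ACCESS_KEY', 'YOUR_AWS_SECRET_KEY');
$client = new MediaConvertClient([
    'credentials' => $credentials,
    'version' => '2017-08-29',
    'region' => 'YOUR_AWS_REGION'
]);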

The next step is to find your MediaConvert endpoint. This is a static endpoint associated with your account that you need to look up via the API. The endpoint doesn’t change, so you only need to find it once and can then put the same value in every script.

try {
    // Ask the regional service for the account-specific endpoint.
    $result = $client->describeEndpoints([]);
    $single_endpoint_url = $result['Endpoints'][0]['Url'];
} catch (AwsException $e) {
    // Stop here: the following steps cannot run without the endpoint.
    exit($e->getMessage());
}

With the actual endpoint, you can now create the proper MediaConvert client and then populate the job array just like you did before:

$mediaConvertClient = new MediaConvertClient([
    'credentials' => $credentials,
    'version' => '2017-08-29',
    'region' => 'YOUR_AWS_REGION',
    'endpoint' => $single_endpoint_url
]);
$jobSetting = [
    "OutputGroups" => [
        [
            "Name" => "File Group",
            "OutputGroupSettings" => [
                "Type" => "FILE_GROUP_SETTINGS",
                "FileGroupSettings" => [
                    "Destination" => "s3://YOUR_AWS_BUCKET/"
                ]
            ],
            "Outputs" => [
                [
                    "VideoDescription" => [
                        "CodecSettings" => [
                            "Codec" => "H_264",
                            "H264Settings" => [
                                "InterlaceMode" => "PROGRESSIVE",
                                "NumberReferenceFrames" => 3,
                                "Syntax" => "DEFAULT",
                                "Softness" => 0,
                                "GopClosedCadence" => 1,
                                "GopSize" => 90,
                                "Slices" => 1,
                                "SpatialAdaptiveQuantization" => "ENABLED",
                                "TemporalAdaptiveQuantization" => "ENABLED",
                                "FlickerAdaptiveQuantization" => "DISABLED",
                                "EntropyEncoding" => "CABAC",
                                "MaxBitrate" => 5000000,
                                "FramerateControl" => "INITIALIZE_FROM_SOURCE",
                                "RateControlMode" => "QVBR",
                                "QvbrSettings" => [
                                    "QvbrQualityLevel" => 9
                                ],
                                "CodecProfile" => "MAIN",
                                "Telecine" => "NONE",
                                "MinIInterval" => 0,
                                "AdaptiveQuantization" => "HIGH",
                                "CodecLevel" => "AUTO",
                                "FieldEncoding" => "PAFF",
                                "SceneChangeDetect" => "ENABLED",
                                "QualityTuningLevel" => "SINGLE_PASS",
                                "GopSizeUnits" => "FRAMES",
                            ]
                        ],
                        "DropFrameTimecode" => "ENABLED",
                        "ColorMetadata" => "INSERT"
                    ],
                    "AudioDescriptions" => [
                        [
                            "AudioTypeControl" => "FOLLOW_INPUT",
                            "CodecSettings" => [
                                "Codec" => "AAC",
                                "AacSettings" => [
                                    "AudioDescriptionBroadcasterMix" => "NORMAL",
                                    "RateControlMode" => "CBR",
                                    "CodecProfile" => "LC",
                                    "CodingMode" => "CODING_MODE_2_0",
                                    "RawFormat" => "NONE",
                                    "SampleRate" => 48000,
                                    "Specification" => "MPEG4",
                                    "Bitrate" => 128000
                                ]
                            ],
                            "LanguageCodeControl" => "FOLLOW_INPUT",
                            "AudioSourceName" => "Audio Selector 1"
                        ]
                    ],
                    "ContainerSettings" => [
                        "Container" => "MP4",
                        "Mp4Settings" => [
                            "CslgAtom" => "INCLUDE",
                            "FreeSpaceBox" => "EXCLUDE",
                            "MoovPlacement" => "PROGRESSIVE_DOWNLOAD"
                        ]
                    ],
                    "Extension" => "mp4",
                    "NameModifier" => "_encoded"
                ]
            ]
        ]
    ],
    "AdAvailOffset" => 0,
    "Inputs" => [
        [
            "AudioSelectors" => [
                "Audio Selector 1" => [
                    "Offset" => 0,
                    "DefaultSelection" => "NOT_DEFAULT",
                    "ProgramSelection" => 1,
                    "SelectorType" => "TRACK",
                    "Tracks" => [
                        1
                    ]
                ]
            ],
            "VideoSelector" => [
                "ColorSpace" => "FOLLOW"
            ],
            "FilterEnable" => "AUTO",
            "PsiControl" => "USE_PSI",
            "FilterStrength" => 0,
            "DeblockFilter" => "DISABLED",
            "DenoiseFilter" => "DISABLED",
            "TimecodeSource" => "EMBEDDED",
            "FileInput" => "S3_PATH_TO_BITSAVE_PRECODED_FILE"
        ]
    ],
    "TimecodeConfig" => [
        "Source" => "EMBEDDED"
    ]
];

The main values you set here are:

  • Codec: H_264
  • FileInput: The output of the BitSave precoder, which is the input here
  • RateControlMode => QVBR: Quality-defined variable bitrate, the AWS rate control mode that targets a set quality level (QvbrQualityLevel) at the smallest file size, without exceeding MaxBitrate
  • Destination: The S3 output bucket for this file in the format s3://BUCKET_NAME/
  • MaxBitrate: The maximum bitrate allowed for the encoded output video
  • NameModifier: The suffix string added to the input filename to name the output
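To see how Destination, NameModifier and Extension combine, the sketch below predicts the output location. It assumes MediaConvert names a file-group output as the input base name plus the name modifier plus the extension when the destination ends in a slash; the helper name expected_output_path and the sample file name are illustrative, so verify the result against an actual job.

```php
<?php
// Sketch: predict the S3 path of the encoded file.
// ASSUMPTION: output name = input base name + NameModifier + "." + Extension.
function expected_output_path(
    string $destination,
    string $fileInput,
    string $nameModifier,
    string $extension
): string {
    // Base name of the input without its directory and extension.
    $baseName = pathinfo($fileInput, PATHINFO_FILENAME);
    return rtrim($destination, '/') . '/' . $baseName . $nameModifier . '.' . $extension;
}

echo expected_output_path(
    's3://YOUR_AWS_BUCKET/',
    's3://YOUR_AWS_BUCKET/precoded_file_name.mp4',
    '_encoded',
    'mp4'
), "\n";
// prints "s3://YOUR_AWS_BUCKET/precoded_file_name_encoded.mp4"
```

Because the precoded input and the encoded output share the same bucket in this tutorial, the name modifier is what keeps the two files apart.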

You then create and queue the job:

try {
    $result = $mediaConvertClient->createJob([
        "Role" => "arn:aws:iam::".$aws_iam.":role/".$aws_role,
        "Settings" => $jobSetting, 
        "Queue" => "arn:aws:mediaconvert:YOUR_AWS_REGION:YOUR_ACCOUNT_ID:queues/Default",
    ]);
} catch (AwsException $e) {
    echo $e->getMessage();
}

Replace the region and account ID with those of your AWS account. You can find the account ID on the page of the role you created earlier. If the job was queued successfully, you will find the job ID in the following returned variable:

$job_id=$result["Job"]["Id"];
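The Role and Queue values are both ARNs built from your account details, and a typo in either one is a common reason for a job to be rejected. The helpers below are a sketch of how they can be assembled in one place; the function names and the sample account ID, region and role name are placeholders, not values from this article's code bundle.

```php
<?php
// Sketch: assemble the ARNs passed to createJob from their parts.
function assert_account_id(string $accountId): void
{
    // AWS account IDs are 12-digit numbers.
    if (!preg_match('/^\d{12}$/', $accountId)) {
        throw new InvalidArgumentException('AWS account ID must be 12 digits.');
    }
}

function mediaconvert_role_arn(string $accountId, string $roleName): string
{
    assert_account_id($accountId);
    return "arn:aws:iam::{$accountId}:role/{$roleName}";
}

function mediaconvert_queue_arn(string $region, string $accountId, string $queue = 'Default'): string
{
    assert_account_id($accountId);
    return "arn:aws:mediaconvert:{$region}:{$accountId}:queues/{$queue}";
}

// Placeholder values for illustration only.
echo mediaconvert_role_arn('123456789012', 'YOUR_MEDIACONVERT_ROLE'), "\n";
echo mediaconvert_queue_arn('eu-west-1', '123456789012'), "\n";
```

Validating the account ID up front turns a vague API error into an immediate, readable exception.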

Just like precoding, the encoding process will take some time to complete. You can periodically check the status of the job using the following code:

try {
    $result = $mediaConvertClient->getJob([
        'Id' => "AWS_MEDIACONVERT_JOB_ID",
    ]);
    // Possible values include SUBMITTED, PROGRESSING, COMPLETE, CANCELED and ERROR.
    $job_status = $result["Job"]["Status"];
} catch (AwsException $e) {
    echo $e->getMessage();
}

When the returned job status is “COMPLETE”, your final encoded file is available in your AWS bucket.

You can find the complete code for this article here. You can follow the process described above with the supplied code by doing the following:

  • Edit the bitsave_credentials.php file, replacing the variable values with your own access and API keys for both BitSave and AWS.
  • Edit the aws_credentials.php file, replacing the variable values with your own AWS access and secret keys, AWS bucket name, region, account ID (IAM) and MediaConvert role.
  • Edit bitsave_createjob.php, changing the value of the $bitsave_fileinput variable to the URL of the file you want to encode.
  • Run bitsave_createjob.php. If the job is queued successfully, it will output the BitSave job ID of the video to precode.
  • Edit bitsave_getjob.php, changing the value of $job_id to the one found in the previous step.
  • Run bitsave_getjob.php periodically. It will show you the status of the job. When the job is complete, it will output the S3 path of the file, which will be the input of the next script. Like all API gateways, both BitSave and AWS limit the number of API requests you can make, so leave plenty of time between runs. The recommended interval is 10 minutes.
  • Edit aws_createjob.php, putting the S3 path from the previous step into the variable $aws_input_file.
  • Run aws_createjob.php. If the job was queued successfully, it will output “Job xxxxx-xxx queued successfully.”
  • Put this job ID in aws_getjob.php to get the status of the job. Run this periodically just like before. When the job is COMPLETE, your encoded file is available in the output S3 bucket.

Another set of code is available here that automates much of the above functionality by saving the intermediate values in a SQLite database. You can use any database; we show a SQLite example only because it is easy to use, portable and requires little or no installation. The PHP SQLite3 extension must be enabled:

Windows (uncomment the following in php.ini):

extension=sqlite3

Linux Distro (Ubuntu):

sudo apt-get install php-sqlite3
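As a rough idea of what such bookkeeping can look like, the sketch below stores one row per source video and updates it as the job moves from precoding to encoding. The table layout, column names, status labels and function names are all hypothetical and are not taken from the downloadable code; it uses an in-memory database so it can be run as-is once the sqlite3 extension is enabled.

```php
<?php
// Sketch: track intermediate job values in SQLite (hypothetical schema).
$db = new SQLite3(':memory:'); // use a file path such as 'jobs.sqlite' to persist
$db->exec('CREATE TABLE IF NOT EXISTS jobs (
    source_url     TEXT PRIMARY KEY,
    bitsave_job_id TEXT,
    precoded_path  TEXT,
    aws_job_id     TEXT,
    status         TEXT NOT NULL
)');

// Insert a new row, or overwrite the existing row for the same source URL.
function save_job(SQLite3 $db, array $job): void
{
    $stmt = $db->prepare('INSERT OR REPLACE INTO jobs
        (source_url, bitsave_job_id, precoded_path, aws_job_id, status)
        VALUES (:source_url, :bitsave_job_id, :precoded_path, :aws_job_id, :status)');
    foreach (['source_url', 'bitsave_job_id', 'precoded_path', 'aws_job_id', 'status'] as $column) {
        $stmt->bindValue(':' . $column, $job[$column] ?? null);
    }
    $stmt->execute();
}

// Return the stored row for a source URL, or null if there is none.
function load_job(SQLite3 $db, string $sourceUrl): ?array
{
    $stmt = $db->prepare('SELECT * FROM jobs WHERE source_url = :source_url');
    $stmt->bindValue(':source_url', $sourceUrl);
    $row = $stmt->execute()->fetchArray(SQLITE3_ASSOC);
    return $row === false ? null : $row;
}

// Usage: record the precoding job, then update the row once precoding is done.
$url = 'http://example.com/source.mp4';
save_job($db, ['source_url' => $url, 'bitsave_job_id' => 'abc123', 'status' => 'PRECODING']);
$job = load_job($db, $url);
$job['precoded_path'] = 's3://YOUR_AWS_BUCKET/precoded_file_name';
$job['status'] = 'PRECODED';
save_job($db, $job);
echo load_job($db, $url)['status'], "\n";
```

With the state stored this way, a single scheduled script can look up each row and decide whether to poll BitSave, start the MediaConvert job or poll MediaConvert.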

Happy testing & encoding, and get in touch with info@isize.co for any questions!