HTML 5 Video and Images for Web

I like being efficient with my bits and bytes. I bought into the webp image format train pretty early, adopting it when I made this blog.
Recently I made a video longer than 2 seconds, so I started looking properly into codecs like VP9 and AV1.
Let’s take a look at some video and image codecs and how the shiny new ones can be used.

EDIT 28-06-22: AV1 is actually viable now! #

As of FFmpeg v5, the SVT-AV1 encoder is available, and it’s fast enough to be usable.
It’s still slower than the others, but it gives smaller sizes for similar quality.

ffmpeg -i input.mp4 -b:v 0 -c:v libsvtav1 -qp 50 -preset 3 out_AV1.mp4

-qp is what CRF is currently called, and it will become -crf in FFmpeg 5.1. The preset default is 13, which is SUPER fast but poor quality. Lower numbers are slower, but look better.

EDIT 11-9-20: AV1 is faster now! #

It feels like many changes have been made, and FFmpeg 4.3.1 now encodes AV1 faster than before (subjectively at least).
Still takes a while, but not a crushingly long time.
See here.


I want to avoid sending an 8mb png or a 70mb video down to the client if I can.
New compression formats such as AV1 and webp might be able to help with that.

Quick aside: MP4/MKV are containers, boxes that hold streams encoded with codecs (mp3, h264, VP9).
Codecs are the things I’m looking at here.

To be clear, all of this is based off this post. Much better comparisons than mine.

TL;DR #

Images and Videos on the web #

HTML5 elements support newer formats by letting the developer list several formats, so the client only pulls the bytes for the first format it supports.

HTML5 Images #

<picture>
    <source srcset="path-to-image.webp" type="image/webp">
    <source srcset="path-to-image.png" type="image/png">
    <img src="path-to-image.png" alt="Description of the image">
</picture>

The above lets the client pull the webp if it’s supported, without downloading the png. If webp isn’t supported by the browser, it’ll fall back to the png. In that case, the webp won’t be downloaded at all, as the browser knows it doesn’t support it.

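That img tag is required. The source tags only offer alternative files; the img is the element that actually gets displayed, and browsers that don’t understand picture at all fall back to it.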

HTML5 Video #

<video autoplay muted playsinline loop>
    <source src="path-to-AV1.mp4" type="video/mp4; codecs=av01.0.05M.08">
    <source src="path-to-VP9.webm" type="video/webm; codecs=vp9">
    <source src="path-to-H264.mp4" type="video/mp4">
    This message is displayed when none are supported
</video>

Similarly to the picture tag, this lets the browser download only the first video, in order of declaration, whose type it supports.
That weird codec string is how the browser identifies the AV1 codec.
This tag also supports those usual attributes like autoplay and loop.
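For the curious, that codec string follows the av01.P.LLT.DD pattern: profile, level index plus a tier letter, then bit depth. A minimal sketch that splits it apart; describe_av1 is a made-up helper name, and it assumes the short four-field form with no optional trailing fields:

```shell
#!/bin/sh
# describe_av1 CODEC_STRING -> print the fields of an "av01.P.LLT.DD" string.
# Hypothetical helper for illustration; it does not validate the values.
describe_av1() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split on dots: av01 / P / LLT / DD
    IFS=$old_ifs
    level_tier=$3
    printf 'profile=%s level_idx=%s tier=%s bit_depth=%s\n' \
        "$2" "${level_tier%?}" "${level_tier#??}" "$4"
}

describe_av1 av01.0.05M.08
# prints: profile=0 level_idx=05 tier=M bit_depth=08
```

Profile 0 is the Main profile, level index 05 corresponds to level 3.1, M is the Main tier, and 08 means 8 bits per sample.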

Now that we’ve seen the components that can use new shiny things, let’s look at the new shiny things.

The Setup #

Video Codecs: AV1, h264, VP9
Image Codecs: webp, png, jpg
Video Sources: PNGs from a Blender animation: ~8mb each, 1100 of them. Result
Image Sources: The Floating Rock render PNG and Sword
Tools: FFmpeg and the Webp tools from Google, because they’re common, simple, and reasonably well documented.

Edit: Using FFmpeg 4.2.1

The Plan #

I want to provide the best tradeoff between size and quality to readers, while still providing compatibility to those without the shiny toys (often mobile users).
This involves having a widely supported codec available, but providing better ones to those who can use them.
It’s encoding time!

Image Codecs #

Generally speaking, PNGs are a lot of unnecessary bytes to transfer, and while jpg is fewer bytes, it suffers visually.
The only one I really considered was Webp, as AVIF isn’t supported yet and webp is widely supported.

Webp #

Webp is a Google-driven format based on their VP8 video compression (the lossy mode is still VP8, not VP9), just applied to a single image.
It’s USUALLY a fair bit smaller than a jpg with better quality.
I say usually because sometimes it can be larger, but generally it’s smaller.

Results #

Floating Rock

Format | Size (kb)
png    | 3599
jpg    | 558
webp   | 311

Sword

Format | Size (kb)
png    | 8022
jpg    | 209
webp   | 87
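
To put those sizes in relative terms, here is a small sketch that works out how much smaller each webp is than the png and jpg it replaces. The saving helper is a made-up name for illustration; the sizes are the ones listed above.

```shell
#!/bin/sh
# saving BEFORE_KB AFTER_KB -> percentage saved, to one decimal place.
# "saving" is a hypothetical helper, not part of cwebp or any other tool.
saving() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f", (1 - after / before) * 100 }'
}

echo "Floating Rock: webp saves $(saving 3599 311)% vs png, $(saving 558 311)% vs jpg"
echo "Sword:         webp saves $(saving 8022 87)% vs png, $(saving 209 87)% vs jpg"
```

For these two images that comes out at roughly 91% and 99% smaller than the pngs, and 44% and 58% smaller than the jpgs.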

Command Line #

cwebp sword.png -o sword.webp -mt -m 6 -pass 10 -q 90

Conclusion #

I’m definitely using webp. There’s no reason not to. Occasionally there’s one that’s larger than the png by a bit, but usually it’s significantly smaller and has fewer blocky artifacts than JPG.

Video #

Codec | Adoption | Size | Quality | Encoding Speed | Notes
h264 | Common and widespread | Reference | Good | Good | Good balance between encoding speed, filesize, and quality
VP9 | Generally well supported | Often 10x smaller | Decent | Much slower | Much smaller with limited hit to quality, but much slower encode speed.
AV1 | Desktop, not mobile | Often 10x smaller | Good | Sooo much slower | Smaller, better quality, but limited support and really slow encode time

There are more interesting tradeoffs for video, since you don’t want to sit around until the heat death of the universe waiting for that perfect AV1 encode.
AV1 is really slow to encode right now, like 500x slower. VP9 is faster, but does suffer from jpg-like blocking artifacts at lower bitrates.

AV1 10 and 12 bit colour don’t seem to work in browsers. The stutter is real. The PNGs I had were apparently 12 bit colour (yuv444p12le), and the default for FFmpeg was to use the input pixel format. Use -pix_fmt yuv444p (or yuv420p/yuv422p) to work around this.
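
One way to catch this before a long encode is to ask ffprobe what pixel format a source frame has and flag the high bit depth ones. A rough sketch: is_high_bit_depth and source_pix_fmt are made-up helper names, the check only pattern-matches FFmpeg’s pixel format names (10, 12, 14 or 16 followed by le/be), so treat it as a heuristic, and 0001.png is just an assumed name for the first frame.

```shell
#!/bin/sh
# is_high_bit_depth PIX_FMT -> exit status 0 if the name looks like a
# 10/12/14/16-bit format (e.g. yuv444p12le), 1 otherwise (e.g. yuv420p).
# Name-based heuristic only; hypothetical helper for illustration.
is_high_bit_depth() {
    case "$1" in
        *1[0246]le|*1[0246]be) return 0 ;;
        *) return 1 ;;
    esac
}

# source_pix_fmt FILE -> print the pixel format of the first video stream.
source_pix_fmt() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=pix_fmt \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Only probe when ffprobe and the (assumed) first frame are actually there.
if command -v ffprobe >/dev/null 2>&1 && [ -f 0001.png ]; then
    fmt=$(source_pix_fmt 0001.png)
    if is_high_bit_depth "$fmt"; then
        echo "$fmt: high bit depth, add -pix_fmt with an 8-bit format to the encode"
    else
        echo "$fmt: no high bit depth suffix spotted"
    fi
fi
```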

Codec | Size (mb) | Encode Time (mins)
h264  | 15.6      | 2
VP9   | 4.86      | 12
AV1   | 2.06      | 140
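
As a quick sanity check on those numbers, here is a sketch that turns them into ratios against h264. It assumes the table reads as h264 15.6 mb in 2 minutes, VP9 4.86 mb in 12 minutes, and AV1 2.06 mb in 140 minutes; ratio is a made-up helper name.

```shell
#!/bin/sh
# ratio A B -> A divided by B, to one decimal place.
# "ratio" is a hypothetical helper used only for this illustration.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", a / b }'
}

# Sizes in mb and encode times in minutes, as read from the table above.
echo "VP9: $(ratio 15.6 4.86)x smaller than h264, $(ratio 12 2)x longer to encode"
echo "AV1: $(ratio 15.6 2.06)x smaller than h264, $(ratio 140 2)x longer to encode"
```

For this one clip that is about 3.2x smaller at 6x the encode time for VP9, and about 7.6x smaller at 70x the encode time for AV1.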

Command lines #

Note that this is slightly different from the usual commands around, since I’m using an image sequence instead of another video.

Common notes #

H264 #

ffmpeg -framerate 30 -i %04d.png -vf scale=-1:720:flags=lanczos -c:v libx264 -b:v 0 -crf 35 -movflags +faststart output.mp4  

VP9 #

ffmpeg -framerate 30 -i %04d.png -vf scale=-1:720:flags=lanczos -c:v libvpx-vp9 -b:v 0 -crf 35 -deadline best -row-mt 1 -tile-columns 2 -threads 8 output.webm

I’ve had a lot better results from two-pass in target bitrate mode.

VP9 Two Pass #

Pass 1:

ffmpeg -framerate 30 -i %04d.png -vf scale=-1:720:flags=lanczos -c:v libvpx-vp9 -b:v 800k -pass 1 -f webm emptyfile

Pass 2:

ffmpeg -framerate 30 -i %04d.png -vf scale=-1:720:flags=lanczos -c:v libvpx-vp9 -b:v 800k -pass 2 output.webm

Silly Windows and FFmpeg docs. Apparently you should be able to go -pass 1 NUL && ^ but I can’t get that to work, so I’m generating a temporary file instead.
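
For reference, the form the FFmpeg docs describe sends the first pass to the null muxer, so nothing is written to disk apart from the pass log. Below is a sketch of that for a POSIX shell, reusing the settings above. In Windows cmd the sink would be NUL instead of /dev/null, and this is untested against the Windows problem described here. The script only prints the two commands (pass1_cmd and pass2_cmd are made-up helper names), so nothing gets encoded until the printed line is run by hand.

```shell
#!/bin/sh
# Two-pass VP9 where pass 1 goes to the null muxer instead of a temp file.
# Assumptions: POSIX shell; in Windows cmd.exe use NUL in place of /dev/null.
COMMON='-framerate 30 -i %04d.png -vf scale=-1:720:flags=lanczos -c:v libvpx-vp9 -b:v 800k'

# Pass 1 only gathers statistics (written to ffmpeg2pass-0.log), so the
# encoded output is discarded: -an drops audio, -f null writes no file.
pass1_cmd() { printf 'ffmpeg %s -pass 1 -an -f null /dev/null\n' "$COMMON"; }

# Pass 2 reads the statistics and writes the real file.
pass2_cmd() { printf 'ffmpeg %s -pass 2 output.webm\n' "$COMMON"; }

# Dry run: print the commands, chained so pass 2 only runs if pass 1 worked.
echo "$(pass1_cmd) && $(pass2_cmd)"
```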

AV1 #

ffmpeg -framerate 30 -i %04d.png -vf scale=-1:720:flags=lanczos -c:v libaom-av1 -b:v 0 -crf 35 -strict experimental -row-mt 1 -cpu-used 5 -tile-columns 2 -threads 8 -pix_fmt yuv444p -movflags +faststart output.mp4

Note the -strict experimental flag, since AV1 is new.

Conclusions #

AV1 is not CURRENTLY worth it unless you have videos shorter than 30 seconds OR you find that balance between quality and filesize before the entropic death of the world.
VP9 is faster to encode with similarly reduced file sizes.
There are many projects looking into improving the encode speed; libaom is still experimental for now, so once it lands properly, it might become worth it.

Summary #

Major resources #