diff --git a/index.html.tmpl b/index.html.tmpl index d0f0c61..dbb1f5f 100644 --- a/index.html.tmpl +++ b/index.html.tmpl @@ -5,7 +5,7 @@ cfebs.com${more_title} - + diff --git a/posts/package_my_video.md b/posts/package_my_video.md new file mode 100644 index 0000000..f1ff7ff --- /dev/null +++ b/posts/package_my_video.md @@ -0,0 +1,278 @@ +Title: Package my video +Date: 2024-06-20T13:19:03-04:00 +Draft: 1 +--- + +## TODO + +* Some intro +* Investigate key frames and seeking + +-------------------------------------------------------------------------------- + +The goal of this project is a minimum viable product that can: + +* support H264 video and AAC audio + * experiment with AV1 and Opus if time permits +* produce a .mp4 file that is optimized for: raw progressive playback and segment generation +* generate an HLS playlist on demand +* generate the Nth segment from an HLS file on demand + +## Setup + +### ffmpeg + +First I want to start with the latest available ffmpeg static build from: + +```shell +❯ curl -sL -O https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz +❯ sudo tar -C /opt -xvf ffmpeg-release-amd64-static.tar.xz +``` + +I then symlink my preferred build to a stable location (`/opt/ffmpeg-static`) and add that location to my `PATH`. 
+ +``` +❯ sudo ln -sf /opt/ffmpeg-7.0.1-amd64-static/ /opt/ffmpeg-static +# then edit your shell rc or profile, reset shell +❯ type ffmpeg +ffmpeg is /opt/ffmpeg-static/ffmpeg + +❯ ffmpeg -version +ffmpeg version 7.0.1-static https://johnvansickle.com/ffmpeg/ Copyright (c) 2000-2024 the FFmpeg developers +built with gcc 8 (Debian 8.3.0-6) +configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi --enable-libzimg +libavutil 59. 8.100 / 59. 8.100 +libavcodec 61. 3.100 / 61. 3.100 +libavformat 61. 1.100 / 61. 1.100 +libavdevice 61. 1.100 / 61. 1.100 +libavfilter 10. 1.100 / 10. 1.100 +libswscale 8. 1.100 / 8. 1.100 +libswresample 5. 1.100 / 5. 1.100 +libpostproc 58. 1.100 / 58. 1.100 +``` + +And checking codec support +``` +❯ ffmpeg -codecs 2>/dev/null | grep '\s\(aac\|h264\|av1\|opus\)' + DEV.L. av1 Alliance for Open Media AV1 (decoders: libdav1d libaom-av1 av1) (encoders: libaom-av1) + DEV.LS h264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 h264_v4l2m2m) (encoders: libx264 libx264rgb h264_v4l2m2m) + DEAIL. aac AAC (Advanced Audio Coding) (decoders: aac aac_fixed) + D.AIL. aac_latm AAC LATM (Advanced Audio Coding LATM syntax) + DEAIL. 
opus Opus (Opus Interactive Audio Codec) (decoders: opus libopus) (encoders: opus libopus) +``` + +### Test video file + +Here are a few free and open test video sources: + +* Sintel: + * License: [Creative Commons Attribution 3.0](https://web.archive.org/web/20240105060647/https://durian.blender.org/sharing/) +* Big Buck Bunny: + * License: [Creative Commons Attribution 3.0](https://web.archive.org/web/20240521095028/https://peach.blender.org/about/) + +I grabbed a 720p version of each: + +```shell +❯ du -h test-videos/* +398M test-videos/big_buck_bunny_720p_h264.mov +650M test-videos/Sintel.2010.720p.mkv +``` + +### Deciding on eventual segment size + +The target segment size influences how we transcode the progressive file. + +Each segment must begin with a key frame, so key frame placement in the progressive output should line up with where our segments will be extracted. + +Apple [suggests 6 second durations for each HLS segment][apple_hls_seg] for VOD playback with HLS. + +6s would be fine to use, but it's a choice with consequences. + +If the progressive file only had a key frame every 6s and we later wanted 3s segments, it would need to be re-transcoded to insert more key frames. + +So for flexibility's sake I will place key frames every 3s in the progressive transcode, while the eventual segments will be 6s. + +### Packaging the progressive file + +But first, let's produce v1 of the files with the target codecs applied (H264 and AAC). 
+ +``` +❯ ffmpeg -i ./test-videos/big_buck_bunny_720p_h264.mov -acodec 'aac' -vcodec 'h264' ./test-videos/bbb_h264_aac.mp4 + +❯ ffmpeg -i ./test-videos/Sintel.2010.720p.mkv -acodec 'aac' -vcodec 'h264' ./test-videos/sintel_h264_aac.mp4 + +❯ du -h ./test-videos/* +138M ./test-videos/bbb_h264_aac.mp4 +398M ./test-videos/big_buck_bunny_720p_h264.mov +650M ./test-videos/Sintel.2010.720p.mkv +201M ./test-videos/sintel_h264_aac.mp4 +``` + +Now let's inspect the frames a bit closer with this script, `dumpframes.sh`: +```bash +#!/usr/bin/env bash + +# Print one CSV row per video frame with its picture type. +# The argument is quoted so paths containing spaces still work. +ffprobe -select_streams v -show_frames -show_entries frame=pict_type -of csv "$1" +``` + +This should show what [picture type][pic_types] each frame is. + +> I‑frames are the least compressible but don't require other video frames to decode. + +> P‑frames can use data from previous frames to decompress and are more compressible than I‑frames. + +> B‑frames can use both previous and forward frames for data reference to get the highest amount of data compression. + +**I** frames are also called **key frames**. + +So, given a dump of Big Buck Bunny (BBB): +``` +./dumpframes.sh test-videos/bbb_h264_aac.mp4 > bbb_frames.csv +``` + +BBB is 24fps, so every 3 seconds we want to see a key frame. Here are the frame numbers where the key frames should be: +``` +❯ python3 +>>> fps = 24 +>>> i_frame_s = 3 + +>>> for i in range(0, 10): print(i * fps * i_frame_s + 1) +... +1 +73 +145 +217 +289 +361 +433 +505 +577 +649 +``` + +* The first segment should contain 3s of content, which is 72 frames. +* The first frame should be a key frame. +* Then come 71 non-I frames. +* Then the next I frame (frame 73) begins the next segment. + +But our frames are not quite right. 
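That mismatch can also be checked programmatically instead of by eye. The following is a minimal sketch, not part of the scripts in this post: the helper names `expected_keyframes`, `actual_keyframes` and `compare` are hypothetical, and it assumes the dump contains one `frame,<pict_type>` row per video frame, as in the `dumpframes.sh` output.

```python
def expected_keyframes(fps, interval_s, count):
    """Return the 1-based frame numbers where a key frame is expected."""
    step = int(fps * interval_s)  # frames per key frame interval
    return [i * step + 1 for i in range(count)]


def actual_keyframes(csv_rows):
    """Return the 1-based frame numbers of I frames found in the dump.

    Assumes every frame row looks like `frame,<pict_type>[,...]`.
    """
    found = []
    frame_no = 0
    for row in csv_rows:
        fields = row.strip().split(',')
        if len(fields) < 2 or fields[0] != 'frame':
            continue  # skip anything that is not a frame row
        frame_no += 1
        if fields[1] == 'I':
            found.append(frame_no)
    return found


def compare(expected, actual):
    """Return (missing, extra) I frame positions up to the last expected one."""
    limit = max(expected)
    in_range = {n for n in actual if n <= limit}
    missing = sorted(set(expected) - in_range)
    extra = sorted(in_range - set(expected))
    return missing, extra


if __name__ == '__main__':
    # Synthetic rows standing in for a real dump: I frames at 1 and 7 only.
    rows = ['frame,I,side data'] + ['frame,P'] * 5 + ['frame,I'] + ['frame,B'] * 80
    expected = expected_keyframes(fps=24, interval_s=3, count=2)
    missing, extra = compare(expected, actual_keyframes(rows))
    print('expected:', expected)
    print('missing:', missing)
    print('extra:', extra)
```

Fed the lines of `bbb_frames.csv` instead of the synthetic rows, this would list which expected positions lack an I frame. The `grep` output below shows the same information by hand.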
+ +```shell +❯ grep -n I ./bbb_frames.csv | head +1:frame,I,H.26[45] User Data Unregistered SEI message +7:frame,I +251:frame,I +286:frame,I +379:frame,I +554:frame,I +804:frame,I +1054:frame,I +1146:frame,I +1347:frame,I +``` + +This is because we didn't tell ffmpeg anything about where to place I frames. + +There are a few options to `libx264` that help control this: + +* `--no-scenecut`: "Disable adaptive I-frame decision" +* `--keyint`: effectively the key frame interval. Technically it is the "Maximum GOP size" +* `--min-keyint`: the "Minimum GOP size" + +> A GOP is a "Group of Pictures": the run of frames from one key frame up to the next. + +So let's re-encode with those options. Actually, let's write a wrapper script to do this. + +I'll choose something besides bash because there will be a bit of math involved. + +```python +#!/usr/bin/env python +import sys +import json +import subprocess +import logging + +logging.basicConfig(level=logging.DEBUG) + +def probe_info(infname): + # build the command as a list so file names with spaces survive + cmd = ['ffprobe', '-v', 'quiet', '-print_format', 'json', '-show_format', '-show_streams', infname] + logging.info('running cmd %s', ' '.join(cmd)) + # check=True: fail loudly instead of parsing empty output + res = subprocess.run(cmd, check=True, capture_output=True) + ffprobe_dict = json.loads(res.stdout) + # take the first video stream + v_stream = None + for stream in ffprobe_dict.get('streams', []): + if stream.get('codec_type') == 'video': + v_stream = stream + break + + if v_stream is None: + raise ValueError(f'no video stream found in {infname}') + + # r_frame_rate is a fraction such as "24/1" + r_frame_rate = v_stream.get('r_frame_rate') + num, denom = r_frame_rate.split('/') + fps = float(num) / float(denom) + logging.info('got fps %s', fps) + return { + 'fps': fps, + } + +def run_ffmpeg_transcode(infname, outfname, probeinfo, segment_length=3): + # must be an integer (int() truncates fractional frame rates) + keyint = int(probeinfo.get('fps') * segment_length) + cmd = [ + 'ffmpeg', + '-i', + infname, + '-vcodec', + 'libx264', + '-x264opts', + f'keyint={keyint}:min-keyint={keyint}:no-scenecut', + '-acodec', + 'aac', + outfname + ] + logging.info('running cmd %s', ' '.join(cmd)) + subprocess.run(cmd, check=True) + +if __name__ == '__main__': 
+ # expect exactly two arguments: input file and output file + args = sys.argv[1:] + if len(args) != 2: + sys.exit(f'usage: {sys.argv[0]} <infile> <outfile>') + + infname, outfname = args + probeinfo = probe_info(infname) + run_ffmpeg_transcode(infname, outfname, probeinfo) +``` + +* Use `ffprobe` to dump the streams as JSON. +* Take the first video stream. +* Get the `r_frame_rate`, which is a fraction, and evaluate it as `fps`. +* Calculate the key frame interval using a static 3s segment length. + +And if we run it, let's take a look at the commands it executes: +``` +❯ ./progressive.py ./test-videos/big_buck_bunny_720p_h264.mov ./test-videos/bbb_h264_aac.mp4 +INFO:root:running cmd ffprobe -v quiet -print_format json -show_format -show_streams ./test-videos/big_buck_bunny_720p_h264.mov +INFO:root:got fps 24.0 +INFO:root:running cmd ffmpeg -i ./test-videos/big_buck_bunny_720p_h264.mov -vcodec libx264 -x264opts keyint=72:min-keyint=72:no-scenecut -acodec aac ./test-videos/bbb_h264_aac.mp4 +``` + +Now regenerate the frame dump and check if our I frames match the expected: 1, 73, 145, 217 ... + +```shell +❯ ./dumpframes.sh test-videos/bbb_h264_aac.mp4 > bbb_frames.csv +❯ grep -n I ./bbb_frames.csv | head +1:frame,I,H.26[45] User Data Unregistered SEI message +73:frame,I +145:frame,I +217:frame,I +286:frame,I +289:frame,I +361:frame,I +379:frame,I +433:frame,I +505:frame,I +``` + +Excellent! Every expected position (1, 73, 145, 217, ...) now holds an I frame. The listing still shows extra I frames at 286 and 379, which is worth a closer look. + +[pic_types]: https://en.wikipedia.org/wiki/Video_compression_picture_types +[apple_hls_seg]: https://developer.apple.com/documentation/http-live-streaming/hls-authoring-specification-for-apple-devices#Media-Segmentation diff --git a/style.css b/style.css index f5bc84d..abd1a6c 100644 --- a/style.css +++ b/style.css @@ -489,7 +489,7 @@ hr { margin-top: 1rem; margin-bottom: 1rem; border: 0; - border-top: 1px solid rgba(0, 0, 0, .1) + border-top: 1px solid rgba(255, 255, 255, .1) } small,