Testing x264 b-frames heuristic

Last week I wrote about my x264 patch that enables heuristic decisions for B-frames.

Today I finally managed to do some proper testing using the Big Buck Bunny PNG image sequence from xiph.org. Compared with b-heuristic 0 at the same b-frames setting, reasonable settings (b-frames 8, b-heuristic 4) give an overall speed gain of about 4%, and more extreme settings (b-frames 16, b-heuristic 4) give about 17%.

The base command line used in this test is --b-adapt 2 --b-pyramid -r 4 --crf 26 -A all --direct auto -w -m 9 --psy-rd --mixed-refs -8 -t 2 --threads 3 --thread-input.

In the following pretty graph™ I plotted the frames-per-second for various settings (b-frames/b-heuristic):

Speed results for different x264 encodes

The following are the raw results taken from the x264 logs:

settings                              PSNR                    SSIM
b-frames  b-heuristic  fps    kb/s    Y       avg     global
4         0            12.62  336.03  39.644  40.377  38.614  0.9686156
4         2            12.67  336.95  39.650  40.383  38.617  0.9686125
8         0            11.16  333.68  39.604  40.330  38.583  0.9683932
8         2            12.07  335.57  39.631  40.356  38.598  0.9685096
8         4            11.64  334.02  39.613  40.338  38.588  0.9684230
16        0             8.43  331.42  39.588  40.313  38.566  0.9682979
16        2            10.17  334.85  39.619  40.341  38.590  0.9684408
16        4             9.93  333.50  39.607  40.330  38.585  0.9683999
16        8             9.31  333.11  39.598  40.322  38.579  0.9683561
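
As a sanity check on the percentages quoted earlier, here is a minimal sketch of the arithmetic in C. The helper name percent_change is my own (hypothetical, not part of x264); the inputs are meant to be the fps or kb/s values from the table, comparing a b-heuristic run against the b-heuristic 0 run with the same b-frames setting.

```c
#include <assert.h>

/* Hypothetical helper (not from x264): relative change, in percent,
 * of `value` with respect to `baseline`.
 * Positive means `value` is larger than `baseline`. */
double percent_change(double baseline, double value)
{
    assert(baseline > 0.0); /* a zero baseline would make the ratio meaningless */
    return (value / baseline - 1.0) * 100.0;
}
```

With the numbers from the table, percent_change(11.16, 11.64) is about 4.3 (b-frames 8) and percent_change(8.43, 9.93) is about 17.8 (b-frames 16). Applied to the kb/s column for the same two pairs of rows, the bitrate grows by roughly 0.1% and 0.6%.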

x264 adaptive max b-frames patch

I just put together a patch (it’s more of an ugly hack, but whatever) that makes --b-adapt 2 somewhat faster.

Basically, it adapts the maximum number of b-frames that --b-adapt 2 looks for, based on the length of the previous span of consecutive b-frames. It is far easier to just read the patch to find out how it works; this is the relevant part:

+    /* adaptive max b-frames */
+    if( h->sh.i_type == SLICE_TYPE_P )
+    {
+        int i_gop_bframes = h->fdec->i_frame - h->fref0[0]->i_frame - 1;
+        const int i_bframes_overhead = 4;
+        if ( i_gop_bframes + i_bframes_overhead > h->frames.i_adapt_bframes )
+            h->frames.i_adapt_bframes = i_gop_bframes + i_bframes_overhead;
+        else
+            h->frames.i_adapt_bframes--;
+        h->frames.i_adapt_bframes = x264_clip3( h->frames.i_adapt_bframes, 0, h->param.i_bframe );
+    }

i_gop_bframes is the length of the previous span of consecutive b-frames and i_bframes_overhead is an arbitrary constant (possibly to be parametrized).

h->frames.i_adapt_bframes is initialized (and reset at every scenecut) to h->param.i_bframe (i.e. the number of b-frames specified on the command line).
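
To see how the limit evolves over a few P-frames, the update rule can be pulled out into a standalone function. This is only a sketch outside of x264: update_adapt_bframes and clip3 are hypothetical names of mine (clip3 stands in for x264_clip3), and the overhead constant is hard-coded to 4 as in the patch.

```c
#include <assert.h>

/* Stand-in for x264_clip3(): clamp v to the range [lo, hi]. */
static int clip3(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical standalone version of the update done at each P-frame.
 *   adapt_bframes: current adaptive limit (h->frames.i_adapt_bframes)
 *   gop_bframes:   length of the previous span of consecutive b-frames
 *   max_bframes:   b-frames given on the command line (h->param.i_bframe)
 * Returns the new adaptive limit. */
int update_adapt_bframes(int adapt_bframes, int gop_bframes, int max_bframes)
{
    const int bframes_overhead = 4; /* arbitrary constant, as in the patch */

    assert(gop_bframes >= 0 && max_bframes >= 0);

    if (gop_bframes + bframes_overhead > adapt_bframes)
        adapt_bframes = gop_bframes + bframes_overhead; /* jump up at once */
    else
        adapt_bframes--;                                /* decay by one */

    return clip3(adapt_bframes, 0, max_bframes);
}
```

For example, starting from a limit of 16 with -b 16, a run of spans of 2 b-frames lowers the limit by one per P-frame until it ends up alternating between 5 and 6. A single span of 10 b-frames then raises it straight back to 14, and the result is never allowed above the command-line value.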

The same adaptive method could easily be added to --b-adapt 0 and 1 as well, but I guess that it wouldn’t give the same speedup.

Right now I’m running a few tests and I will post further results shortly, but the first tests showed a speedup ranging from 5 to 30% (heavily dependent on the content of the video) for -b 16 (I know that this is the best-case scenario, and I also know the tendency of doom9ers to max out their settings).

Just a few more warnings: I tested it only on a 32-bit Ubuntu VM and I have not paid much attention to concurrency issues (there should be none, and I experienced no crashes or other misbehaviour, but I’ll wait for a review from the x264 devs about this). I tried to be as clear as possible in the patch, but I guess that at least a few variable names and comments will have to be edited… I couldn’t come up with better ones, though.

That’s all. Feel free to comment and give (possibly) constructive advice.