Sunday 9 February 2014

AForge, FFmpeg and H.264 Codec - Default Settings Problems

If you've read my previous blog post you'd be aware that I'm currently in the process of creating mp4 videos encoded with the H.264 codec, and to do that I'm using AForge.NET. Unfortunately, the functionality to encode in H.264 isn't readily available, and I went through the steps to enable it in that previous blog post.

However, nothing is ever as easy as it should be and as soon as I had enabled it, I got the following error message:
"broken ffmpeg default settings detected"

After a bit of research I found the cause of the problem and unsurprisingly, it's exactly what it says on the tin. The default settings being sent to the codec are broken. In actual fact, the default settings set by the FFmpeg library (that's the library which AForge.NET wraps) are a load of rubbish. If we want to get this working then we're going to need to set some sensible defaults.

If you open up the Video.FFMPEG project from the AForge.NET solution (the one found here) and open up VideoFileWriter.cpp and find the add_video_stream method, you should see an if statement that looks like this:


if (codecContex->codec_id == libffmpeg::CODEC_ID_MPEG1VIDEO)
{
    codecContex->mb_decision = 2;
}


We can now extend this if statement with an H.264 branch that sets some default values which will work, like so:


if (codecContex->codec_id == libffmpeg::CODEC_ID_MPEG1VIDEO)
{
    codecContex->mb_decision = 2;
}
else if(codecContex->codec_id == libffmpeg::CODEC_ID_H264)
{
    codecContex->bit_rate_tolerance = 0;
    codecContex->rc_max_rate = 0;
    codecContex->rc_buffer_size = 0;
    codecContex->gop_size = 40;
    codecContex->max_b_frames = 3;
    codecContex->b_frame_strategy = 1;
    codecContex->coder_type = 1;
    codecContex->me_cmp = 1;
    codecContex->me_range = 16;
    codecContex->qmin = 10;
    codecContex->qmax = 51;
    codecContex->scenechange_threshold = 40;
    codecContex->flags |= CODEC_FLAG_LOOP_FILTER;
    codecContex->me_subpel_quality = 5;
    codecContex->i_quant_factor = 0.71;
    codecContex->qcompress = 0.6;
    codecContex->max_qdiff = 4;
    codecContex->directpred = 1;
    codecContex->flags2 |= CODEC_FLAG2_FASTPSKIP;
}
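Why would changing these values make the error go away? As far as I can tell, the message comes from a sanity check in the H.264 encoder that counts how many settings still hold FFmpeg's old generic defaults and gives up when too many match. The sketch below is only a simplified model of that idea, not the encoder's real code: the EncoderSettings struct, the function names, the "legacy" values (12, 2, 31, 3, 0 and 0.5) and the threshold of 5 are all assumptions for illustration.

```cpp
#include <cmath>

// Hypothetical subset of the codec settings. The field names mirror the ones
// set on codecContex above; the initial values are ASSUMED legacy defaults.
struct EncoderSettings {
    int gop_size = 12;
    int qmin = 2;
    int qmax = 31;
    int max_qdiff = 3;
    int me_range = 0;
    float qcompress = 0.5f;
};

// Counts how many settings still hold the assumed legacy default value.
int countLegacyDefaults(const EncoderSettings& s) {
    int score = 0;
    score += (s.gop_size == 12);
    score += (s.qmin == 2);
    score += (s.qmax == 31);
    score += (s.max_qdiff == 3);
    score += (s.me_range == 0);
    score += (std::fabs(s.qcompress - 0.5f) < 0.001f);
    return score;
}

// Flags the settings as "broken defaults" once enough of them match.
// The threshold is an assumption, not a value taken from any library.
bool looksLikeBrokenDefaults(const EncoderSettings& s, int threshold = 5) {
    return countLegacyDefaults(s) >= threshold;
}

// Applies the matching values from the H.264 branch shown above.
EncoderSettings withBlogValues() {
    EncoderSettings s;
    s.gop_size = 40;
    s.qmin = 10;
    s.qmax = 51;
    s.max_qdiff = 4;
    s.me_range = 16;
    s.qcompress = 0.6f;
    return s;
}
```

In this model a default-constructed EncoderSettings is flagged, while the result of withBlogValues() is not, because none of its fields still hold the assumed legacy values.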


If you now compile that and use the resulting DLL in your project, you'll see the error has gone!

But... as always, it's not that simple! I got to this stage, and when I was using a single bitmap image to create a very basic (and very short) video, I'd get the following warning for every frame that I sent to be encoded:
non-strictly-monotonic PTS

However, it didn't seem to have any effect. My video file was still created and played, so I thought it wouldn't really matter. I was wrong.

When I put the DLL into my final project that involves creating much larger movies, the program would just randomly crash. I say randomly because there was no real consistency to it. At different times during the writing process the WriteVideoFrame method would throw an exception and that'd be the end of that.

On that basis, I thought it would be best to resolve this "PTS" warning and see if that solves the problem. But what on earth is a "non-strictly-monotonic PTS"? That's a good question, which I hope to answer in my next blog post after I've fully understood it myself!

Monday 3 February 2014

HTML5 - Video and Encoding

I recently thought I'd dive into the world of showing video online and, being an up-to-date web developer, I don't want to be using no Flash stuff... I want to use the latest and greatest HTML5 video tags. After all, it's meant to be easy, right?

Wrong. Well, kind of.

If you have a video that is in the right format and encoded with the correct codec (take a look at w3schools for a list of them), then it is actually very simple. You can use the HTML5 video tag like so:


<video width="320" height="240" controls>
  <source src="movie.mp4" type="video/mp4">
  <source src="movie.ogg" type="video/ogg">
Your browser does not support the video tag.
</video>


The multiple sources allow you to define different formats of the same video. The browser will go down the list until it finds a format it can play. If it finds a playable format then it'll do just that.
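The "go down the list until something is playable" rule can be sketched in a few lines. This is only an illustration of the selection idea, not how any browser actually implements it; the Source struct and the pickSource function are hypothetical names of my own.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one <source> element.
struct Source {
    std::string src;   // e.g. "movie.mp4"
    std::string type;  // e.g. "video/mp4"
};

// Returns the index of the first source whose type is in `playable`,
// or -1 if the browser cannot play any of them.
int pickSource(const std::vector<Source>& sources,
               const std::set<std::string>& playable) {
    for (std::size_t i = 0; i < sources.size(); ++i) {
        if (playable.count(sources[i].type) > 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the two sources from the snippet above, a browser that only understands video/ogg would end up at index 1, and one that understands neither would get -1, which is when the fallback text is shown.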

However, what if you don't have a video in the correct format? What if you're trying to generate your own content on the fly, using a simple web cam on your laptop? Surely saving a video in the format you want is pretty straightforward?

Wrong.

Let's take you through the dark and nasty world of video in managed code, but first let's give you some idea of what I'm trying to achieve.
I've just been given a Raspberry Pi with a camera module (a great Christmas present by the way) so I thought I'd go about and set up a little home CCTV system. To go a step further, I want the system to be able to detect movement and at that point, start uploading a live feed to a website, where I can then log on and view this live feed. I've also got a couple of laptops around the house equipped with web cameras so my plan is to use them as extra cameras to the system, when they're turned on.

That's the simple brief. I say simple, but when you scratch beneath the surface, it gets complicated. The laptops are on various versions of Windows (Windows 7 and Windows Vista) with various versions of the .NET Framework installed. The Pi runs on Raspbian, which is a port of Debian wheezy, which is of course a version of Linux. So we've got different operating systems with different architectures. Because of these complexities, I want to make this little system with managed code using the .NET Framework. There are quite a few challenges to overcome here and I don't want the fundamentals of a language I don't really know to be getting in the way, so I'm going to play it safe and stick with what I know.

Now at this point, I should say this is a work in progress. This project isn't completed by a long shot, but I thought I'd blog about the problems I encounter as and when I encounter them.

So, for the time being at least, I'm going to ignore the Raspberry Pi camera module, I'll come back to that later. I haven't done the necessary research but I suspect Mono (the cross platform, open source .NET development framework) won't support the necessary libraries I need to use to be able to capture video feeds but I have a cunning plan for that... that, however, is for a separate blog post. For now I just want to be able to capture a video feed from one of my laptops.

So, where to start?

I said this system should detect movement. To do that I need to compare a frame from one moment in time to a frame from another, and if there's a difference then something has moved. Fortunately, there are some great blog posts around movement detection algorithms and I implemented one that's shown here: http://www.codeproject.com/Articles/10248/Motion-Detection-Algorithms
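To give a feel for the frame comparison idea, here is a minimal sketch of plain frame differencing. It is not the algorithm from the article linked above, just the simplest version of "compare two frames and see what changed". The function names, the grayscale byte layout and both thresholds are my own assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Counts pixels whose grayscale value changed by more than pixelThreshold.
// Frames are assumed to be flat arrays of 8-bit grayscale pixels.
std::size_t countChangedPixels(const std::vector<std::uint8_t>& previous,
                               const std::vector<std::uint8_t>& current,
                               int pixelThreshold) {
    const std::size_t n =
        previous.size() < current.size() ? previous.size() : current.size();
    std::size_t changed = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int diff = std::abs(static_cast<int>(current[i]) -
                                  static_cast<int>(previous[i]));
        if (diff > pixelThreshold) {
            ++changed;
        }
    }
    return changed;
}

// Reports motion when more than changedFraction of the pixels differ.
// Frames of different sizes (or empty frames) are treated as "no motion".
bool motionDetected(const std::vector<std::uint8_t>& previous,
                    const std::vector<std::uint8_t>& current,
                    int pixelThreshold,
                    double changedFraction) {
    if (previous.empty() || previous.size() != current.size()) {
        return false;
    }
    const std::size_t changed =
        countChangedPixels(previous, current, pixelThreshold);
    return static_cast<double>(changed) >
           changedFraction * static_cast<double>(previous.size());
}
```

The per-pixel threshold is there to ignore camera noise, and the fraction stops a handful of flickering pixels from triggering an upload. Both would need tuning against a real camera.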

As you go through the above post you'll notice it has the option of writing to file. Great!
You'll then notice it writes it as an AVI file. Bad!

The AVI file it writes is encoded with the Windows Media Video 9 VCM codec. The word "Windows" in there should give you a pretty good indication that browser vendors like Google aren't going to support it, and you'd be right. It's not a supported codec for HTML5 video, and browsers like Chrome and Safari won't play it.

So how do we go about saving this thing in a format that is supported by most browsers? In particular, how do we save this thing in mp4 format encoded with H.264?

Well, the motion detection algorithm uses a framework called the AForge.NET Framework. This is a very powerful framework and as their website states, it's a "C# framework designed for developers and researchers in the fields of Computer Vision and Artificial Intelligence - image processing, neural networks, genetic algorithms, machine learning, robotics, etc.". I'm particularly interested in the "image processing" part of that.

As it turns out, AForge has a library called AForge.Video.FFMPEG. This is a managed code wrapper around the FFMPEG library. This library has a class called "VideoFileWriter" and it seems like we're on to something here. It has an Open method with the following specification:


public void Open(string fileName, int width, int height, int frameRate, VideoCodec codec);


That last parameter allows you to define a VideoCodec to encode it with. Great! Now we're getting somewhere. Surely all we need to do is set that to H264 and we're there! VideoCodec is an enum, so let's check out its definition.


public enum VideoCodec {
    Default = -1,
    MPEG4 = 0,
    WMV1 = 1,
    WMV2 = 2,
    MSMPEG4v2 = 3,
    MSMPEG4v3 = 4,
    H263P = 5,
    FLV1 = 6,
    MPEG2 = 7,
    Raw = 8
}


What?! No H264? To make matters worse, none of those codecs are supported by the major browser vendors. You've got to be kidding, right? I'm so close!
Surely the FFMPEG library has an encoder for H.264? It's meant to be the "future of the web" after all...

Let's check the FFMPEG documentation. After a bit of searching you'll find that yes, it does. Why on God's green earth can we not use it then?! Unfortunately, that's not a question I can answer. However, with AForge being open source, we have access to the source code and with us being software developers, we can solve such problems! After all, we know the AForge.Video.FFMPEG library is just a wrapper around FFMPEG. Come on, we can do this!

If you open up the AForge.Video.FFMPEG solution after downloading the source code of AForge, the first thing that will hit you is this isn't C# we're looking at... this is Visual C++. Now I haven't touched C++ since University but not to worry, we're only making a few modifications and I'm sure it'll all come flooding back when we start getting stuck into it.

Now where on earth do we start? We've got a library written in an unfamiliar language which is wrapped around another library that we have absolutely no knowledge of. I could download the source code for FFMPEG but let's cross that bridge if and only if I have to.

First off, we know we need an H264 option in the VideoCodec enum, so let's add that. Open up VideoCodec.h and you'll see the enum definition. Add H264 to the bottom so it looks something like this:


public enum class VideoCodec {
    Default = -1,
    MPEG4 = 0,
    WMV1 = 1,
    WMV2 = 2,
    MSMPEG4v2 = 3,
    MSMPEG4v3 = 4,
    H263P = 5,
    FLV1 = 6,
    MPEG2 = 7,
    Raw = 8,
    H264 = 9
};


Unsurprisingly, we can't just add an extra option and expect it to work. At some point that enum will be used to actually do something. The first thing it does is select the actual codec and pixel format to use for the encoding of your video. It does that by looking up the codec and the format from two arrays, using the enum value as the index into each array. That means the new entries have to go at position 9 in both arrays, to match H264 = 9.
These arrays are stored in VideoCodec.cpp. Open that up and you'll see the definitions of the video_codecs and pixel_formats arrays. We just need to add our options in here like so:


int video_codecs[] = 
{
    libffmpeg::CODEC_ID_MPEG4,
    libffmpeg::CODEC_ID_WMV1,
    libffmpeg::CODEC_ID_WMV2,
    libffmpeg::CODEC_ID_MSMPEG4V2,
    libffmpeg::CODEC_ID_MSMPEG4V3,
    libffmpeg::CODEC_ID_H263P,
    libffmpeg::CODEC_ID_FLV1,
    libffmpeg::CODEC_ID_MPEG2VIDEO,
    libffmpeg::CODEC_ID_RAWVIDEO,
    libffmpeg::CODEC_ID_H264
};

int pixel_formats[] =
{
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_YUV420P,
    libffmpeg::PIX_FMT_BGR24,
    libffmpeg::PIX_FMT_YUV420P
};
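The "enum value as array position" lookup is easy to get wrong if the two arrays ever drift out of step with the enum, so here is a small standalone sketch of the pattern with a bounds check. It is not the AForge code: the strings stand in for the libffmpeg constants, and CodecChoice and lookupCodec are hypothetical names. I've also chosen to reject Default (-1) outright, which is an assumption of mine rather than what the library does.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Mirrors the enum above; the numeric value doubles as the array index.
enum class VideoCodec {
    Default = -1, MPEG4 = 0, WMV1 = 1, WMV2 = 2, MSMPEG4v2 = 3,
    MSMPEG4v3 = 4, H263P = 5, FLV1 = 6, MPEG2 = 7, Raw = 8, H264 = 9
};

// Plain strings standing in for the libffmpeg codec and pixel format ids.
const char* const kCodecs[] = {
    "MPEG4", "WMV1", "WMV2", "MSMPEG4V2", "MSMPEG4V3",
    "H263P", "FLV1", "MPEG2VIDEO", "RAWVIDEO", "H264"
};
const char* const kPixelFormats[] = {
    "YUV420P", "YUV420P", "YUV420P", "YUV420P", "YUV420P",
    "YUV420P", "YUV420P", "YUV420P", "BGR24", "YUV420P"
};

const std::size_t kCodecCount = sizeof(kCodecs) / sizeof(kCodecs[0]);

// Fails the build if someone extends one array but forgets the other.
static_assert(sizeof(kCodecs) / sizeof(kCodecs[0]) ==
                  sizeof(kPixelFormats) / sizeof(kPixelFormats[0]),
              "codec and pixel format tables must have the same length");

struct CodecChoice {
    std::string codec;
    std::string pixelFormat;
};

// Looks up both tables using the enum value as the index.
CodecChoice lookupCodec(VideoCodec codec) {
    const int index = static_cast<int>(codec);
    if (index < 0 || static_cast<std::size_t>(index) >= kCodecCount) {
        throw std::out_of_range("no table entry for this VideoCodec value");
    }
    return CodecChoice{kCodecs[index], kPixelFormats[index]};
}
```

Here lookupCodec(VideoCodec::H264) picks entry 9 from both tables, and an enum value added without matching table entries throws instead of reading past the end of the arrays.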


Now we're getting somewhere. Now when we compile this and add it to our project, when we open up a VideoFileWriter using VideoCodec.H264 as the final parameter, the system finds our codec and tries to encode the video using it. Yes! We're there.

Wrong.

What's the red error appearing in our console window?
"broken ffmpeg default settings detected"

Damn. So close. What's going wrong now? As it turns out, the default settings that FFMPEG sets for the H264 codec are a load of rubbish. Nothing is ever easy, eh?

More on that in the next blog post...