Home > Uncategorized > Record #Mp4 #H264 video from a webcam in C# (.NET Core)

Record #Mp4 #H264 video from a webcam in C# (.NET Core)

This absolute masterpiece of a video was created with OpenCV and FFMediaToolkit in C#, to be honest, and you too can create a video like this using the Code posted on the Github repo here –


(Fluffy Pink Flamingo not included)

This has thought me quite a bit about the underlying workings of the Bitmap file format, and I’m sure there is a better way to do this, I do welcome comments and suggestions below, but this may be helpful to someone.

Ok, first the basics –

My machine had two webcams, so I wanted to choose between them; therefore I used the DirectShowLib NuGet Package (Install-Package DirectShowLib) as follows –

private static int SelectCameraIndex()
	var cameras = DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice);
	if (cameras.Length == 1) return 0;
	foreach (var (camera, index) in WithIndex(cameras))
	Console.WriteLine("Select a camera from the list above:");
	var camIndex = Convert.ToInt32(Console.ReadLine());
	return camIndex;

The “WithIndex” is a helper function that gives an indexer in a foreach loop, not essential, but elegant;

private static IEnumerable<(T item, int index)> WithIndex<T>(IEnumerable<T> source)
	return source.Select((item, index) => (item, index));

Now, what I wanted to do is initialize the capture device (the webcam) to capture images, and feed them into a Mp4 video for 5 seconds, then stop.

var camIndex = SelectCameraIndex();
_captureDevice = new VideoCapture(camIndex, VideoCapture.API.DShow)
	{FlipVertical = true};
_captureDevice.ImageGrabbed += CaptureDeviceImageGrabbed;
var settings = new VideoEncoderSettings(width: 
	, height: _captureDevice.Height
	, framerate: 15
	, codec: VideoCodec.H264)
	EncoderPreset = EncoderPreset.Fast,
	CRF = 17 // Constant Rate Factor
// Download from https://github.com/BtbN/FFmpeg-Builds/releases
FFmpegLoader.FFmpegPath =
_videoOutput = MediaBuilder.CreateContainer(@"c:\temp\example.mp4").WithVideo(settings).Create();

The FlipVertical setting, I will explain later, but effectively, I am telling the capture device (webcam) to trigger the CaptureImageGrabbed event every time a new image is available.

I am initialising a “Container” which will store it’s output at “C:\temp\example.mp4”. This container will be fed with images from the CaptureImageGrabbed event. The main thread sleeps for 5 seconds, as this capture-encode cycle happens, and once the thread wakes up, it stops the capture device, and cleans up the resources used.

So, lets look at the CaptureImageGrabbed event;

private static void CaptureDeviceImageGrabbed(object sender, System.EventArgs e)
	var frame = new Mat();
	var buffer = new VectorOfByte();
	var input = frame.ToImage<Bgr, byte>();
	CvInvoke.Imencode(".bmp", input, buffer);
	var bitmapData = buffer.ToArray();
	var headerLessData = RedBlueSwap(bitmapData.Skip(54).ToArray());
	var imageData = ImageData.FromArray(headerLessData, ImagePixelFormat.Rgb24, frame.Size);

This is where we deal with trying to mesh two incompatible image formats. The image captured from the camera, and the image required by the FFMPeg container. There are bound to be better ways to do this, but this is how I did this.

I retrieve the frame data from the camera, and then convert this into a Bitmap file format in memory, and store this in a buffer byte array. I then have to do some rather weird operations to convert the Bitmap file format into a image array used by FFMPeg.

The Bitmap image format has a 54 byte header, which can be removed by calling the Skip(54) method. If you don’t do this step, you get an error saying “‘Pixel buffer size doesn’t match size required by this image format.'”

The Bitmap image format also is written backwards, which means the image is upside down – hence the “FlipVertical” used in the capture device. Since it is backwards, it also means that the Red and Blue colour channels are reversed. This does create an interesting colour effect, and took me a while to work out what was wrong!

This is the code for the Red – Blue Swap;

private static byte[] RedBlueSwap(byte[] input)
	var output = new byte[input.Length];
	for (var i = 0; i < input.Length - 3; i += 3)
		var r = input[i];
		var g = input[i + 1];
		var b = input[i + 2];
		output[i] = b;
		output[i + 1] = g;
		output[i + 2] = r;
	return output;

And that’s all there was to it. If you run this code, it will take a 5 second video, and store it locally.

You also have access to the image data as the video is being made, so you can adapt this to do motion detection – image recognition, real-time video editing, whatever you need!

This version is for Windows, but the components used do have Linux versions also, so this will be a future project.

Categories: Uncategorized
  1. December 30, 2020 at 2:22 pm

    And this is my 1,000th post on this blog 🙂


  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: