AWS #Transcribe Speech to Text using C#

AWS has a Transcribe service which converts audio containing speech to text. It is very tightly integrated with the AWS ecosystem, so it’s probably best used for systems that are already using AWS for other services – specifically, S3 for storage of audio, and perhaps CloudWatch and Lambda for post-processing.

So, the Transcribe service takes an audio file that is already in an S3 bucket, and produces text output, which is placed in another bucket. The process is asynchronous, so it’s best to have another event (e.g. CloudWatch + Lambda) dealing with the output.

First off, you need the NuGet package installed for your project (Install-Package AWSSDK.TranscribeService). You should also have your local dev environment set up to access AWS via the CLI (aws configure). You don’t have to do that last step, but the code below assumes you have done this.

var client = new AmazonTranscribeServiceClient( RegionEndpoint.EUWest1);

var job = client.StartTranscriptionJobAsync(new StartTranscriptionJobRequest
{
	LanguageCode = LanguageCode.EnUS,
	Media = new Media
	{
		MediaFileUri = "s3://audioBucket/message.mp3"
	},
	MediaFormat = MediaFormat.Mp3,
	OutputBucketName = "aws.serverless.2",
	TranscriptionJobName = "message"
}).Result;

Here, we specify the input S3 URI, which is an MP3 file, in US English. I also specify the output bucket and the job name, which is used to name the output file.

This will run, and return immediately. At some time in the future, a file will appear in the output bucket with contents such as;

{
   "jobName":"hotline2",
   "accountId":"005445879168",
   "results":{
      "transcripts":[
         {
            "transcript":"thank you for Collins in his bank, the first American bank designed for international customers. Please leave your message and we will return your call shortly."
         }
      ],
      "items":[
         {
            "start_time":"0.44",
            "end_time":"0.79",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"thank"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"0.79",
            "end_time":"0.88",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"you"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"0.88",
            "end_time":"1.01",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"for"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"1.01",
            "end_time":"1.52",
            "alternatives":[
               {
                  "confidence":"0.9214",
                  "content":"Collins"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"1.52",
            "end_time":"1.65",
            "alternatives":[
               {
                  "confidence":"0.9884",
                  "content":"in"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"1.65",
            "end_time":"1.82",
            "alternatives":[
               {
                  "confidence":"0.9662",
                  "content":"his"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"1.82",
            "end_time":"2.48",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"bank"
               }
            ],
            "type":"pronunciation"
         },
         {
            "alternatives":[
               {
                  "confidence":"0.0",
                  "content":","
               }
            ],
            "type":"punctuation"
         },
         {
            "start_time":"2.51",
            "end_time":"2.77",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"the"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"2.78",
            "end_time":"3.12",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"first"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.12",
            "end_time":"3.63",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"American"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.63",
            "end_time":"3.99",
            "alternatives":[
               {
                  "confidence":"0.996",
                  "content":"bank"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"4.0",
            "end_time":"4.58",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"designed"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"4.58",
            "end_time":"4.74",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"for"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"4.74",
            "end_time":"5.39",
            "alternatives":[
               {
                  "confidence":"0.9987",
                  "content":"international"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"5.39",
            "end_time":"6.16",
            "alternatives":[
               {
                  "confidence":"0.9995",
                  "content":"customers"
               }
            ],
            "type":"pronunciation"
         },
         {
            "alternatives":[
               {
                  "confidence":"0.0",
                  "content":"."
               }
            ],
            "type":"punctuation"
         },
         {
            "start_time":"6.54",
            "end_time":"6.94",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"Please"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"6.94",
            "end_time":"7.12",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"leave"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"7.12",
            "end_time":"7.26",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"your"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"7.26",
            "end_time":"7.86",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"message"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"7.87",
            "end_time":"8.06",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"and"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"8.06",
            "end_time":"8.17",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"we"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"8.17",
            "end_time":"8.36",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"will"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"8.36",
            "end_time":"8.78",
            "alternatives":[
               {
                  "confidence":"0.5229",
                  "content":"return"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"8.78",
            "end_time":"8.95",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"your"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"8.95",
            "end_time":"9.33",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"call"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"9.34",
            "end_time":"10.05",
            "alternatives":[
               {
                  "confidence":"1.0",
                  "content":"shortly"
               }
            ],
            "type":"pronunciation"
         },
         {
            "alternatives":[
               {
                  "confidence":"0.0",
                  "content":"."
               }
            ],
            "type":"punctuation"
         }
      ]
   },
   "status":"COMPLETED"
}

As you can see from the result, it can make some errors; for example, here it used the word “Collins” instead of “calling”, so the process is not perfect. However, the word-by-word breakdown is really useful for any other post-processing you may want to do.
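As a rough illustration of that kind of post-processing, the sketch below lists the words that fell under a confidence threshold, using System.Text.Json on a trimmed-down copy of the output above. The helper name LowConfidenceWords and the 0.95 threshold are my own choices for the example, not part of the Transcribe API.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Trimmed-down sample in the same shape as the Transcribe output.
const string sample = @"{
  ""results"": { ""items"": [
    { ""start_time"": ""0.44"", ""end_time"": ""0.79"", ""type"": ""pronunciation"",
      ""alternatives"": [ { ""confidence"": ""1.0"", ""content"": ""thank"" } ] },
    { ""start_time"": ""1.01"", ""end_time"": ""1.52"", ""type"": ""pronunciation"",
      ""alternatives"": [ { ""confidence"": ""0.9214"", ""content"": ""Collins"" } ] },
    { ""type"": ""punctuation"",
      ""alternatives"": [ { ""confidence"": ""0.0"", ""content"": "","" } ] },
    { ""start_time"": ""8.36"", ""end_time"": ""8.78"", ""type"": ""pronunciation"",
      ""alternatives"": [ { ""confidence"": ""0.5229"", ""content"": ""return"" } ] }
  ] }
}";

// Hypothetical helper: returns every spoken word whose confidence is below the threshold.
List<(string Word, double Confidence, string Start)> LowConfidenceWords(string json, double threshold)
{
    var found = new List<(string Word, double Confidence, string Start)>();
    using var doc = JsonDocument.Parse(json);
    var items = doc.RootElement.GetProperty("results").GetProperty("items");
    foreach (var item in items.EnumerateArray())
    {
        // Punctuation items have no timings and a confidence of 0.0, so skip them.
        if (item.GetProperty("type").GetString() != "pronunciation") continue;

        var best = item.GetProperty("alternatives")[0];
        // The numbers are written as JSON strings, so parse them explicitly.
        var confidence = double.Parse(
            best.GetProperty("confidence").GetString() ?? "0", CultureInfo.InvariantCulture);
        if (confidence < threshold)
        {
            found.Add((best.GetProperty("content").GetString() ?? "", confidence,
                       item.GetProperty("start_time").GetString() ?? ""));
        }
    }
    return found;
}

foreach (var hit in LowConfidenceWords(sample, 0.95))
{
    Console.WriteLine($"{hit.Start}s: {hit.Word} ({hit.Confidence})");
}
```

With the sample above, this flags “Collins” and “return”, which are good candidates for a manual review.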

This would be excellent for generating subtitles from movie audio, for example.
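For subtitles, the start_time and end_time strings map fairly directly onto SubRip (.srt) timestamps. The following is only a sketch of the formatting step; ToSrtTime and ToSrtCue are hypothetical helper names, and deciding how to group words into cues is left out.

```csharp
using System;
using System.Globalization;

// Converts a Transcribe time such as "0.44" (seconds) into the SRT form "00:00:00,440".
string ToSrtTime(string seconds)
{
    // decimal avoids floating-point rounding surprises on values like "2.48".
    var ms = (long)Math.Round(decimal.Parse(seconds, CultureInfo.InvariantCulture) * 1000m);
    return TimeSpan.FromTicks(ms * TimeSpan.TicksPerMillisecond)
        .ToString(@"hh\:mm\:ss\,fff", CultureInfo.InvariantCulture);
}

// One SRT cue: a sequence number, a timing line, the text, then a blank line.
string ToSrtCue(int number, string start, string end, string text) =>
    $"{number}\n{ToSrtTime(start)} --> {ToSrtTime(end)}\n{text}\n\n";

// First sentence of the sample transcript: starts at "thank" and ends at "customers".
Console.Write(ToSrtCue(1, "0.44", "6.16",
    "thank you for Collins in his bank, the first American bank designed for international customers."));
```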

Categories: Uncategorized

Using #Delphi to call a JSON based #API with basic authentication.

Delphi is quite an old programming language, and has been around since 1995. It’s not really one of the “cool kids” when it comes to programming. However, there is nothing stopping it interoperating with modern REST APIs, which this blog post will show.

So, first off, I downloaded the community edition of Delphi from Embarcadero, and started, directly after a “Hello World”, to make a request to a service that returns my external IP address. Just a simple HTTP GET, but that’s the first step.

So far so simple. I make a crazy call out to CoInitializeEx which does some COM initialization (COM – oh dear). Anyway, then I create a TIdHTTP object, which comes from the Indy Project (https://www.indyproject.org/), make a simple GET request, and print the output to the console.

In the above example, I am using an API from http://www.placaapi.com/ – which is a Brazilian vehicle lookup service. You need a username and password, which can be obtained free of charge from the website.

I create an Authentication object, and set the username and password, then I make an HTTP GET request as before.

After I receive the JSON, I pass this to TJSONObject, and read the Description node from the JSON – there are plenty more fields here, but this is the easiest to display.

For those who want to copy/paste the code – it’s below;

program helloWorld;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  SysUtils, IdHTTP, ActiveX, IdAuthentication, System.JSON;

var
  HTTP: TIdHTTP;  // Create the Indy HTTP Library https://www.indyproject.org/
  Buffer: String;
  JSonValue:TJSonValue;
  Description: string;
  Registration: string;

begin
  try
    // Enter the car registration here
    Registration :=  'BYP6404';
    // https://docs.microsoft.com/en-us/windows/win32/api/combaseapi/nf-combaseapi-coinitializeex
    CoInitializeEx(nil, COINIT_MULTITHREADED);
    HTTP := TIdHTTP.Create;
    HTTP.Request.BasicAuthentication:= true;
    HTTP.Request.Authentication := TIdBasicAuthentication.Create;
    HTTP.Request.Authentication.Username := '***USERNAME HERE***';
    HTTP.Request.Authentication.Password := '***PASSWORD HERE***';
    // Below is the API endpoint for Brazil, can be replaced with any country.
    Buffer := HTTP.Get('http://www.placaapi.com/api/json.aspx/CheckBrazil/' + Registration);
    //Writeln(Buffer); // Uncomment to see full result
    JsonValue := TJSonObject.ParseJSONValue(Buffer);
    // Read the Vehicle Description
    Description := JsonValue.GetValue<string>('Description');
    Writeln(Description);
  except
    on E: Exception do
        Writeln(E.ClassName, ': ', E.Message); // Show any error
  end;
  readln; // pause on completion
end.


Getting #AWS #Lambda timeout value at runtime in C#

Say you have a Lambda function in C# running a process that must complete before Amazon kills it, as in the classic bank transfer example, where you debit one account and credit another. Both actions need to be taken atomically, or else the process should be rolled back.

Here, you can’t have AWS killing your Lambda function mid-transaction because it ran out of time: you’d have one account missing money, and no money in the payee account. It would be better to get a forewarning that gives you X seconds to roll back the transaction, so that at least you’re back to where you started.

So, what’s the magic command? In an ASP.NET Core application hosted on Lambda, it is:

     var context = (ILambdaContext) HttpContext.Items["LambdaContext"];

This gives you an ILambdaContext Object, that contains data similar to the following;


{
   "FunctionName":"HelloWorld-AspNetCoreFunction-1T4MMT4XYY3IJ",
   "FunctionVersion":"$LATEST",
   "LogGroupName":"/aws/lambda/HelloWorld-AspNetCoreFunction-1T4MMT4XYY3IJ",
   "LogStreamName":"2021/03/01/[$LATEST]36833c138ec040549dfd7aa350f2ca44",
   "MemoryLimitInMB":512,
   "AwsRequestId":"b481c563-a027-43d7-bb8e-841ea0f89cdf",
   "InvokedFunctionArn":"arn:aws:lambda:eu-west-1:005445879168:function:HelloWorld-AspNetCoreFunction-1T4MMT4XYY3IJ",
   "RemainingTime":"00:00:28.6830000",
   "ClientContext":null,
   "Identity":{
      "IdentityId":"",
      "IdentityPoolId":""
   },
   "Logger":{
      
   }
}

This gives you the RemainingTime and the MemoryLimitInMB, along with other values.

For other values relating to the Lambda, you should also check the Environment variables in your code.
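One way to act on RemainingTime is to re-check it between the two halves of the transaction. The sketch below uses plain delegates so that it runs outside Lambda; RunTransfer, the 3 second margin and the fake clock are all assumptions for illustration. In a real handler the first argument would be something like () => context.RemainingTime.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: runs debit then credit, but undoes the debit if the
// remaining time has dropped to the rollback margin in between.
bool RunTransfer(Func<TimeSpan> remainingTime, TimeSpan rollbackMargin,
                 Action debit, Action credit, Action undoDebit)
{
    if (remainingTime() <= rollbackMargin) return false; // too late to even start
    debit();
    if (remainingTime() <= rollbackMargin)
    {
        undoDebit(); // not enough time left to credit safely
        return false;
    }
    credit();
    return true;
}

// Fake clock: 10 seconds left before the debit, 2 seconds left after it.
var clock = new Queue<TimeSpan>(new[] { TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(2) });
var payer = 100;
var payee = 0;

var committed = RunTransfer(
    () => clock.Dequeue(),
    TimeSpan.FromSeconds(3),
    () => payer -= 50,
    () => payee += 50,
    () => payer += 50);

Console.WriteLine($"committed={committed}, payer={payer}, payee={payee}");
// prints "committed=False, payer=100, payee=0"
```

The margin needs to be at least as long as the rollback itself takes, which is something you would have to measure for your own workload.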


Bypass Google #Recaptcha using #CapMonster

Google Recaptcha is a system that is designed to stop bots from interacting with a website, and only allow humans. However, as with everything, there is always a workaround. In this demo I’m using CapMonster’s API, which works, but in my opinion is quite slow: here a sample request takes 52 seconds, so it is probably unsuited to real-time processing.

I used the NuGet package created by “Mohammed Boukhlouf” https://github.com/M-Boukhlouf/CapMonsterCloud and the full source is here; https://github.com/infiniteloopltd/CapMonsterDemo – without my client ID of course.

Here is the gist of the code;

var start = DateTime.Now;
var client = new CapMonsterClient(secret);
var captchaTask = new RecaptchaV3TaskProxyless
{
    WebsiteUrl = "https://lessons.zennolab.com/captchas/recaptcha/v3.php?level=beta",
    WebsiteKey = "6Le0xVgUAAAAAIt20XEB4rVhYOODgTl00d8juDob",
    MinScore = 0.3,
    PageAction = "myverify"
};
// Create the task and get the task id
var taskId = client.CreateTaskAsync(captchaTask).Result;
Console.WriteLine("Created task id : " + taskId);
var solution = client.GetTaskResultAsync<RecaptchaV3TaskProxylessResult>(taskId).Result;
// Recaptcha response to be used in the form
var recaptchaResponse = solution.GRecaptchaResponse;

Console.WriteLine("Solution : " + recaptchaResponse);
var web = new WebClient {Encoding = Encoding.UTF8};
web.Headers.Add("content-type","application/x-www-form-urlencoded");
var result = web.UploadString("https://lessons.zennolab.com/captchas/recaptcha/v3_verify.php?level=beta", "token=" + recaptchaResponse);
// Skip past the opening <pre> tag so that only the JSON between the tags is kept
var idxStart = result.IndexOf("<pre>", StringComparison.Ordinal) + "<pre>".Length;
var idxEnd = result.IndexOf("</pre>", StringComparison.Ordinal);
var jsonResult = result.Substring(idxStart, idxEnd - idxStart);
Console.WriteLine(jsonResult);
var end = DateTime.Now;
var duration = end-start;
Console.WriteLine(duration.TotalSeconds);


Optimizing index_merge in #MySQL for performance

If you have a query in MySQL that uses an index merge, that is, you are querying on multiple unconnected indexes on the same table, here is a performance tweak that in my case changed a query from 40 seconds to 0.254 seconds, by reorganising the query with a subquery.

So, my original query was like this:

SELECT *
FROM   data
WHERE  ( l1 = 'no-match' )
        OR ( l2 = 'X'
             AND ( f1 = 'Y'
                    OR p = 'Z' ) ) 

Where “no-match” is a value that didn’t match anything in the table, and X, Y, Z were values that did match something. In the execution plan, this was being executed as an index_merge, since all the columns had separate, unconnected indexes on them, but it had a very high cost;

Type Name Cost Rows
table data (index_merge) 92689.48 84079

However, by rewriting the query with a subquery as follows, which returns the same rows here because “no-match” matches nothing in the table;

select * from (
	select * from data where 
	(
   		L1 = 'no-match' OR L2 = 'X' 
	)
) data2
where 
  f1 = 'Y' OR p= 'Z' 	

The cost of the index_merge was drastically reduced;

Type Name Cost Rows
table data (index_merge) 1189.36 918

And most importantly, the execution time was reduced to a fraction of the original. I’d also argue that the SQL is a bit easier to read.
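One caution before reusing this trick: the two WHERE clauses are only interchangeable when the l1 condition matches no rows. The sketch below is a hypothetical check, not something from the original query work; it treats each condition as a boolean and lists the combinations where the original and rewritten forms disagree.

```csharp
using System;

// Original:  l1 = 'no-match' OR ( l2 = 'X' AND ( f1 = 'Y' OR p = 'Z' ) )
bool Original(bool l1, bool l2, bool f1, bool p) => l1 || (l2 && (f1 || p));

// Rewritten: ( l1 = 'no-match' OR l2 = 'X' ) AND ( f1 = 'Y' OR p = 'Z' )
bool Rewritten(bool l1, bool l2, bool f1, bool p) => (l1 || l2) && (f1 || p);

var mismatches = 0;
for (var bits = 0; bits < 16; bits++)
{
    // Each bit stands for "this condition matched the row".
    var l1Hit = (bits & 8) != 0;
    var l2Hit = (bits & 4) != 0;
    var f1Hit = (bits & 2) != 0;
    var pHit = (bits & 1) != 0;

    if (Original(l1Hit, l2Hit, f1Hit, pHit) != Rewritten(l1Hit, l2Hit, f1Hit, pHit))
    {
        mismatches++;
        Console.WriteLine($"differs when l1={l1Hit} l2={l2Hit} f1={f1Hit} p={pHit}");
    }
}
Console.WriteLine($"{mismatches} of 16 combinations differ");
```

The only disagreements are rows where l1 matches but neither f1 nor p does, so the rewrite is safe exactly when no such rows can exist.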


Create your own Flash Briefing with #Alexa #Skills

So, this started because I wanted to listen to Italian news (RAI) on an English (UK) Amazon Alexa device, and the news source was not available. With a quick Google, I found someone who released a NodeJS based app that captured the feed from RAI, and reformatted the JSON into a format that is compatible with Alexa. I forked this repo here;

https://github.com/infiniteloopltd/alexa-flash-briefing-grr-radio-rai

Then, using Heroku, I deployed the Github repo onto a temporary domain, which you can see here;

https://flash-briefing.herokuapp.com/

Then, I headed over to Developer.Amazon.com, clicked on Alexa -> Create Alexa Skills -> Console

Then Create Skill -> Enter a Name (Italy News) -> Select a Language (English (UK));

This should match your Alexa device exactly. English US and English UK are different!

Select Flash Briefing -> Create Skill

Add an error message, like “Sorry, failed”

Press “Add new Feed”

Fill in the fields, like preamble, name, Update frequency. The Content type should be audio, and the feed URL should be the Heroku Url above.

Then, from the Alexa app on your phone, click More -> Settings -> Flash Briefing, and your new source should be in the list

Now, you’ll have a new news source when you say “Play News” to Alexa


Storing temporary data in #Redis in C#

Redis is an in-memory database, designed for ultra-fast data retrieval. It’s great for short-lived data, that perhaps you only need to store for the duration of a user session, or transaction.

The AWS version of Redis, under the name “ElastiCache”, can only be accessed from within the AWS network, or specifically from within the VPC that hosts Redis. This is obviously designed to enforce recommended usage. You get no performance advantage if your web server needs to traverse the Internet to reach your Redis server.

Here, I’ve used a RedisCloud hosted Redis database, rather than a local installation, which has the advantage that it can be accessed from anywhere. Good for development, not designed for production. The key is in plain text, feel free to mess about with the server.

So, this was my use case: I wanted to store data temporarily, just for 2 seconds, and then delete it. It’s actually rather difficult to do this in a scalable way with a standard MySQL database.

So, step 1 is to import a client library, here I picked StackExchange.Redis;

Install-Package StackExchange.Redis

Now, I connect to the Redis server, and write a value that will expire in 2 seconds;

const string endpoint = "redis-15557.c72.eu-west-1-2.ec2.cloud.redislabs.com:15557,password=JU455eaOlQZjVYExorUl1oFouO509Ptu";
var redis = ConnectionMultiplexer.Connect(endpoint);
var db = redis.GetDatabase();

const string setValue = "abcdefg";
db.StringSet("mykey", setValue, TimeSpan.FromSeconds(2));
    

If I read the value back instantly, I get the expected value of “abcdefg”. If I wait 3 seconds and try to read again, I get null;

string getValue = db.StringGet("mykey");
Console.WriteLine(getValue); // writes: "abcdefg"
Thread.Sleep(TimeSpan.FromSeconds(3));
string getValue2 = db.StringGet("mykey");
Console.WriteLine(getValue2); // writes nothing

The code is available to clone here; https://github.com/infiniteloopltd/RedisTest


Run #postman collections in C# (.NET Core) hosted on AWS Lambda

Major Caveat, this is very much a work in progress, that I hope to complete one day, or that someone will complete, and be nice enough to share the code via a pull request.

But, here’s the repo on github: https://github.com/infiniteloopltd/PostmanSharp

The motivation behind this was to find a way to define a JavaScript library that could be used to execute Postman collections. JavaScript running in a browser can’t call most APIs, apart from those with CORS enabled, or hosted on the same server. However, Postman offers a nice way to define an API, and export that definition as a Postman collection.

So, what I did was create a C# library that interprets a Postman collection, and carries out the calling of that API, using variables passed in. So far, it only does HTTP GET requests, but it could be expanded easily.

So, let’s start off with the library; which is defined as follows –

public class Postman
{
	private JObject postmanCollection;
	public Postman(string collection)
	{
	    postmanCollection = JObject.Parse(collection);
	}

	public string Execute(string function, string variables)
	{
	    var item = postmanCollection["item"].FirstOrDefault(
		j => j["name"] + "" == function);
	    if (item == null) throw new ArgumentException("function not recognized");
	    var request = item["request"];
	    var method = request["method"] + "";
	    if (method != "GET") 
		throw new NotImplementedException("Only HTTP GET is currently supported");
	    if (request["header"] is JArray headers && headers.Count > 0) 
		throw new NotImplementedException("HTTP Headers are not yet supported");
	    var url = request["url"]["raw"] + "";
	    var jVariables = JObject.Parse(variables);
	    foreach (var (key,value) in jVariables)
	    {
		url = url.Replace("{{" + key + "}}", value+"");
	    }
	    using var web = new WebClient {Encoding = Encoding.UTF8};
	    var result = web.DownloadString(url);
	    return result;
	}
}
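One detail worth handling in the substitution loop is URL encoding: a value such as "Dublin, Ireland" contains a space and a comma. The sketch below shows the same {{key}} replacement with each value escaped first. FillTemplate and the example.com URL are made up for illustration, and it uses a plain dictionary rather than a JObject so that it runs without any packages.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: replaces each {{key}} placeholder with its URL-escaped value.
string FillTemplate(string rawUrl, IReadOnlyDictionary<string, string> variables)
{
    foreach (var pair in variables)
    {
        // Escape the value so that spaces, commas and ampersands survive in a query string.
        rawUrl = rawUrl.Replace("{{" + pair.Key + "}}", Uri.EscapeDataString(pair.Value));
    }
    return rawUrl;
}

var url = FillTemplate(
    "https://example.com/geocode?searchtext={{address}}",
    new Dictionary<string, string> { ["address"] = "Dublin, Ireland" });

Console.WriteLine(url);
// prints "https://example.com/geocode?searchtext=Dublin%2C%20Ireland"
```

This assumes the placeholders sit in the query string or a path segment; a placeholder that stands for a whole URL would need to be left unescaped.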

You could use this from a console app, to call a Postman-defined API, but to make it more interesting, I created an AWS Lambda function with the following code, and configured my API Gateway to permit CORS:

public string FunctionHandler(LambdaRequest request, ILambdaContext context)
{
    string result;
    try
    {
	LambdaLogger.Log("FunctionHandler called");
	LambdaLogger.Log(request.body);
	if (!request.body.StartsWith("{"))
	{
	    request.body = Encoding.UTF8.GetString(Convert.FromBase64String(request.body));
	}
	var jsonBody = JObject.Parse(request.body);
	var collection = jsonBody["collection"] + "";
	var function = jsonBody["function"] + "";
	var variables = jsonBody["variables"] + "";
	var postman = new Postman(collection);
	result = postman.Execute(function, variables);
    }
    catch (Exception ex)
    {
	result = ex.ToString();
    }
    LambdaLogger.Log(result);
    return result; 
}

Where LambdaRequest is defined simply as:

public class LambdaRequest
{
    public string body { get; set; }
}

Now, once the Lambda is uploaded to AWS, and an API Gateway with CORS enabled is set up, I defined my JavaScript class as follows (actual URL omitted)

class Postman
{
	constructor(collection)
	{
		this.collection = collection;
	}

	execute(postmanFunction, variables)
	{
		var url = "https://xxxxx.amazonaws.com/default/Postman";
		return this.postData(url, 
			{
			 "collection" : this.collection, 
			 "function" : postmanFunction, 
			 "variables" : JSON.stringify(variables)
			});
	}

	async postData(url = '', data = {}) {
	  const response = await fetch(url, {
		method: 'POST', 
		mode: 'cors', 
		body: JSON.stringify(data) 
	  });
	  return response.json(); 
	}

}

Then, it’s called something like this in the page;

var collection = ... 
var postman = new Postman(collection);
var variables = {
	address : "Dublin, Ireland"
};
postman.execute("Geolocation",variables).then( data => {
	console.log(data);
	var pos = data.Response.View[0].Result[0].Location.NavigationPosition[0];
	alert(pos.Latitude + "," + pos.Longitude);
});


Record #Mp4 #H264 video from a webcam in C# (.NET Core)

This absolute masterpiece of a video was created with OpenCV and FFMediaToolkit in C#, to be honest, and you too can create a video like this using the Code posted on the Github repo here –

https://github.com/infiniteloopltd/WebcamDemo

(Fluffy Pink Flamingo not included)

This has taught me quite a bit about the underlying workings of the Bitmap file format, and I’m sure there is a better way to do this. I do welcome comments and suggestions below, but this may be helpful to someone.

Ok, first the basics –

My machine had two webcams, so I wanted to choose between them; therefore I used the DirectShowLib NuGet Package (Install-Package DirectShowLib) as follows –

private static int SelectCameraIndex()
{
	var cameras = DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice);
	if (cameras.Length == 1) return 0;
	foreach (var (camera, index) in WithIndex(cameras))
	{
		Console.WriteLine($"{index}:{camera.Name}");
	}
	Console.WriteLine("Select a camera from the list above:");
	var camIndex = Convert.ToInt32(Console.ReadLine());
	return camIndex;
}

The “WithIndex” is a helper function that gives an indexer in a foreach loop, not essential, but elegant;

private static IEnumerable<(T item, int index)> WithIndex<T>(IEnumerable<T> source)
{
	return source.Select((item, index) => (item, index));
}

Now, what I wanted to do is initialize the capture device (the webcam) to capture images, and feed them into a Mp4 video for 5 seconds, then stop.

var camIndex = SelectCameraIndex();
_captureDevice = new VideoCapture(camIndex, VideoCapture.API.DShow)
	{FlipVertical = true};
_captureDevice.ImageGrabbed += CaptureDeviceImageGrabbed;
var settings = new VideoEncoderSettings(width: 
	_captureDevice.Width
	, height: _captureDevice.Height
	, framerate: 15
	, codec: VideoCodec.H264)
{
	EncoderPreset = EncoderPreset.Fast,
	CRF = 17 // Constant Rate Factor
};
// Download from https://github.com/BtbN/FFmpeg-Builds/releases
FFmpegLoader.FFmpegPath =
	@"C:\Users\fiach\source\repos\Webcam\FFmpeg\";
_videoOutput = MediaBuilder.CreateContainer(@"c:\temp\example.mp4").WithVideo(settings).Create();
_captureDevice.Start();
Thread.Sleep(TimeSpan.FromSeconds(5));
_captureDevice.Stop();
_captureDevice.Dispose();
_videoOutput.Dispose();

The FlipVertical setting, I will explain later, but effectively, I am telling the capture device (webcam) to trigger the CaptureDeviceImageGrabbed event handler every time a new image is available.

I am initialising a “Container” which will store its output at “C:\temp\example.mp4”. This container will be fed with images from the CaptureDeviceImageGrabbed handler. The main thread sleeps for 5 seconds while this capture-encode cycle happens, and once the thread wakes up, it stops the capture device, and cleans up the resources used.

So, let’s look at the CaptureDeviceImageGrabbed handler;

private static void CaptureDeviceImageGrabbed(object sender, System.EventArgs e)
{
	var frame = new Mat();
	_captureDevice.Retrieve(frame);
	var buffer = new VectorOfByte();
	var input = frame.ToImage<Bgr, byte>();
	CvInvoke.Imencode(".bmp", input, buffer);
	var bitmapData = buffer.ToArray();
	
	var headerLessData = RedBlueSwap(bitmapData.Skip(54).ToArray());
	var imageData = ImageData.FromArray(headerLessData, ImagePixelFormat.Rgb24, frame.Size);
	_videoOutput.Video.AddFrame(imageData);
}

This is where we deal with trying to mesh two incompatible image formats: the image captured from the camera, and the image required by the FFmpeg container. There are bound to be better ways to do this, but this is how I did it.

I retrieve the frame data from the camera, convert it into the Bitmap file format in memory, and store this in a byte array. I then have to do some rather weird operations to convert the Bitmap file format into the image array used by FFmpeg.

The Bitmap image format has a 54 byte header, which can be removed by calling the Skip(54) method. If you don’t do this step, you get an error saying “‘Pixel buffer size doesn’t match size required by this image format.'”
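The 54 byte figure is the usual size for an uncompressed 24-bit bitmap, but the file header also records where the pixel data starts (a little-endian 32-bit offset at byte 10), so it can be read rather than assumed. A minimal sketch, with StripBmpHeader as a hypothetical helper and a fake in-memory bitmap standing in for the camera frame:

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper: returns the pixel bytes of an in-memory BMP,
// using the pixel-data offset stored in its file header.
byte[] StripBmpHeader(byte[] bmp)
{
    // A BMP file starts with the two ASCII bytes "BM" and a 14 byte file header.
    if (bmp.Length < 14 || bmp[0] != 'B' || bmp[1] != 'M')
        throw new ArgumentException("Not a BMP file", nameof(bmp));

    // Bytes 10-13 hold the little-endian offset of the pixel array.
    var pixelOffset = BinaryPrimitives.ReadInt32LittleEndian(bmp.AsSpan(10, 4));
    return bmp.AsSpan(pixelOffset).ToArray();
}

// Fake bitmap: a 54 byte header followed by two 3-byte pixels.
var fake = new byte[54 + 6];
fake[0] = (byte)'B';
fake[1] = (byte)'M';
BinaryPrimitives.WriteInt32LittleEndian(fake.AsSpan(10, 4), 54);
fake[54] = 255; // first byte of the first pixel

var pixels = StripBmpHeader(fake);
Console.WriteLine(pixels.Length); // prints 6
```

Bitmap rows are also padded to a multiple of 4 bytes, so this only yields a tightly packed buffer when the width times 3 is already a multiple of 4 (true for common widths such as 640 or 1280).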

The Bitmap image format also stores its rows bottom-up, which means the image is upside down – hence the “FlipVertical” used in the capture device. It also stores each pixel in blue-green-red order, so the Red and Blue colour channels are reversed relative to the Rgb24 format expected here. This does create an interesting colour effect, and took me a while to work out what was wrong!

This is the code for the Red – Blue Swap;

private static byte[] RedBlueSwap(byte[] input)
{
	var output = new byte[input.Length];
	// i + 2 < input.Length, so that the final pixel is swapped too
	for (var i = 0; i + 2 < input.Length; i += 3)
	{
		var r = input[i];
		var g = input[i + 1];
		var b = input[i + 2];
		output[i] = b;
		output[i + 1] = g;
		output[i + 2] = r;
	}
	return output;
}

And that’s all there was to it. If you run this code, it will take a 5 second video, and store it locally.

You also have access to the image data as the video is being made, so you can adapt this to do motion detection – image recognition, real-time video editing, whatever you need!

This version is for Windows, but the components used do have Linux versions also, so this will be a future project.


Capture a #webcam image using .NET Core and #OpenCV

TL;DR; Here is the public github repo: https://github.com/infiniteloopltd/WebcamDemo

Using OpenCV to capture a webcam in .NET is an easy process, you just need to include two Nuget Packages;

Install-Package Emgu.CV 
Install-Package Emgu.CV.runtime.windows

The Windows runtime is required if running this code on Windows. If you need to target another platform, you could use “Emgu.CV.runtime.ubuntu”, for instance.

Here, you can use a VideoCapture object to capture an image:

static void Main(string[] args)
{
  var filename = "webcam.jpg";
  if (args.Length > 0) filename = args[0];
  using var capture = new VideoCapture(0, VideoCapture.API.DShow); 
  var image = capture.QueryFrame(); //take a picture
  image.Save(filename);
}

The parameters to the VideoCapture constructor are the video source index (0 being default), and the API being used, here DirectShow worked best for me.
