#OCR using #Azure Cognitive services

OCR, or Optical Character Recognition, is the process of extracting text from an image. Microsoft Azure offers a service called “Computer Vision”, which has a free tier that you can use to run small batches of OCR on images.

Here’s some sample code to use it in C#. I’ve used the NuGet package Newtonsoft.Json for JSON processing. I’ve also omitted the key, which you can get from Azure.

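    using System;
    using System.Net;
    using System.Text;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;

    private static string OcrUsingAzure(string url)
    {
        const string strUrl = "https://westeurope.api.cognitive.microsoft.com/vision/v1.0/ocr?language=unk&detectOrientation=true&enhanced=True";
        using (var wc = new WebClient())
        {
            // Subscription key from the Azure portal (omitted here)
            wc.Headers["Ocp-Apim-Subscription-Key"] = "xxxxxxx";
            // The request body is JSON, so declare it explicitly
            wc.Headers[HttpRequestHeader.ContentType] = "application/json";
            var jPost = new { url = url };
            var strPost = JsonConvert.SerializeObject(jPost, Formatting.Indented);
            var strJson = wc.UploadString(strUrl, "POST", strPost);
            var jObject = JObject.Parse(strJson);
            var sbOutput = new StringBuilder();
            // The response is nested: regions > lines > words
            foreach (var region in jObject["regions"])
            {
                foreach (var line in region["lines"])
                {
                    foreach (var word in line["words"])
                    {
                        sbOutput.Append(word["text"]).Append(' ');
                    }
                    sbOutput.AppendLine();
                }
            }
            return sbOutput.ToString().Trim();
        }
    }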

You pass in the URL of an image with some text in it, and it spits out the text the other side.

If you know the language of the document in advance, e.g. English, you can improve the accuracy by changing the language parameter in the query string, for example language=en instead of language=unk.
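
As a rough sketch of that last point, the request URL could be built from a language code instead of being hard-coded. The class and method names here (OcrUrlHelper, BuildOcrUrl) are made up for illustration and are not part of any Azure SDK; the endpoint and the other query parameters are simply copied from the sample above.

```csharp
using System;

public static class OcrUrlHelper
{
    // Same endpoint as the sample above; the region (westeurope) depends on your own resource.
    private const string BaseUrl = "https://westeurope.api.cognitive.microsoft.com/vision/v1.0/ocr";

    // Hypothetical helper: builds the OCR request URL for a given language code.
    // A null or empty language falls back to "unk", which asks the service to detect it.
    public static string BuildOcrUrl(string language)
    {
        var lang = string.IsNullOrWhiteSpace(language) ? "unk" : language.Trim();
        return BaseUrl
            + "?language=" + Uri.EscapeDataString(lang)
            + "&detectOrientation=true&enhanced=True";
    }
}
```

Inside OcrUsingAzure you would then replace the constant with something like var strUrl = OcrUrlHelper.BuildOcrUrl("en"); and leave the rest of the method unchanged.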

 
