Cognitive Services : Creation of Language Understanding Intelligent Service or LUIS


Introduction

One of the important APIs in Cognitive Services is the Language Understanding Intelligent Service, or LUIS. It is a natural language processing service that enables us to understand human language in our own application, website, chatbot, IoT device, etc. Once we configure, train, and publish a LUIS model, the application can easily receive user input in natural language and take an appropriate action based on the intent, utterance, and entity configuration in the LUIS model.
We can easily create a LUIS model with the help of a LUIS account, and for this we require a LUIS “Authoring resource” in Azure. So first of all, we need to create a LUIS API in our Azure account using a valid subscription.

You can also refer to the following articles on Cognitive Services.

Prerequisites

  1. Subscription key ( Azure Portal ) or Trial Subscription Key
  2. Visual Studio 2015, 2017, or 2019

Create LUIS in Azure

Step 1 : Click on the “+” icon -> go to “AI + Machine Learning” -> click on “See all”.

Step 2 : Go to Cognitive Services and click on “See more”.

Step 3 : Click on “Language Understanding”.

Step 4 : The following screen will appear once you click on the “Create” button in “Language Understanding”.

  1. Subscription : We can select our Azure subscription for Language Understanding.
  2. Resource group : We can create a new resource group or choose from an existing one ( we select our existing resource group, “luis-test” ).
  3. Authoring location : We can choose the location closest to our customers.
  4. Authoring pricing tier : We can choose the appropriate pricing tier as per our needs.
  5. Prediction location : We can choose the location closest to our customers.
  6. Prediction pricing tier : As of now there are two pricing tiers available, “F0” & “S0”; “F0” is the free one, and we can choose the appropriate tier as per our needs.

7. Click on the “Review + Create” button and wait for the deployment to succeed.

8. Once the deployment succeeds, click on the “Dashboard” and we can see “luis-cog-testing” created in the All resources list. LUIS is ready for use!

Create LUIS Application

We have already created the LUIS authoring resource in Azure, and now we can easily create a LUIS model in the LUIS account. So go to the LUIS account and create a new LUIS app.

  1. Name : Name of the LUIS application.
  2. Culture : The culture or language the LUIS application is going to use.
  3. Description : A short description of our application.
  4. Authoring resource : The authoring resource that we have created in Azure.
  5. Prediction resource : The prediction resource that we have created in Azure.

Output :

The app is successfully created in LUIS, and by default it will contain one “Intent” called “None”. We will discuss intents, utterances, and entities in detail in the upcoming articles.

Summary

From this article, we have learned how to create a LUIS model with the help of a LUIS account and a LUIS “Authoring resource” in Azure. I hope this article is useful for all beginners.

Cognitive Services : Convert Text to Speech in multiple languages using Asp.Net Core & C#


Introduction

In this article, we are going to learn how to convert text to speech in multiple languages using one of the important Cognitive Services APIs, the Microsoft Text to Speech API ( one of the APIs in the Speech API family ). The Text to Speech (TTS) API of the Speech service converts input text into natural-sounding speech (also called speech synthesis). It supports text in multiple languages and gender-based voices (male or female).

You can also refer to the following articles on Cognitive Services.

Prerequisites

  1. Subscription key ( Azure Portal ) or Trial Subscription Key
  2. Visual Studio 2015 or 2017

Convert Text to Speech API

First, we need to log into the Azure Portal with our Azure credentials. Then we need to create an Azure Speech Service API in the Azure portal.
So please click on “Create a resource” on the top left menu and search for “Speech” in the search bar of the right side window or the top of the Azure Marketplace.

Now we can see a few speech-related “AI + Machine Learning” categories listed in the search result.

Click on the “Create” button to create the Speech Service API.

Provision a Speech Service API ( Text to Speech ) Subscription Key

After clicking “Create”, another window will open. There, we need to provide the basic information about the Speech API.

Name : Name of the Speech Service API ( Eg. TextToSpeechApp ).

Subscription : We can select our Azure subscription for Speech API creation.

Location : We can select the location of the resource group. The best thing is we can choose a location closest to our customers.

Pricing tier : Select an appropriate pricing tier for our requirement.

Resource group : We can create a new resource group or choose from an existing one ( we created a new resource group named “SpeechResource” ).

Now click on “TextToSpeechApp” in the dashboard page and it will redirect to the detailed page of TextToSpeechApp ( “Overview” ). Here, we can see the “Keys” ( subscription key details ) menu in the left side panel. Click on the “Keys” menu and it will open the subscription key details. We can use either of the subscription keys, or regenerate the given keys, for text to speech conversion using the Microsoft Speech Service API.

Authentication

Token-based ( bearer ) authentication is required for text to speech conversion using the Speech Service API, so we need to create an authentication token using the “TextToSpeechApp” subscription keys. The following endpoint helps create an authentication token for text to speech conversion. Each access token is valid for 10 minutes, and after that we need to create a new one for the next request.

https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken
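
The article does not list the token request code, so here is a minimal sketch of fetching a token, assuming the “TextToSpeechApp” subscription key created above; the class name is illustrative:

using System.Net.Http;
using System.Threading.Tasks;

public class AuthenticationService
{
    private const string TokenEndpoint =
        "https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken";

    // Exchanges the subscription key for a bearer token ( valid for 10 minutes ).
    public async Task<string> GetTokenAsync(string subscriptionKey)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

            // An empty POST body is sufficient; the response body is the raw token text.
            var response = await client.PostAsync(TokenEndpoint, null);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}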

Speech Synthesis Markup Language ( SSML )

The Speech Synthesis Markup Language (SSML) is an XML-based markup language that provides a way to control the pronunciation and rhythm of text-to-speech output.

SSML Format :

<speak version='1.0' xml:lang='en-US'><voice xml:lang='ta-IN' xml:gender='Female' name='Microsoft Server Speech Text to Speech Voice (ta-IN, Valluvar)'>
        நன்றி
</voice></speak>
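
The service class shown later calls a “GenerateSsml” helper that the article does not list; a minimal sketch, assuming it simply fills the template above ( in real use the text should be XML-escaped ):

private string GenerateSsml(string locale, string gender, string name, string content)
{
    // Fill the SSML template with the requested voice and text.
    return "<speak version='1.0' xml:lang='en-US'>" +
           $"<voice xml:lang='{locale}' xml:gender='{gender}' name='{name}'>" +
           content +
           "</voice></speak>";
}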

How to make a request

This is a very simple process: the HTTP request is made using the POST method, which means we need to pass the data in the request body, either as plain text or as an SSML document. As per the documentation, in most cases we need to use an SSML body in the request. The maximum length of the HTTP request body is 1024 characters, and the following is the endpoint for our HTTP POST method.

https://westus.tts.speech.microsoft.com/cognitiveservices/v1

The following HTTP headers are required in the request.

Pic source : https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-text-to-speech

Index.html

The following HTML contains the binding methodology that we have used in our application, using the latest Tag Helpers of ASP.NET Core.

Model

The following model contains the Speech Model information.

using Microsoft.AspNetCore.Mvc.Rendering;
using System.Collections.Generic;
using System.ComponentModel;

namespace TextToSpeechApp.Models
{
    public class SpeechModel
    {
        public string Content { get; set; }

        public string SubscriptionKey { get; set; } = "< Subscription Key >";

        [DisplayName("Language Selection :")]
        public string LanguageCode { get; set; } = "NA";

        public List<SelectListItem> LanguagePreference { get; set; } = new List<SelectListItem>
        {
        new SelectListItem { Value = "NA", Text = "-Select-" },
        new SelectListItem { Value = "en-US", Text = "English (United States)"  },
        new SelectListItem { Value = "en-IN", Text = "English (India)"  },
        new SelectListItem { Value = "ta-IN", Text = "Tamil (India)"  },
        new SelectListItem { Value = "hi-IN", Text = "Hindi (India)"  },
        new SelectListItem { Value = "te-IN", Text = "Telugu (India)"  }
        };
    }
}

Interface

The “ITextToSpeech” interface contains one signature for converting text to speech based on the given input. We have registered this interface in the ASP.NET Core “Startup.cs” class using “AddTransient” ( a sketch of this registration follows the interface ).

using System.Threading.Tasks;

namespace TextToSpeechApp.BusinessLayer.Interface
{
    public interface ITextToSpeech
    {
        Task<byte[]> TranslateText(string token, string key, string content, string lang);
    }
}
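
The registration itself is not shown in the article; a minimal sketch of the “Startup.cs” wiring, assuming the implementing class is named “TextToSpeechService”:

using Microsoft.Extensions.DependencyInjection;
using TextToSpeechApp.BusinessLayer.Interface;

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();

    // A new service instance is created every time one is requested.
    services.AddTransient<ITextToSpeech, TextToSpeechService>();
}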

Text to Speech API Service

We can add a valid Speech API subscription key and authentication token into the following code.

        /// <summary>
        /// Translate text to speech
        /// </summary>
        /// <param name="token">Authentication token</param>
        /// <param name="key">Azure subscription key</param>
        /// <param name="content">Text content for speech</param>
        /// <param name="lang">Speech conversion language</param>
        /// <returns>The synthesized speech audio as a byte array</returns>
        public async Task<byte[]> TranslateText(string token, string key, string content, string lang)
        {
            //Request url for the speech api.
            string uri = "https://westus.tts.speech.microsoft.com/cognitiveservices/v1";
            //Generate Speech Synthesis Markup Language (SSML) 
            var requestBody = this.GenerateSsml(lang, "Female", this.ServiceName(lang), content);

            using (var client = new HttpClient())
            using (var request = new HttpRequestMessage())
            {
                request.Method = HttpMethod.Post;
                request.RequestUri = new Uri(uri);
                request.Headers.Add("Ocp-Apim-Subscription-Key", key);
                request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
                request.Headers.Add("X-Microsoft-OutputFormat", "audio-16khz-64kbitrate-mono-mp3");
                request.Content = new StringContent(requestBody, Encoding.UTF8, "text/plain");
                request.Content.Headers.Remove("Content-Type");
                request.Content.Headers.Add("Content-Type", "application/ssml+xml");
                request.Headers.Add("User-Agent", "TexttoSpeech");
                var response = await client.SendAsync(request);
                var httpStream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false);

                using (Stream stream = httpStream)
                using (MemoryStream ms = new MemoryStream())
                {
                    // Copy the audio response stream into memory in 1 KB chunks.
                    byte[] buf = new byte[1024];
                    int count;
                    while ((count = stream.Read(buf, 0, buf.Length)) > 0)
                    {
                        ms.Write(buf, 0, count);
                    }

                    return ms.ToArray();
                }
            }
        }
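
The “ServiceName” helper referenced in the code above is also not listed in the article; it maps a language code to a platform voice name. A minimal sketch: only the ta-IN entry is taken from the SSML sample earlier, and the remaining codes would map to their corresponding entries in the official voice list.

private string ServiceName(string lang)
{
    switch (lang)
    {
        case "ta-IN":
            return "Microsoft Server Speech Text to Speech Voice (ta-IN, Valluvar)";
        default:
            // Placeholder: look up the voice name for "lang" in the official voice list.
            return "<voice name for " + lang + ">";
    }
}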

Output

The given text is converted into speech in the desired language selected from a drop-down list, using the Microsoft Speech API.

Summary

From this article, we have learned how to convert text to speech in multiple languages using ASP.NET Core & C#, as per the API documentation, using one of the important Cognitive Services APIs ( the Text to Speech API, part of the Speech API ). I hope this article is useful for all Azure Cognitive Services API beginners.

Cognitive Services : Translate Text into multiple languages using Asp.Net Core & C#


Introduction

In this article, we are going to learn how to translate text into multiple languages using one of the important Cognitive Services APIs, the Microsoft Translator Text API ( one of the APIs in the Language API family ). It is a simple cloud-based machine translation service, and we can test it through a simple REST API call. Microsoft uses a new standard for high-quality AI-powered machine translations known as Neural Machine Translation (NMT).

Pic source : https://www.microsoft.com/en-us/translator/business/machine-translation/#whatmachine

You can also refer to the following articles on Cognitive Services.

Prerequisites

  1. Subscription key ( Azure Portal ).
  2. Visual Studio 2015 or 2017

Translator Text API

First, we need to log into the Azure Portal with our Azure credentials. Then we need to create an Azure Translator Text API in the Azure portal. So please click on “Create a resource” on the top left menu and search for “Translator Text” in the search bar of the right side window or the top of the Azure Marketplace.

Click on the “Create” button to create the Translator Text API.

Provision a Translator Text Subscription Key

After clicking “Create”, another window will open. There, we need to provide the basic information about the Translator Text API.

Name : Name of the Translator Text API ( Eg. TranslatorTextApp ).

Subscription : We can select our Azure subscription for Translator Text API creation.

Location : We can select the location of the resource group. The best thing is we can choose a location closest to our customers.

Pricing tier : Select an appropriate pricing tier for our requirement.

Resource group : We can create a new resource group or choose from an existing one.

Now click on “TranslatorTextApp” in the dashboard page and it will redirect to the detailed page of TranslatorTextApp ( “Overview” ). Here, we can see the “Keys” ( subscription key details ) menu in the left side panel. Click on the “Keys” menu and it will open the subscription key details. We can use either of the subscription keys, or regenerate the given keys, for text translation using the Microsoft Translator Text API.

Language Request URL

The following request URL gets the set of languages currently supported by the other operations of the Microsoft Translator Text API.

https://api.cognitive.microsofttranslator.com/languages?api-version=3.0

Endpoint

The “api-version” parameter specifies the version of the API requested by the client, and its value must be 3.0. We can also include query parameters and request headers with the following endpoint used in our application.

https://api.cognitive.microsofttranslator.com/translate?api-version=3.0

The mandatory parameters in the query string are “api-version” and “to”. The “api-version” value must be “3.0” as per the current documentation, and “to” is the language code parameter used for translating the entered text into the desired language.

The mandatory request headers are the authorization header and “Content-Type”. We can pass an access token in the authorization header, but the simplest way is to pass our Azure secret key to the Translator service using the request header “Ocp-Apim-Subscription-Key”.
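
For example, to translate the entered text into Hindi ( the language used in the sample response later in this article ), the request URI becomes:

https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=hi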

Index.html

The following HTML contains the binding methodology that we have used in our application, using the latest Tag Helpers of ASP.NET Core.

site.js

The following AJAX call is triggered on each index change of the language selection drop-down list.

// Write your JavaScript code.
$(function () {
    $(document)
        .on('change', '#ddlLangCode', function () {
            var languageCode = $(this).val();
            var enterText = $("#enterText").val();
            if (1 <= $("#enterText").val().trim().length && languageCode != "NA") {

                $('#enterText').removeClass('redBorder');

                var url = '/Home/Index';
                var dataToSend = { "LanguageCode": languageCode, "Text": enterText };
                dataType: "json",
                    $.ajax({
                        url: url,
                        data: dataToSend,
                        type: 'POST',
                        success: function (response) {
                            //update control on View
                            var result = JSON.parse(response);
                            var translatedText = result[0].translations[0].text;
                            $('#translatedText').val(translatedText);
                        }
                    })
            }
            else {
                $('#enterText').addClass('redBorder');
                $('#translatedText').val("");
            }
        });
});

Interface

The “ITranslateText” interface contains one signature for translating text content based on the given input. We have registered this interface in the ASP.NET Core “Startup.cs” class using “AddTransient” ( a sketch of a consuming controller follows the interface ).

using System.Threading.Tasks;

namespace TranslateTextApp.Business_Layer.Interface
{
    public interface ITranslateText
    {
        Task<string> Translate(string uri, string text, string key);
    }
}
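
The article does not show the controller action that the AJAX call above posts to, so here is a minimal sketch of how it could look, assuming constructor injection of “ITranslateText”; the controller wiring is illustrative, not taken from the original source:

using Microsoft.AspNetCore.Mvc;
using System.Threading.Tasks;
using TranslateTextApp.Business_Layer.Interface;

namespace TranslateTextApp.Controllers
{
    public class HomeController : Controller
    {
        private readonly ITranslateText _translateText;

        public HomeController(ITranslateText translateText)
        {
            _translateText = translateText;
        }

        [HttpPost]
        public async Task<IActionResult> Index(string languageCode, string text)
        {
            // Build the endpoint with the mandatory "api-version" and "to" parameters.
            string uri = "https://api.cognitive.microsofttranslator.com/translate" +
                         "?api-version=3.0&to=" + languageCode;

            // The service returns indented JSON; the AJAX success callback
            // parses it with JSON.parse.
            string result = await _translateText.Translate(uri, text, "<Subscription Key>");
            return Json(result);
        }
    }
}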

Translator Text API Service

We can add a valid Translator Text API subscription key into the following code.

using Newtonsoft.Json;
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using TranslateTextApp.Business_Layer.Interface;

namespace TranslateTextApp.Business_Layer
{
    public class TranslateTextService : ITranslateText
    {
        /// <summary>
        /// Translate the given text into the selected language.
        /// </summary>
        /// <param name="uri">Request uri</param>
        /// <param name="text">The text given for translation</param>
        /// <param name="key">Subscription key</param>
        /// <returns>The translation response as indented JSON</returns>
        public async Task<string> Translate(string uri, string text, string key)
        {
            object[] body = new object[] { new { Text = text } };
            var requestBody = JsonConvert.SerializeObject(body);
            
            using (var client = new HttpClient())
            using (var request = new HttpRequestMessage())
            {
                request.Method = HttpMethod.Post;
                request.RequestUri = new Uri(uri);
                request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
                request.Headers.Add("Ocp-Apim-Subscription-Key", key);

                var response = await client.SendAsync(request);
                var responseBody = await response.Content.ReadAsStringAsync();
                var result = JsonConvert.SerializeObject(JsonConvert.DeserializeObject(responseBody), Formatting.Indented);
                
                return result;
            }
        }
    }
}

API Response – Based on the given text

The successful JSON response.

[
  {
    "detectedLanguage": {
      "language": "en",
      "score": 1.0
    },
    "translations": [
      {
        "text": "सफलता का कोई शार्टकट नहीं होता",
        "to": "hi"
      }
    ]
  }
]

Download

Output

The given text is translated into the desired language selected from a drop-down list, using the Microsoft Translator API.

Summary

From this article, we have learned how to translate text ( typed in English ) into different languages, as per the API documentation, using one of the important Cognitive Services APIs ( the Translator Text API, part of the Language API ). I hope this article is useful for all Azure Cognitive Services API beginners.

Cognitive Services – Optical Character Recognition (OCR) from an image using Computer Vision API And C#


Introduction

In our previous article, we learned how to analyze an image using the Computer Vision API with ASP.NET Core & C#. In this article, we are going to learn how to extract printed text, also known as optical character recognition (OCR), from an image using one of the important Cognitive Services APIs, the Computer Vision API. We need a valid subscription key for accessing this feature.

Optical Character Recognition (OCR)

Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream.

Prerequisites

  1. Subscription key ( Azure Portal ).
  2. Visual Studio 2015 or 2017

Subscription Key Free Trial

The Computer Vision API requires a valid subscription key for processing image information. If you don’t have a Microsoft Azure subscription and want to test the API, don’t worry!! Microsoft gives a 7-day trial subscription key ( click here ). We can use that subscription key for testing purposes. If you sign up using the Computer Vision free trial, your subscription keys are valid for the westcentralus region ( https://westcentralus.api.cognitive.microsoft.com ).

Requirements

These are the major requirements mentioned in the Microsoft docs.

  1. Supported input methods: Raw image binary in the form of an application/octet-stream, or an image URL.
  2. Supported image formats: JPEG, PNG, GIF, BMP.
  3. Image file size: Less than 4 MB.
  4. Image dimension: Greater than 50 x 50 pixels.

Computer Vision API

First, we need to log into the Azure Portal with our Azure credentials. Then we need to create an Azure Computer Vision Subscription Key in the Azure portal.

Click on “Create a resource” on the left side menu and it will open the “Azure Marketplace”. There, we can see the list of services. Click “AI + Machine Learning”, then click on “Computer Vision”.

Provision a Computer Vision Subscription Key

After clicking “Computer Vision”, another section will open. There, we need to provide the basic information about the Computer Vision API.

Name : Name of the Computer Vision API ( Eg. OCRApp ).

Subscription : We can select our Azure subscription for Computer Vision API creation.

Location : We can select the location of the resource group. The best thing is we can choose a location closest to our customers.

Pricing tier : Select an appropriate pricing tier for our requirement.

Resource group : We can create a new resource group or choose from an existing one.

Now click on “OCRApp” in the dashboard page and it will redirect to the details page of OCRApp ( “Overview” ). Here, we can see the Manage keys ( subscription key details ) & Endpoint details. Click on the “Show access keys” link and it will redirect to another page.

We can use either of the subscription keys, or regenerate the given keys, for extracting image information using the Computer Vision API.

Endpoint

As we mentioned above, the location is the same for all free trial subscription keys. In Azure, we can choose from the available locations while creating a Computer Vision API. We have used the following endpoint in our code.

https://westus.api.cognitive.microsoft.com/vision/v1.0/ocr

View Model

The following model will contain the API image response information.

using System.Collections.Generic;

namespace OCRApp.Models
{
    public class Word
    {
        public string boundingBox { get; set; }
        public string text { get; set; }
    }

    public class Line
    {
        public string boundingBox { get; set; }
        public List<Word> words { get; set; }
    }

    public class Region
    {
        public string boundingBox { get; set; }
        public List<Line> lines { get; set; }
    }

    public class ImageInfoViewModel
    {
        public string language { get; set; }
        public string orientation { get; set; }
        public int textAngle { get; set; }
        public List<Region> regions { get; set; }
    }
}

Request URL

We can add additional parameters or request parameters ( optional ) in our API “endPoint” and it will provide more information for the given image.

https://[location].api.cognitive.microsoft.com/vision/v1.0/ocr[?language][&detectOrientation]

Request parameters

The following optional parameters are available in the Computer Vision OCR API.

  1. language
  2. detectOrientation

language

The service can detect 26 languages of text in an image, and the parameter contains “unk” as the default value, which means the service will auto-detect the language of the text in the image.

The following are the supported languages mentioned in the Microsoft API documentation.

  1. unk (AutoDetect)
  2. en (English)
  3. zh-Hans (ChineseSimplified)
  4. zh-Hant (ChineseTraditional)
  5. cs (Czech)
  6. da (Danish)
  7. nl (Dutch)
  8. fi (Finnish)
  9. fr (French)
  10. de (German)
  11. el (Greek)
  12. hu (Hungarian)
  13. it (Italian)
  14. ja (Japanese)
  15. ko (Korean)
  16. nb (Norwegian)
  17. pl (Polish)
  18. pt (Portuguese)
  19. ru (Russian)
  20. es (Spanish)
  21. sv (Swedish)
  22. tr (Turkish)
  23. ar (Arabic)
  24. ro (Romanian)
  25. sr-Cyrl (SerbianCyrillic)
  26. sr-Latn (SerbianLatin)
  27. sk (Slovak)

detectOrientation

This parameter detects the text orientation in the image. To use this feature, we need to add detectOrientation=true to the service URL or request URL, as discussed earlier.
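
For example, the request URL with both optional parameters set ( the same combination used in the service code below ) looks like this:

https://westus.api.cognitive.microsoft.com/vision/v1.0/ocr?language=unk&detectOrientation=true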

Vision API Service

The following code processes the image using the Computer Vision API, and the response is mapped into the “ImageInfoViewModel”. We can add a valid Computer Vision API subscription key into the following code.

using Newtonsoft.Json;
using OCRApp.Models;
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace OCRApp.Business_Layer
{
    public class VisionApiService
    {
        // Replace <Subscription Key> with your valid subscription key.
        const string subscriptionKey = "<Subscription Key>";

        // You must use the same region in your REST call as you used to
        // get your subscription keys. The paid subscription keys you will get
        // it from microsoft azure portal.
        // Free trial subscription keys are generated in the westcentralus region.
        // If you use a free trial subscription key, you shouldn't need to change
        // this region.
        const string endPoint =
            "https://westus.api.cognitive.microsoft.com/vision/v1.0/ocr";

        /// <summary>
        /// Gets the text visible in the specified image file by using
        /// the Computer Vision REST API.
        /// </summary>
        public async Task<string> MakeOCRRequest()
        {
            string imageFilePath = @"C:\Users\rajeesh.raveendran\Desktop\bill.jpg";
            var errors = new List<string>();
            string extractedResult = "";
            ImageInfoViewModel responeData = new ImageInfoViewModel();

            try
            {
                HttpClient client = new HttpClient();

                // Request headers.
                client.DefaultRequestHeaders.Add(
                    "Ocp-Apim-Subscription-Key", subscriptionKey);

                // Request parameters.
                string requestParameters = "language=unk&detectOrientation=true";

                // Assemble the URI for the REST API Call.
                string uri = endPoint + "?" + requestParameters;

                HttpResponseMessage response;

                // Request body. Posts a locally stored JPEG image.
                byte[] byteData = GetImageAsByteArray(imageFilePath);

                using (ByteArrayContent content = new ByteArrayContent(byteData))
                {
                    // This example uses content type "application/octet-stream".
                    // The other content types you can use are "application/json"
                    // and "multipart/form-data".
                    content.Headers.ContentType =
                        new MediaTypeHeaderValue("application/octet-stream");

                    // Make the REST API call.
                    response = await client.PostAsync(uri, content);
                }

                // Get the JSON response.
                string result = await response.Content.ReadAsStringAsync();

                // If the call succeeds, the response is processed further.
                if (response.IsSuccessStatusCode)
                {
                    // The JSON response is mapped into the respective view model.
                    responeData = JsonConvert.DeserializeObject<ImageInfoViewModel>(result,
                        new JsonSerializerSettings
                        {
                            NullValueHandling = NullValueHandling.Include,
                            Error = delegate (object sender, Newtonsoft.Json.Serialization.ErrorEventArgs earg)
                            {
                                errors.Add(earg.ErrorContext.Member.ToString());
                                earg.ErrorContext.Handled = true;
                            }
                        }
                    );

                    var linesCount = responeData.regions[0].lines.Count;
                    for (int i = 0; i < linesCount; i++)
                    {
                        var wordsCount = responeData.regions[0].lines[i].words.Count;
                        for (int j = 0; j < wordsCount; j++)
                        {
                            //Appending all the lines content into one.
                            extractedResult += responeData.regions[0].lines[i].words[j].text + " ";
                        }
                        extractedResult += Environment.NewLine;
                    }

                }
            }
            catch (Exception e)
            {
                Console.WriteLine("\n" + e.Message);
            }
            return extractedResult;
        }

        /// <summary>
        /// Returns the contents of the specified file as a byte array.
        /// </summary>
        /// <param name="imageFilePath">The image file to read.</param>
        /// <returns>The byte array of the image data.</returns>
        static byte[] GetImageAsByteArray(string imageFilePath)
        {
            using (FileStream fileStream =
                new FileStream(imageFilePath, FileMode.Open, FileAccess.Read))
            {
                BinaryReader binaryReader = new BinaryReader(fileStream);
                return binaryReader.ReadBytes((int)fileStream.Length);
            }
        }
    }

}
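
A quick sketch of invoking the service, for example from an async console entry point or controller action; the wiring is illustrative:

public static async Task PrintExtractedTextAsync()
{
    var service = new VisionApiService();

    // Runs OCR on the image path hard-coded inside MakeOCRRequest.
    string extractedText = await service.MakeOCRRequest();
    Console.WriteLine(extractedText);
}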

API Response – Based on the given Image

The successful JSON response.

{
  "language": "en",
  "orientation": "Up",
  "textAngle": 0,
  "regions": [
    {
      "boundingBox": "306,69,292,206",
      "lines": [
        {
          "boundingBox": "306,69,292,24",
          "words": [
            {
              "boundingBox": "306,69,17,19",
              "text": "\"I"
            },
            {
              "boundingBox": "332,69,45,19",
              "text": "Will"
            },
            {
              "boundingBox": "385,69,88,24",
              "text": "Always"
            },
            {
              "boundingBox": "482,69,94,19",
              "text": "Choose"
            },
            {
              "boundingBox": "585,74,13,14",
              "text": "a"
            }
          ]
        },
        {
          "boundingBox": "329,100,246,24",
          "words": [
            {
              "boundingBox": "329,100,56,24",
              "text": "Lazy"
            },
            {
              "boundingBox": "394,100,85,19",
              "text": "Person"
            },
            {
              "boundingBox": "488,100,24,19",
              "text": "to"
            },
            {
              "boundingBox": "521,100,32,19",
              "text": "Do"
            },
            {
              "boundingBox": "562,105,13,14",
              "text": "a"
            }
          ]
        },
        {
          "boundingBox": "310,131,284,19",
          "words": [
            {
              "boundingBox": "310,131,95,19",
              "text": "Difficult"
            },
            {
              "boundingBox": "412,131,182,19",
              "text": "Job....Because"
            }
          ]
        },
        {
          "boundingBox": "326,162,252,24",
          "words": [
            {
              "boundingBox": "326,162,31,19",
              "text": "He"
            },
            {
              "boundingBox": "365,162,44,19",
              "text": "Will"
            },
            {
              "boundingBox": "420,162,52,19",
              "text": "Find"
            },
            {
              "boundingBox": "481,167,28,14",
              "text": "an"
            },
            {
              "boundingBox": "520,162,58,24",
              "text": "Easy"
            }
          ]
        },
        {
          "boundingBox": "366,193,170,24",
          "words": [
            {
              "boundingBox": "366,193,52,24",
              "text": "way"
            },
            {
              "boundingBox": "426,193,24,19",
              "text": "to"
            },
            {
              "boundingBox": "459,193,33,19",
              "text": "Do"
            },
            {
              "boundingBox": "501,193,35,19",
              "text": "It!\""
            }
          ]
        },
        {
          "boundingBox": "462,256,117,19",
          "words": [
            {
              "boundingBox": "462,256,37,19",
              "text": "Bill"
            },
            {
              "boundingBox": "509,256,70,19",
              "text": "Gates"
            }
          ]
        }
      ]
    }
  ]
}

Output

Optical Character Recognition (OCR) from an image using the Computer Vision API.

Summary

From this article, we have learned optical character recognition (OCR) from an image using one of the important Cognitive Services APIs ( the Computer Vision API ). I hope this article is useful for all Azure Cognitive Services API beginners.

Cognitive Services : Analyze an Image Using Computer Vision API With ASP.Net Core & C#


Introduction

One of the important Cognitive Services APIs is the Computer Vision API, which helps us access advanced algorithms for processing images and returning valuable information. For example, by uploading an image or specifying an image URL, the Microsoft Computer Vision algorithms can analyze visual content in different ways based on inputs and user choices, so we get various information about the given image. We need a valid subscription key for accessing this feature.

Prerequisites

  1. Subscription key ( Azure Portal ).
  2. Visual Studio 2015 or 2017

Subscription Key Free Trail

The Computer Vision API requires a valid subscription key for processing image information. If you don’t have a Microsoft Azure subscription and want to test the API, don’t worry!! Microsoft gives a 7-day trial subscription key ( click here ). We can use that subscription key for testing purposes. If you sign up using the Computer Vision free trial, your subscription keys are valid for the westcentralus region ( https://westcentralus.api.cognitive.microsoft.com ).

Requirements

These are the major requirements mentioned in the Microsoft docs.

  1. Supported input methods: Raw image binary in the form of an application/octet-stream, or an image URL.
  2. Supported image formats: JPEG, PNG, GIF, BMP.
  3. Image file size: Less than 4 MB.
  4. Image dimension: Greater than 50 x 50 pixels.

Computer Vision API

First, we need to log into the Azure Portal with our Azure credentials. Then we need to create an Azure Computer Vision Subscription Key in the Azure portal.

Click on “Create a resource” on the left side menu and it will open the “Azure Marketplace”. There, we can see the list of services. Click “AI + Machine Learning”, then click on “Computer Vision”.

Provision a Computer Vision Subscription Key

After clicking “Computer Vision”, another section will open. There, we need to provide the basic information about the Computer Vision API.

Name : Name of the Computer Vision API.

Subscription : We can select our Azure subscription for Computer Vision API creation.

Location : We can select the location of the resource group. The best thing is we can choose a location closest to our customers.

Pricing tier : Select an appropriate pricing tier for our requirement.

Resource group : We can create a new resource group or choose from an existing one.

Now click on “MenothVision” in the dashboard page and it will redirect to the details page of MenothVision ( “Overview” ). Here, we can see the Manage keys ( subscription key details ) & Endpoint details. Click on the “Show access keys” link and it will redirect to another page.

We can use either of the subscription keys, or regenerate the given keys, for getting image information using the Computer Vision API.

Endpoint

As we mentioned above, the location is the same for all free trial subscription keys. In Azure, we can choose from the available locations while creating a Computer Vision API. We have used the following endpoint in our code.

https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze

View Model

The following model will contain the API image response information.

using System.Collections.Generic;

namespace VisionApiDemo.Models
{
    public class Detail
    {
        public List<object> celebrities { get; set; }
    }

    public class Category
    {
        public string name { get; set; }
        public double score { get; set; }
        public Detail detail { get; set; }
    }

    public class Caption
    {
        public string text { get; set; }
        public double confidence { get; set; }
    }

    public class Description
    {
        public List<string> tags { get; set; }
        public List<Caption> captions { get; set; }
    }

    public class Color
    {
        public string dominantColorForeground { get; set; }
        public string dominantColorBackground { get; set; }
        public List<string> dominantColors { get; set; }
        public string accentColor { get; set; }
        public bool isBwImg { get; set; }
    }

    public class Metadata
    {
        public int height { get; set; }
        public int width { get; set; }
        public string format { get; set; }
    }

    public class ImageInfoViewModel
    {
        public List<Category> categories { get; set; }
        public Description description { get; set; }
        public Color color { get; set; }
        public string requestId { get; set; }
        public Metadata metadata { get; set; }
    }
}

Request URL

We can add additional parameters or request parameters ( optional ) in our API “endPoint” and it will provide more information for the given image.

https://[location].api.cognitive.microsoft.com/vision/v1.0/analyze[?visualFeatures][&details][&language]

Request parameters

Currently, we can use three optional parameters.

  1. visualFeatures
  2. details
  3. language

visualFeatures

The name itself clearly indicates that it returns the visual features of the given image. If we add multiple values to the visualFeatures parameter, we separate them with commas. The following visualFeatures values are available in the API.

  • Categories
  • Tags
  • Description
  • Faces
  • ImageType
  • Color
  • Adult

details

This parameter returns domain-specific information, whether Celebrities or Landmarks.

Celebrities : If the detected image is of a celebrity, the service identifies it.

Landmarks : If the detected image is of a landmark, the service identifies it.

language

The service returns recognition results in the specified language. The default language is English.

Supported languages.

  • en – English, Default.
  • zh – Simplified Chinese
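
For example, a request combining all three optional parameters looks like this ( the visualFeatures list matches the service code below; the details and language values are illustrative ):

https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories,Description,Color&details=Celebrities&language=en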

Vision API Service

The following code processes the image using the Computer Vision API, and the response is mapped into the “ImageInfoViewModel”. We can add a valid Computer Vision API subscription key into the following code.

using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using VisionApiDemo.Models;

namespace VisionApiDemo.Business_Layer
{
    public class VisionApiService
    {
        const string subscriptionKey = "<Enter your subscriptionKey>";
        const string endPoint =
            "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze";

        public async Task<ImageInfoViewModel> MakeAnalysisRequest()
        {
            string imageFilePath = @"C:\Users\Rajeesh.raveendran\Desktop\Rajeesh.jpg";
            var errors = new List<string>();
            ImageInfoViewModel responeData = new ImageInfoViewModel();
            try
            {
                HttpClient client = new HttpClient();

                // Request headers.
                client.DefaultRequestHeaders.Add(
                    "Ocp-Apim-Subscription-Key", subscriptionKey);

                // Request parameters. A third optional parameter is "details".
                string requestParameters =
                    "visualFeatures=Categories,Description,Color";

                // Assemble the URI for the REST API Call.
                string uri = endPoint + "?" + requestParameters;

                HttpResponseMessage response;

                // Request body. Posts a locally stored JPEG image.
                byte[] byteData = GetImageAsByteArray(imageFilePath);

                using (ByteArrayContent content = new ByteArrayContent(byteData))
                {
                    // This example uses content type "application/octet-stream".
                    // The other content types you can use are "application/json"
                    // and "multipart/form-data".
                    content.Headers.ContentType =
                        new MediaTypeHeaderValue("application/octet-stream");

                    // Make the REST API call.
                    response = await client.PostAsync(uri, content);
                }

                // Get the JSON response.
                var result = await response.Content.ReadAsStringAsync();

                if (response.IsSuccessStatusCode)
                {
                    responeData = JsonConvert.DeserializeObject<ImageInfoViewModel>(result,
                        new JsonSerializerSettings
                        {
                            NullValueHandling = NullValueHandling.Include,
                            Error = delegate (object sender, Newtonsoft.Json.Serialization.ErrorEventArgs earg)
                            {
                                errors.Add(earg.ErrorContext.Member.ToString());
                                earg.ErrorContext.Handled = true;
                            }
                        }
                    );
                }
            }
            catch (Exception e)
            {
                Console.WriteLine("\n" + e.Message);
            }

            return responeData;
        }

        static byte[] GetImageAsByteArray(string imageFilePath)
        {
            using (FileStream fileStream =
                new FileStream(imageFilePath, FileMode.Open, FileAccess.Read))
            {
                BinaryReader binaryReader = new BinaryReader(fileStream);
                return binaryReader.ReadBytes((int)fileStream.Length);
            }
        }
    }
}
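
Once the response is mapped, the view model can be consumed directly; an illustrative sketch reading the top caption and the dominant colors:

public static async Task PrintImageSummaryAsync()
{
    var model = await new VisionApiService().MakeAnalysisRequest();

    if (model?.description?.captions != null && model.description.captions.Count > 0)
    {
        // E.g. "a group of people sitting posing for the camera" for the sample below.
        Console.WriteLine("Caption: " + model.description.captions[0].text);
    }

    if (model?.color != null)
    {
        Console.WriteLine("Dominant colors: " + string.Join(", ", model.color.dominantColors));
    }
}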

API Response – Based on the given Image

The successful JSON response.

{
  "categories": [
    {
      "name": "people_group",
      "score": 0.6171875,
      "detail": {
        "celebrities": []
      }
    },
    {
      "name": "people_many",
      "score": 0.359375,
      "detail": {
        "celebrities": []
      }
    }
  ],
  "description": {
    "tags": [
      "person",
      "sitting",
      "indoor",
      "posing",
      "group",
      "people",
      "man",
      "photo",
      "woman",
      "child",
      "front",
      "young",
      "table",
      "cake",
      "large",
      "holding",
      "standing",
      "bench",
      "room",
      "blue"
    ],
    "captions": [
      {
        "text": "a group of people sitting posing for the camera",
        "confidence": 0.9833507086594954
      }
    ]
  },
  "color": {
    "dominantColorForeground": "White",
    "dominantColorBackground": "White",
    "dominantColors": [
      "White",
      "Black",
      "Red"
    ],
    "accentColor": "AD1E3E",
    "isBwImg": false
  },
  "requestId": "89f21ccf-cb65-4107-8620-b920a03e5f03",
  "metadata": {
    "height": 346,
    "width": 530,
    "format": "Jpeg"
  }
}

Output

Image information captured using the Computer Vision API. For demo purposes, I have taken only a few data points, even though you can get more information about the image.

Summary

From this article, we have learned how to implement one of the important Cognitive Services APIs ( the Computer Vision API ). I hope this article is useful for all Azure Cognitive Services API beginners.