Simon Bisson
Contributor

Exploring Azure Spatial Analysis containers

Use containers and Azure IoT Hub to bring Cognitive Services machine learning to your own servers on the edge.

Azure’s Cognitive Services are a quick and easy way to add machine learning to many different types of applications. Available as REST APIs, they can be hooked into your code with simple asynchronous calls, either directly over HTTP or through dedicated SDKs and libraries. It doesn’t matter what language or platform you’re building on, as long as your code can make HTTP calls and parse JSON documents.
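
For example, here’s a minimal Python sketch of calling the Computer Vision analyze endpoint directly over REST. The endpoint, key, and image URL are placeholders for your own resource’s values, not part of any published sample:

# Minimal sketch: call the Computer Vision v3.1 analyze endpoint over REST.
# The endpoint, key, and image URL below are placeholders.
import requests

endpoint = "https://YOUR-RESOURCE.cognitiveservices.azure.com"
key = "YOUR-SUBSCRIPTION-KEY"

response = requests.post(
    f"{endpoint}/vision/v3.1/analyze",
    params={"visualFeatures": "Description,Objects"},
    headers={"Ocp-Apim-Subscription-Key": key},
    json={"url": "https://example.com/factory-floor.jpg"},
)
response.raise_for_status()
print(response.json())  # a JSON document describing the image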

Not all applications have the luxury of a low-latency connection to Azure. That’s why Microsoft is rolling out an increasing number of its Cognitive Services as containers, for use on appropriate hardware that may have only intermittent connectivity. That usually means a system with a relatively high-end GPU, as the underlying neural networks need a lot of compute for inferencing. Even so, hardware like Intel’s NUC9 paired with an Nvidia Tesla-series GPU shows that such a system can be very small indeed.

Packaging Azure Cognitive Services

At the heart of the Cognitive Services suite are Microsoft’s computer vision models. They handle everything from image analysis to object detection and tagging to character and handwriting recognition. They’re useful tools that can form the basis of complex applications, feeding either serverless Azure Functions or no-code Power Apps.

Microsoft has taken some of its computer vision models and packaged them in containers for use on edge hardware, where low latency matters or where regulations require data to be held inside your own data center. That means you can use the OCR container to capture data from pharmacies securely, or use the spatial analysis container to deliver a secure and safe work environment.

With businesses struggling to manage social distancing and safe working conditions during the current pandemic, tools like the Cognitive Services spatial analysis container are especially important. With existing camera networks or relatively low-cost devices, such as the Azure Kinect camera, you can build systems that identify people and show whether they are working safely: keeping away from dangerous equipment, maintaining a safe separation, or staying in a well-ventilated space. All you need is an RTSP (Real Time Streaming Protocol) stream from each camera you’re using and an Azure subscription.

Setting up spatial analysis

Getting started with the spatial analysis container is easy enough. It’s intended for use with Azure IoT Edge, which manages container deployment, and requires a server with at least one Nvidia Tesla GPU. Microsoft recommends its own Azure Stack Edge hardware, which now offers a T4 GPU option. Using Azure Stack Edge reduces your capital expenditure, as the hardware and software are managed from Azure and billed through an Azure subscription. For test and development, a desktop is good enough; the recommended hardware is a fairly hefty workstation-class PC with 32GB of RAM and two Tesla T4 GPUs, each with 16GB of GPU RAM.

Any system running the container needs Docker with Nvidia GPU support, Nvidia’s CUDA Toolkit and Multi-Process Service, and the Azure IoT Edge runtime, all on Ubuntu 18.04. Your cameras need to deliver an H.264-encoded stream over RTSP at 15fps and 1080p.
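
Before pointing the container at a camera, it’s worth confirming the feed meets those requirements. Here’s a quick sanity check using OpenCV (my choice for illustration, not part of the container’s tooling); the camera URL is a placeholder:

import cv2  # pip install opencv-python

url = "rtsp://camera.local:554/stream1"  # hypothetical camera address
cap = cv2.VideoCapture(url)
if not cap.isOpened():
    raise RuntimeError("Could not open RTSP stream")

# The container expects 1080p at 15fps, H.264 encoded
width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = cap.get(cv2.CAP_PROP_FPS)
print(f"{width:.0f}x{height:.0f} at {fps:.0f}fps")  # expect 1920x1080 at 15
cap.release()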

The spatial analysis container is currently in preview, so you do need approval from the Cognitive Services team before you can access it in Microsoft’s container registry. The approval form will ask what type of application you’re building and how it’s intended to be used. Once you get approval, you can deploy the container, either directly to an Azure Stack Edge device via the Azure CLI or to your Linux PC after installing the Azure IoT Edge runtime. Once that’s in place, you can connect it to an Azure-hosted IoT Hub using the connection strings you generate from the Azure command line. This will install the container on your edge hardware.

It’s a lot easier to get things going on Azure Stack Edge hardware. Here the container runs in a managed Kubernetes cluster, set up as part of the Azure Stack Edge compute tools from the Azure Portal. This cluster hosts your IoT Edge compute, running the Azure IoT Edge runtime. Again the container is deployed using an Azure CLI command, installing it as a module that can be managed from your IoT Hub instance. You will be able to see the container’s status from the Azure portal.

Once deployed, you can start to use the spatial analysis container in your code, connecting to it using the Azure Event Hub SDK. Local code can work directly with the default endpoint; if you want to use the data elsewhere in your network, you will need to route messages to another endpoint or to Azure blob storage.
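
IoT Hub exposes a built-in Event Hubs-compatible endpoint, so reading the container’s output from local code can be as simple as the following sketch, using the azure-eventhub Python SDK. The connection string is a placeholder for the one shown in your IoT Hub’s built-in endpoints settings:

from azure.eventhub import EventHubConsumerClient  # pip install azure-eventhub

# Placeholder: the Event Hubs-compatible connection string from your IoT Hub
conn_str = "Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=...;EntityPath=..."

def on_event(partition_context, event):
    print(event.body_as_json())  # each spatial analysis output is a JSON document
    partition_context.update_checkpoint(event)

client = EventHubConsumerClient.from_connection_string(
    conn_str, consumer_group="$Default")
with client:
    client.receive(on_event=on_event, starting_position="-1")  # read from the start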

Using spatial analysis to detect people

Once the spatial analysis container is running, it’s time to start exploring its features. These take in an RTSP stream and deliver JSON documents. There are several different operations: counting the people in a zone, spotting when people cross a designated line or the edge of a polygon, and tracking violations of distance rules. All the operations can be carried out either on individual frames or on live video. Like most IoT Hub services, the spatial analysis container delivers its outputs as events.

You will need to configure each operation before using it, giving it a URL for the camera’s RTSP feed, an indication of whether you’re using live or recorded data, and a JSON definition of the line or area you want to analyze. More than one zone can be set up for each camera’s field of view. By setting thresholds, you can define the confidence level the model must reach before triggering an event. Once you have the count data, you can trigger an alert if, for example, too many people are in a space. This can help businesses manage space in accordance with regulations, changing thresholds as regulations change. Lines are treated the same way as zones, triggering counts when the line is crossed.
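
What does such a configuration look like? The sketch below shows the general shape of the parameters for a person-counting operation, written as a Python dictionary. The field names follow the preview documentation’s pattern but should be checked against the container’s docs, and the zone name and camera URL are hypothetical. Zone coordinates are normalized to the camera frame:

# Hedged sketch of operation parameters for a person-counting zone.
# Field names follow the preview docs' general shape; verify against
# the current documentation before use.
operation_parameters = {
    "VIDEO_URL": "rtsp://camera.local:554/stream1",  # placeholder camera feed
    "VIDEO_IS_LIVE": True,                           # live stream, not recorded
    "SPACEANALYTICS_CONFIG": {
        "zones": [{
            "name": "loading-bay",                   # hypothetical zone name
            # Normalized (x, y) corners of the area to watch
            "polygon": [[0.3, 0.3], [0.3, 0.9], [0.6, 0.9], [0.6, 0.3]],
            "events": [{
                "type": "count",
                "config": {"trigger": "event", "threshold": 16.0}
            }]
        }]
    }
}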

Microsoft demonstrated using Azure Cognitive Services to manage site safety at its Build event back in 2018. It’s taken a while for these tools to arrive in preview, but it’s easy to see how you can combine spatial analysis and other computer vision services to continuously monitor and protect workspaces. A robot plasma cutter in a dynamic safety zone can quickly be shut down if a worker enters its operating area. Or you might watch for workers on high catwalks after a frost, where a slip could be dangerous.

The result is a set of operations that are quick to configure and easy to manage, with Event Hubs and Event Grid routing events to the appropriate application. You can build applications that work with multiple cameras and mix local and cloud operations. You could mix object recognition with line crossing, for example, to detect whether someone smoking a cigarette is too close to a gas pump.

Using spatial analysis outputs

Outputs are JSON documents with both event and source information, ready for further processing and logging. It’s sensible to keep logs of detected incidents so you can refine the detection thresholds. If a model is too sensitive, you can raise the detection threshold to reduce false positives. Logging events can also help demonstrate compliance with safety rules or privacy requirements.
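
A handler for those documents can be small. This sketch logs each event and raises an alert over a hypothetical occupancy limit; the field names follow the general shape of the preview output (events plus source information) and should be verified against the documentation:

import json
import logging

logging.basicConfig(filename="incidents.log", level=logging.INFO)

OCCUPANCY_LIMIT = 10  # hypothetical limit for this space

def handle_output(raw: str) -> None:
    doc = json.loads(raw)
    source = doc.get("sourceInfo", {})  # camera id and timestamp
    for event in doc.get("events", []):
        count = int(event.get("properties", {}).get("personCount", 0))
        logging.info("camera=%s event=%s count=%d",
                     source.get("id"), event.get("type"), count)
        if count > OCCUPANCY_LIMIT:
            print(f"Alert: {count} people seen by camera {source.get('id')}")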

By only providing count or event data, Microsoft falls clearly on the side of privacy with its spatial analysis tools. There’s no identification; just an alert that too many people are in a space or someone has crossed a line. It’s a sensible compromise that lets you build essential safety tools while ensuring that no personal information moves from your network to Microsoft’s.

Taking an event-driven approach simplifies building applications. You can use Event Grid to route messages where necessary, drive serverless Functions, or trigger low- and no-code process automations. Using Event Grid as a publish and subscribe backbone increases the number of applications that can work with spatial analysis events, distributing them as needed without having to build your own messaging architecture.
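
A sketch of that backbone: republishing a spatial analysis event to an Event Grid custom topic with the azure-eventgrid SDK. The topic endpoint, key, subject, and event type here are all placeholders you’d define for your own deployment:

from azure.core.credentials import AzureKeyCredential
from azure.eventgrid import EventGridPublisherClient, EventGridEvent

# Placeholders: your Event Grid custom topic's endpoint and access key
client = EventGridPublisherClient(
    "https://YOUR-TOPIC.westus2-1.eventgrid.azure.net/api/events",
    AzureKeyCredential("YOUR-TOPIC-KEY"))

event = EventGridEvent(
    subject="cameras/loading-bay",           # hypothetical routing subject
    event_type="SpatialAnalysis.ZoneCount",  # hypothetical event type
    data={"camera": "loading-bay", "personCount": 12},
    data_version="1.0")
client.send(event)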
