Paul Krill
Editor at Large

Microsoft unveils imaging APIs for Windows Copilot Runtime

19 Nov 2024 | 2 mins
APIs, Artificial Intelligence, Development Libraries and Frameworks

Generative AI-backed APIs will allow developers to build image super resolution, image segmentation, object erase, and OCR capabilities into Windows applications.

Microsoft’s Windows Copilot Runtime, which allows developers to integrate AI capabilities into Windows, is being fitted with AI-backed APIs for image processing. It will also gain access to Phi 3.5 Silica, a custom-built generative AI model for Copilot+ PCs.

Announced at this week’s Microsoft Ignite conference, the new Windows Copilot Runtime imaging APIs will be powered by on-device models that enable developers and ISVs to integrate AI within Windows applications securely and quickly, Microsoft said. Most of the APIs will be available in January through the Windows App SDK 1.7 Experimental 2 release.

Developers will be able to bring AI capabilities into Windows apps via these APIs:

  • Image description, providing a text description of an image.
  • Image super resolution, increasing the fidelity of an image while upscaling its resolution.
  • Image segmentation, enabling the separation of foreground and background of an image, along with removing specific objects or regions within an image. Image editing or video editing apps will be able to incorporate background removal using this API, which is powered by the Segment Anything Model (SAM).
  • Object erase, enabling erasing of unwanted objects from an image and blending the erased area with the remainder of the background.
  • Optical character recognition (OCR), recognizing and extracting text present within an image.
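
To illustrate the background-removal use case mentioned under image segmentation, here is a minimal sketch in plain Python. It does not use the Windows Copilot Runtime APIs, whose signatures are not covered in this article. The function name `remove_background`, the RGBA tuple layout, and the boolean mask format are all assumptions chosen for illustration. The sketch shows only the final step: once a segmentation model has produced a foreground mask, an app can make every background pixel transparent.

```python
from typing import List, Tuple

# Assumed pixel layout for this sketch: (red, green, blue, alpha), each 0-255.
Pixel = Tuple[int, int, int, int]


def remove_background(
    image: List[List[Pixel]], mask: List[List[bool]]
) -> List[List[Pixel]]:
    """Return a copy of `image` with non-foreground pixels made transparent.

    `mask` is a hypothetical segmentation result: True marks a foreground
    pixel to keep, False marks a background pixel to hide.
    """
    same_height = len(image) == len(mask)
    same_widths = all(len(row) == len(m_row) for row, m_row in zip(image, mask))
    if not (same_height and same_widths):
        raise ValueError("image and mask must have the same dimensions")

    return [
        # Keep foreground pixels unchanged; set alpha to 0 for background.
        [px if keep else (px[0], px[1], px[2], 0) for px, keep in zip(row, m_row)]
        for row, m_row in zip(image, mask)
    ]


# A tiny 2x2 image and a mask that keeps only the diagonal pixels.
image = [
    [(255, 0, 0, 255), (0, 255, 0, 255)],
    [(0, 0, 255, 255), (255, 255, 255, 255)],
]
mask = [
    [True, False],
    [False, True],
]
result = remove_background(image, mask)
```

In a real application the mask would come from the segmentation model rather than being written by hand, and the pixel data would typically live in an image buffer rather than nested lists.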

Phi 3.5 Silica, built from the Phi series of models, will be included in the Windows Copilot Runtime out of the box. It will be custom-built for the Snapdragon X series neural processing unit (NPU) in Copilot+ PCs, enabling text intelligence capabilities such as text summarization, text completion, and text prediction, Microsoft said.
