Microsoft is adding an automation tool to Windows, ready to manage our day-to-day tasks.

Credit: KrulUA / Getty Images

A lot of what we do with computers is repetitive. We put files in folders, we send form replies to emails, we make commits to Git when we save files, we trigger tests when we build an application. Events trigger events, on down a predictable chain. It’s often the scaffolding around the work we’re actually doing, making reports and updating colleagues on progress.

Much of this is the type of thing we might automate if we were passing information between enterprise applications and cloud services, using tools like Azure Logic Apps and Power Automate to manage information flows. These RPA (robotic process automation) tools are increasingly important, building on low- and no-code platforms to create event-driven applications that take over and run our workflows.

But it’s not something we usually do on our personal machines, even when those tasks break our flow and reduce productivity. In many cases, there’s a disconnect between what happens on our desktop or laptop PCs and what happens on servers. Actions that could easily be automated are ignored, as the tools we need to automate them aren’t part of the standard PC operating environment: what Microsoft increasingly calls “inbox applications,” the software installed alongside Windows that we know will be on most PCs.

Introducing Power Automate Desktop

At its recent March 2021 Ignite event, Microsoft made an interesting announcement, bringing the worlds of process automation and inbox apps together by making its Power Automate Desktop tool part of the standard Windows install in future Windows releases (and quickly putting it in its Windows Insider dev channel releases). It’s intended to replace Microsoft’s existing Windows Recorder automation tool and the open source Selenium desktop UI testing tools. You may know Power Automate Desktop by its pre-acquisition name, WinAutomation.
Originally developed by Softomotive, it’s quickly been rebranded and is now part of Microsoft’s business-focused Power Platform. As a result, it’s a tool with two converging personalities: It provides a desktop endpoint for cloud-hosted Power Apps as well as an environment for automating your own on-PC operations, bringing the two together.

What you get from Power Automate Desktop depends on the account you sign in with, as the account enables different levels of integration with both the Microsoft Graph and the Dataverse common data model that underlies much of Power Apps. To get the most out of Power Automate Desktop, you need a premium subscription, tied to an organization’s Microsoft 365 or Power Apps identity. You can get a free trial to try out the cloud integrations.

With a Microsoft account, you can use it as a tool for working with local applications, automating them and using the tool’s built-in actions to integrate with common business services and Office applications. Work or school accounts get more access to Microsoft Graph features, and with a premium account (starting at $15/month/user), you can link your local process flows to the cloud Power Automate service and use them with its tools and features, including machine learning integrations.

Getting started

Once you’ve installed Power Automate Desktop, you’re prompted to install its browser extension in Edge. It’s worth doing this. Once installed in Edge, the extension can record and play back browser interactions, giving you the option of using it to automate web application testing. Versions of the extension are also available for Chrome and Firefox.

If you need integration with the rest of Microsoft’s Power Platform, you’ll need to install a data gateway on every machine that will run flows linked to the cloud, allowing you to trigger desktop flows from remote devices.
Remote access isn’t necessary for many scenarios, but if you’re planning on linking desktop flows to, say, a webhook from GitHub Actions, you’ll need a premium subscription and the gateway installed. Once it’s installed, use the web-based Power Automate service to connect to your gateway and configure access. You can then bridge between the two services.

If you’ve ever built a Power Automate flow, you should find working with the Desktop tool very similar. It gives you a design canvas for the various steps in a flow, as well as tools for capturing interactions with applications.

Automating by capturing UI

Automating application UIs is one of Power Automate Desktop’s most useful features. You can work directly with known application elements, or record interactions and then customize key elements. Desktop applications and tools don’t have the same API-based development model as modern distributed applications, so any automation needs to fill forms and press buttons for you.

Start by recording an application and using that as the framework for an automation. Once you have captured the interactions you want to use from an application, you can start to replace content with variables; for example, replacing dummy text from a capture with a text variable. If you’re planning to chain applications, you can use another capture to send content to that variable, using the Power Automate Desktop editing environment to put the captures in the correct order.

Automating with actions

You’re not limited to using the built-in capture tools. Power Automate Desktop comes with its own library of actions that can be used to build applications. Some let you add more complexity to your flow, adding conditionals and loops, as well as flow control rules that let you switch between subflows.
The flow control tools are especially useful if you need to add error handling to a workflow, trapping errors and passing them to subflows that can be used to write error logs or display warning dialogs.

Other actions provide direct access to common Windows functions so you don’t need to create a capture to open a file or work with a tool such as Excel. The built-in actions don’t cover all the functions of the supported applications, but they do give you enough coverage for most common tasks. More complex tasks can be handled using the UI automation tools, but in practice they’re not the regular tasks that desktop automation allows you to forget about.

Some of the supported functions are surprising; for example, you can build calls to Azure Cognitive Services into a desktop flow. This way you can take a screenshot, use the clipboard tools in Power Automate Desktop to pass it to an OCR tool, and then save the resulting text in a file. Here you’re chaining together several actions, running the flow after you’ve captured an image in the clipboard.

Other tools plug into terminal sessions, automating mainframe applications. More usefully, there’s the option to automate the Windows command line, so you can wrap multiple scripts into a single action, waiting for specific outputs before moving to the next and writing logs to an open file.

The result is a surprisingly powerful set of tools that goes a lot further than traditional scripting tools. Being able to drive user interfaces directly adds flexibility, while direct support for familiar tools makes it easy to try out ideas and then convert them into automated tasks.

The future of how we work

It’s not perfect. Some applications are hard to automate, and others, like Windows Subsystem for Linux (WSL), aren’t yet supported.
However, putting the tool in Windows shows commitment on Microsoft’s part, and it’ll be interesting to watch how it evolves and how it supports not only knowledge and task workers, but also developer and operations workflows. Although it is possible to share Power Automate Desktop flows with other users if you have a premium account, for now you’re really using it to build personal tools.

We all have our ways of working, built around our personal workflows. At the heart of what we do as developers are our toolchains, the applications we use to build, manage, and deploy code. Power Automate Desktop offers a way to turn those applications into a true chain where the output of one application drives the input of another. It brings the cloud’s distributed, event-driven programming model to the desktop, using the Windows UI as a universal API.

With Power Automate Desktop set to be part of Windows, it’s worth downloading now, automating away small and repetitive tasks that get in the way of creative work.
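
For readers who think in scripts, the command-line pattern described under “Automating with actions” (run a step, wait for a specific output, log it, and hand failures to an error handler) can be sketched in Python. This is not Power Automate Desktop’s own action syntax, only a rough illustration of the pattern. The names Step, run_chain, and report are hypothetical, and the demo commands simply call the Python interpreter so the sketch runs without any other tools installed.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One command-line step in the chain (hypothetical structure)."""
    name: str
    command: List[str]  # argv list, run without a shell
    expect: str         # text that must appear in stdout before moving on


def run_chain(steps: List[Step], log_path: str,
              on_error: Callable[[Step, subprocess.CompletedProcess], None]) -> bool:
    """Run steps in order, logging each one; stop at the first failure."""
    with open(log_path, "a", encoding="utf-8") as log:
        for step in steps:
            result = subprocess.run(step.command, capture_output=True, text=True)
            log.write(f"[{step.name}] exit={result.returncode}\n{result.stdout}")
            # A step succeeds only if it exits cleanly AND prints the expected text.
            if result.returncode != 0 or step.expect not in result.stdout:
                on_error(step, result)  # plays the role of an error-handling subflow
                return False
    return True


def report(step: Step, result: subprocess.CompletedProcess) -> None:
    """Stand-in for a subflow that shows a warning or writes an error log."""
    print(f"Step '{step.name}' failed with exit code {result.returncode}")


if __name__ == "__main__":
    # Placeholder commands: each just asks the Python interpreter to print a line.
    steps = [
        Step("build", [sys.executable, "-c", "print('build ok')"], "build ok"),
        Step("test", [sys.executable, "-c", "print('tests passed')"], "tests passed"),
    ]
    log_file = os.path.join(tempfile.gettempdir(), "chain_demo.log")
    print("chain completed:", run_chain(steps, log_file, report))
```

Passing the error handler in as a function mirrors the way a flow routes failures to a separate subflow: the chain itself stays a simple list of steps, and what happens on failure can be swapped without touching it.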