Repatriation is one route to cost savings. Switching development patterns from long-running services to WebAssembly-powered serverless functions is another.

Buzz is building around the idea that it's time to claw back our cloud services and once more rebuild the company data center. Repatriation. It's the act of moving work out of the cloud and back to on-premises or self-managed hardware. The primary justification for this movement is straightforward, especially in a time of economic downturn: Save money by not using AWS, Azure, or the other cloud hosting services. Save money by building and managing your own infrastructure.

Since an Andreessen Horowitz post catapulted this idea into the spotlight a couple of years ago, it seems to be gaining momentum. 37signals, makers of Basecamp and Hey (a for-pay webmail service), blog regularly about how they repatriated. And a recent report suggested that, among those talking about a move back to self-hosting, the primary motivation was financial: 45% said it's because of cost.

It should be no surprise that repatriation has gained this hype. Cloud, which grew to maturity during an economic boom, is for the first time under downward pressure as companies seek to reduce spending. Amazon, Google, Microsoft, and other cloud providers have feasted on their customers' willingness to spend. But that willingness has now been tempered by budget cuts. What is the most obvious response to the feeling that cloud has grown too expensive? The clarion call of repatriation: Move it all back in-house!

Kubernetes is expensive in practice

Cloud has turned out to be expensive, and the culprit may be the technologies we've built in order to better use the cloud. While we could look at myriad add-on services, the problem arises at the most basic level, so we will focus just on cloud compute.

The original value proposition of hosted virtual machines (the OG cloud compute) was that you could take your entire operating system, package it up, and ship it somewhere else to run. But the operational part of this setup—the thing we asked our devops and platform engineering teams to deal with—was anything but pleasant. Maintenance was a beast. Management tools were primitive. Developers did not participate. Deployments were slower than molasses.

Along came Docker containers. When it came to packaging and deploying individual services, containers gave us a better story than VMs. Developers could easily build them. Startup times were measured in seconds, not minutes. And thanks to a little project out of Google called Kubernetes, it was possible to orchestrate container application management.

But what we weren't noticing while we were building this brave new world was the cost we were incurring. More specifically, in the name of stability, we downplayed cost. The preferred way to deploy an application in Kubernetes runs at least three replicas of every service, even when inbound load does not justify it. 24×7, every service in your cluster runs in triplicate (at least), consuming power and resources. On top of this, we layered a chunky stew of sidecars and auxiliary services, all of which ate up more resources. Suddenly that three-node "starter" Kubernetes cluster was seven nodes. Then a dozen. According to a recent CNCF survey, a full 50% of Kubernetes clusters have more than 50 nodes. Cluster cost skyrocketed.
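To make the replica math concrete, here is a hypothetical Deployment manifest of the kind described above. The name, image, and resource figures are placeholders, not taken from any real system, but the shape is the familiar stability-first default.

```yaml
# Hypothetical example: a typical "highly available" Deployment.
# Three replicas each reserve CPU and memory around the clock,
# whether or not any traffic arrives.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service            # placeholder name
spec:
  replicas: 3                      # the stability-first default discussed above
  selector:
    matchLabels:
      app: example-service
  template:
    metadata:
      labels:
        app: example-service
    spec:
      containers:
        - name: example-service
          image: registry.example.com/example-service:1.0.0   # placeholder image
          resources:
            requests:
              cpu: "500m"          # reserved per replica, 24x7
              memory: "512Mi"
```

With requests like these, this one service holds 1.5 CPU cores and 1.5GiB of memory in reserve before a single request arrives. Multiply that by dozens of services, plus their sidecars, and the node count climbs quickly.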
And then we all found ourselves in the ignoble position of installing "cost control" tooling to tell us where we could squeeze our Kubernetes cluster into our new skinny-jeans budget. Discussing this with a friend recently, he admitted that his company's Kubernetes cluster is now tuned on a big gamble: Rather than provision for the resources they need, they under-provisioned in the hope that not too many things would fail at once. They downsized their cluster to the point where, if enough services failed at the same time, their combined startup requirements would exhaust the resources of the entire cluster before everything could be restarted. As a broader pattern, we are now trading stability and peace of mind for a small percentage cut in our cost. It's no wonder repatriation is raising eyebrows.

Can we solve the problem in-cloud?

But what if we asked a different question? What if we asked whether there is an architectural change we could make that would vastly reduce the resources we consume? What if we could shrink that Kubernetes cluster back down to single-digit size, not by tightening our belts and hoping nothing busts, but by building services in a way that is more cost-sustainable? The technology and the programming pattern are both here already. Here's the spoiler: The solution is serverless WebAssembly. Let's take those terms one at a time.

Serverless functions are a development pattern that has gained huge momentum. AWS, the largest cloud provider, says it runs 10 trillion serverless functions per month. That scale is mind-boggling, but it is a promising indicator that developers appreciate the modality and are building popular things with it.

The best way to think about a serverless function is as an event handler. A particular event (an HTTP request, an item landing on a queue, etc.) triggers a particular function. That function starts, handles the request, and then immediately shuts down. A function may run for milliseconds, seconds, or perhaps minutes, but no longer.

So what is the "server" we are doing without in serverless? We're not making the wild claim that we're somehow beyond server hardware. Instead, "serverless" is a statement about the software design pattern. There is no long-running server process. Rather, we write just a function—just an event handler. And we leave it to the event system to trigger the invocation of that handler.

What falls out of this definition is that serverless functions are much easier to write than services, even traditional microservices. Serverless functions simply require less code, which means less development, debugging, and patching. This in and of itself can lead to some big results. As David Anderson reported in his book The Value Flywheel Effect: "A single web application at Liberty Mutual was rewritten as serverless and resulted in reduced maintenance costs of 99.89%, from $50,000 a year to $10 a year." (Anderson, David. The Value Flywheel Effect, p. 27.)

Another natural result of going serverless is that we are then running smaller pieces of code for shorter periods of time. If the cost of cloud compute is determined by the combination of how many system resources (CPU, memory) we need and how long we need to access those resources, then it should be immediately clear that serverless should be cheaper. After all, it uses less and runs for milliseconds, seconds, or minutes instead of days, weeks, or months.
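To make the function-as-event-handler pattern concrete, here is a minimal sketch written against the Rust SDK of Spin, the open-source framework discussed at the end of this article. It follows the general shape of the stock Spin HTTP template (which also pulls in the anyhow crate); the exact types and macro surface vary between SDK versions, so treat this as a sketch rather than canonical API usage.

```rust
// A minimal serverless WebAssembly function, sketched with the Spin
// Rust SDK (spin-sdk). The handler is the whole program: there is no
// long-running server process wrapped around it.
use spin_sdk::http::{IntoResponse, Request, Response};
use spin_sdk::http_component;

// The http_component macro registers this function as the event
// handler. The runtime instantiates the Wasm module when a request
// arrives, invokes the function, and tears everything down afterward.
#[http_component]
fn handle_request(_req: Request) -> anyhow::Result<impl IntoResponse> {
    Ok(Response::builder()
        .status(200)
        .header("content-type", "text/plain")
        .body("Hello from a serverless WebAssembly function!")
        .build())
}
```

Nothing here listens on a socket or loops waiting for work; the event system owns all of that, and the function consumes resources only while it is actually handling a request.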
The older, first-generation serverless architectures did cut costs somewhat, but because bulky runtimes sat under the hood, the cost of a serverless function grew substantially as it handled more and more requests. This is where WebAssembly comes in.

WebAssembly as a better serverless runtime

As a highly secure, isolated runtime, WebAssembly is a great virtualization strategy for serverless functions. A WebAssembly function takes under a millisecond to cold start and requires scant CPU and memory to execute. In other words, WebAssembly functions cut down on both time and system resources, which means they save money.

How much do they cut down? Cost will vary depending on the cloud and the number of requests, but we can compare a Kubernetes cluster using only containers with one that uses WebAssembly. A Kubernetes node can execute a theoretical maximum of just over 250 pods. Most of the time, though, a moderately sized virtual machine hits memory and processing limits at just a few dozen containers, because containers consume resources for the entire duration of their activity.

At Fermyon we've routinely been able to run thousands of serverless WebAssembly apps on even modestly sized nodes in a Kubernetes cluster. We recently load tested 5,000 serverless apps on a two-worker-node cluster, achieving in excess of 1.5 million function invocations in 10 seconds. Fermyon Cloud, our public production system, runs 3,000 user-built applications on each 8-core, 32GiB virtual machine, and that system has been in production for over 18 months. In short, we've achieved efficiency via density and speed, and efficiency directly translates to cost savings.

Safer than repatriation

Repatriation is one route to cost savings. Another is switching development patterns from long-running services to WebAssembly-powered serverless functions. While the two are not, in the final analysis, mutually exclusive, one of them is high-risk. Moving from cloud to on-prem is a path that will change your business, and possibly not for the better.

Repatriation is predicated on the idea that the best thing we can do to control cloud spend is to move all of that stuff—all of those Kubernetes clusters and proxies and load balancers—back into a physical space that we control. It goes without saying that this involves buying space, hardware, bandwidth, and so on. And it involves transforming the operations team from a software and services mentality to a hardware management mentality. As someone who remembers (not fondly) racking servers, troubleshooting broken hardware, and watching midnight come and go as I did so, the thought of repatriating inspires anything but uplifting anticipation. Transitioning back to on-premises is a heavy lift, and one that is hard to rescind should things go badly. And savings won't be seen until after the transition is complete; in fact, with the capital expenses involved in the move, it may be a long time until savings are realized.

Switching to WebAssembly-powered serverless functions, in contrast, is less expensive and less risky. Because such functions can run inside of Kubernetes, the savings thesis can be tested merely by carving off a few representative services, rewriting them, and analyzing the results. Those already invested in a microservices-style architecture are well set up to rebuild just fragments of a multi-service application, as the sketch below illustrates.
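For teams already on Kubernetes, SpinKube (covered in the conclusion below) makes that experiment concrete: each carved-off service becomes a small custom resource deployed alongside the existing containers. The following sketch is illustrative only; the name and image are placeholders, and the fields follow SpinKube's documented v1alpha1 schema, which should be checked against your installed version.

```yaml
# Hypothetical example: one rewritten service, redeployed as a
# serverless WebAssembly app via SpinKube's SpinApp custom resource.
apiVersion: core.spinkube.dev/v1alpha1
kind: SpinApp
metadata:
  name: checkout-service                 # placeholder name
spec:
  image: "registry.example.com/checkout-service:0.1.0"  # placeholder Wasm image
  replicas: 1                            # not pinned at three always-on copies
  executor: containerd-shim-spin         # runs on the Wasm shim, not in a container
```

Because the rest of the cluster is untouched, the experiment runs side by side with existing workloads and can be rolled back by deleting a single resource.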
Similarly, those invested in event processing chains like data transformation pipelines will find it easy to identify a step or two in a sequence that can become the testbed for experimentation. Not only is this a lower barrier to entry, but whether or not the experiment succeeds, the option to repatriate remains open without a second about-face: WebAssembly serverless functions work just as well on-prem (or at the edge, or elsewhere) as they do in the cloud.

Cost savings at what cost?

It is high time that we learn to control our cloud expenses. But there are far less drastic (and risky) ways of doing this than repatriation. It would be prudent to explore the cheaper and easier solutions first before jumping on the bandwagon… and then loading it full of racks of servers. And the good news is that if I am wrong, it will be easy to repatriate those open-source serverless WebAssembly functions. After all, they are lighter, faster, cheaper, and more efficient than yesterday's status quo.

One easy way to get started with cloud-native WebAssembly is to try out the open-source Spin framework. And if Kubernetes is your target deployment environment, an in-cluster serverless WebAssembly environment can be installed and managed by the open-source SpinKube. In just a few minutes, you can get a taste for whether the solution to your cost-control needs may not be repatriation, but building more efficient applications that make better use of your cloud resources.
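For a quick taste of that workflow, here is a sketch of the Spin CLI's stock quick start. The template name and defaults reflect the documented tooling at the time of writing; flags and ports may differ across Spin releases.

```sh
# Hypothetical quick start (CLI flags may vary by Spin version)
spin new -t http-rust hello-wasm   # scaffold a Rust HTTP function
cd hello-wasm
spin build                         # compile the function to WebAssembly
spin up                            # serve it locally (by default on port 3000)
```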
Matt Butcher is co-founder and CEO of Fermyon, the serverless WebAssembly in the cloud company. He is one of the original creators of Helm, Brigade, CNAB, OAM, Glide, and Krustlet. He has written and co-written many books, including "Learning Helm" and "Go in Practice." He is a co-creator of the "Illustrated Children's Guide to Kubernetes" series. These days, he works mostly on WebAssembly projects such as Spin, Fermyon Cloud, and Bartholomew. He holds a Ph.D. in philosophy. He lives in Colorado, where he drinks lots of coffee.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.