Matt Asay
Contributor

AWS is changing

analysis
Dec 05, 2022 | 5 mins
Amazon Web Services | Cloud Computing | Open Source

Announcements at AWS re:Invent show a kinder, gentler Amazon, ready to integrate its own services and third-party data sources.


After what struck me as a relatively dry spell of product announcements in 2021, AWS spent re:Invent 2022 launching a host of new services. AWS Chief Evangelist Jeff Barr, with help from some AWS developer advocates, summarized the most impactful announcements because “there’s simply too much great stuff for the team to cover,” but then they proceeded to spend more than 2,700 words highlighting their favorite announcements, which seemed to include… everything. Basically, they handed out participation trophies to every AWS service team. Not particularly helpful.

They could have highlighted automated data preparation for Amazon QuickSight Q, given how difficult data preparation can be for machine learning. Or what about Amazon Security Lake, which automatically centralizes a company’s security data from cloud and on-premises sources into a data lake? Very cool. Or Amazon CodeCatalyst, which RedMonk analyst James Governor rightly characterizes as “a packaging exercise” designed to improve software development and delivery and lead to greater convenience (“the killer app”)? Also very cool.

If we look beyond the gazillion new services and updates to existing services that AWS announced, an emerging theme portends a dramatically different (and better) AWS. Yes, I’m talking about integration as an essential product feature.

Less assembly required

AWS used to tout its 200+ services. Not anymore. In fact, there are probably closer to 400 AWS services now, but at some point in the past two years, AWS realized that having so many services complicated customers’ IT decisions rather than simplifying them.

For those unfamiliar with how AWS operates, each service (product) team runs autonomously. There is some top-down direction, but as a general rule, individual service teams build what they feel customers most want, even if that leads to inter-team competition. This is both a feature (autonomous teams can build faster) and a bug (autonomous teams don’t necessarily coordinate to make it easy to use multiple AWS services harmoniously). Customers are often left to cobble together disparate services without tight integration in the way Microsoft might provide, for example.

All this makes the introduction of Amazon Aurora zero-ETL integration with Amazon Redshift such a jaw-dropper.

Let’s be clear: In essence, AWS announced that two of its services now work well together. It’s more than that, of course. Removing the cost and complexity of ETL is a great way to remove the need to build data pipelines. At heart, this is about making two AWS services work exceptionally well together. For another company, this might be considered table stakes, but for AWS, it’s relatively new and incredibly welcome.
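To make the “removing data pipelines” point concrete, here is a toy sketch of the extract-transform-load glue code that a zero-ETL integration renders unnecessary. Everything here is hypothetical and in-memory for illustration; in practice this work is done with tools like AWS Glue or custom pipeline code, not these made-up functions.

```python
# Toy illustration of the manual ETL work that Aurora-to-Redshift zero-ETL
# eliminates. All names and data are hypothetical, not AWS APIs.

def extract(order_rows):
    """Pull the relevant raw rows from the transactional (OLTP) side."""
    return [row for row in order_rows if row.get("status") == "complete"]

def transform(rows):
    """Reshape rows into the schema the analytics warehouse expects."""
    return [
        {"order_id": r["id"], "revenue_cents": round(r["total"] * 100)}
        for r in rows
    ]

def load(warehouse, rows):
    """Append transformed rows to a stand-in, in-memory 'warehouse'."""
    warehouse.extend(rows)
    return len(rows)

# One run of the pipeline a zero-ETL integration makes redundant:
orders = [
    {"id": 1, "status": "complete", "total": 19.99},
    {"id": 2, "status": "pending", "total": 5.00},
]
warehouse = []
loaded = load(warehouse, transform(extract(orders)))
print(loaded, warehouse)
```

Every line of that glue is code someone has to write, schedule, monitor, and debug; that operational burden, not the transformation logic itself, is what the integration removes.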

It’s also a sign of where AWS may be headed: tighter integration between its own services so that customers needn’t take on the undifferentiated heavy lifting of AWS service integration.

Making room for third parties

That zero-ETL announcement, as potent as it was, would have been even better had AWS also highlighted seamless integrations with third-party services such as Databricks or DataStax. AWS may not like to use the “P” word (“platform”), but that doesn’t change reality. AWS is the world’s largest cloud platform, and AWS customers rightly expect to be able to integrate their preferred software with AWS.

This is what makes Amazon DataZone so interesting.

Amazon DataZone is a “data management service that helps you catalog, discover, analyze, share, and govern data across the organization,” writes Swami Sivasubramanian, AWS vice president of data and machine learning. This would be cool if all it did was pull together all the data stored in repositories from various AWS services, which it does with integrations to AWS services like Redshift, Athena, QuickSight, and more. DataZone goes beyond this by offering APIs to integrate with third-party data sources from partners or others.

On the one hand, it’s obvious that AWS would have to provide such APIs, because of course, not all (or even most) customer data sits in AWS. In the FAQ accompanying the announcement, AWS even mentioned that DataZone can track data in rival cloud providers like Google Cloud and Microsoft Azure—multicloud, anyone? But it’s also not obvious. After all, the tech industry has spent decades watching Apple, Microsoft, and others ignore competitive products outside their own walled gardens. By emphasizing the need to access non-AWS data sources, DataZone may well be a leading indicator of AWS going beyond grudging acceptance of third-party data sources or services to emphatic embrace.

Opening up

Then there was the announcement that wasn’t an announcement at all. AWS announced Trusted Language Extensions for PostgreSQL (pg_tle) on Amazon Aurora and Amazon RDS. pg_tle is an open source development kit for building PostgreSQL extensions. It “provides database administrators control over who can install extensions and a permissions model for running them, letting application developers deliver new functionality as soon as they determine an extension meets their needs.”
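For a sense of what that looks like in practice, here is a sketch of the SQL a developer might run (via psql or any Postgres client) to register a minimal extension with pg_tle. The `pgtle.install_extension` function is the kit’s documented entry point; the extension name and body below are hypothetical. The Python here just holds the statements as strings, since actually running them requires a database with pg_tle enabled.

```python
# Hypothetical pg_tle usage, expressed as SQL strings for illustration.
# 'hello_tle' and its function are made-up; pgtle.install_extension is real.

INSTALL_SQL = """
SELECT pgtle.install_extension(
    'hello_tle',                      -- hypothetical extension name
    '1.0',                            -- extension version
    'Minimal example extension',      -- description
    $_body_$
    CREATE FUNCTION hello() RETURNS text AS
    $$ SELECT 'hello from a trusted language extension' $$ LANGUAGE sql;
    $_body_$
);
"""

# Once registered, the extension is enabled per database like any other:
ENABLE_SQL = "CREATE EXTENSION hello_tle;"

print(INSTALL_SQL.strip().splitlines()[0])
```

The point of the permissions model is visible even in this toy: the DBA decides who may call `pgtle.install_extension`, while application developers ship extension code without waiting for AWS to bless each one.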

Nice, right?

What wasn’t announced and never will be is the fact that AWS is arguably the second-largest employer of PostgreSQL contributors, just behind Crunchy Data. I’ve suggested before that AWS has increasingly seen the need to contribute to the open source projects upon which its managed services (and its customers) depend. AWS employee contributions to PostgreSQL are a strong example of this.

All of this suggests that AWS is becoming less insular every day. The company has always viewed “customer obsession” as its most important success metric, and sometimes service teams felt the right way to achieve that was to build the best possible service in isolation from the customer’s existing IT investments, including other AWS services. It also led some teams to limit their involvement in upstream open source projects and try to deliver a self-contained version of that project so as to better control the customer experience.

As these and other re:Invent announcements suggest, AWS increasingly builds community—whether partners, open source projects, or even other competitive products—into its services. That’s great for customers and great for AWS.


Matt Asay runs developer relations at MongoDB. Previously, Asay was a Principal at Amazon Web Services and Head of Developer Ecosystem for Adobe. Prior to Adobe, Asay held a range of roles at open source companies: VP of business development, marketing, and community at MongoDB; VP of business development at real-time analytics company Nodeable (acquired by Appcelerator); VP of business development and interim CEO at mobile HTML5 start-up Strobe (acquired by Facebook); COO at Canonical, the Ubuntu Linux company; and head of the Americas at Alfresco, a content management startup. Asay is an emeritus board member of the Open Source Initiative (OSI) and holds a J.D. from Stanford, where he focused on open source and other IP licensing issues.
