Generative AI was the watchword at re:Invent 2023, as AWS rolled out new chips, foundation models, updates to its generative AI-based application building service Amazon Bedrock, a new generative AI assistant dubbed Amazon Q, support for vector databases, and zero-ETL integrations.

At the AWS re:Invent conference last week, the spotlight was on artificial intelligence, with the new generative AI assistant, Amazon Q, debuting as the star of the show. But there was plenty of other news to spark the interest of database managers, data scientists, data engineers, and developers, including new extract, transform, load (ETL) services, a new Cost Optimization Hub, and a revamped enterprise pricing tier for AWS’ cloud-based development tool, Amazon CodeCatalyst.

Here are seven key takeaways from the conference:

Beefed up infrastructure for generative AI

The cloud services provider, which has been adding infrastructure capabilities and chips over the past year to support high-performance computing with enhanced energy efficiency, announced the latest iterations of its Graviton and Trainium chips. The Graviton4 processor, according to AWS, provides up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than the current-generation Graviton3 processors. Trainium2, on the other hand, is designed to deliver up to four times faster training than first-generation Trainium chips.

At re:Invent, AWS also extended its partnership with Nvidia, including support for the DGX Cloud, a new GPU project named Ceiba, and new instances for supporting generative AI workloads. Nvidia also shared plans to integrate its NeMo Retriever microservice into AWS to help users develop generative AI tools such as chatbots. NeMo Retriever is a generative AI microservice that enables enterprises to connect custom large language models (LLMs) to enterprise data, so that they can generate accurate AI responses based on their own data.
Further, AWS said that it will be the first cloud provider to bring Nvidia’s GH200 Grace Hopper Superchips to the cloud.

New foundation models for Amazon Bedrock

Updated models added to Bedrock include Anthropic’s Claude 2.1 and Meta Llama 2 70B, both of which have been made generally available. Amazon has also added its proprietary Titan Text Lite and Titan Text Express foundation models to Bedrock. In addition, the cloud services provider has added a model in preview, Amazon Titan Image Generator, to the AI app-building service.

AWS has also released a new feature within Bedrock that allows enterprises to evaluate, compare, and select the best foundation model for their use case and business needs. Dubbed Model Evaluation on Amazon Bedrock and currently in preview, the feature is aimed at simplifying tasks such as identifying benchmarks, setting up evaluation tools, and running assessments, the company said, adding that this saves time and cost.

Updates to Amazon SageMaker for supporting generative AI

To help enterprises train and deploy large language models efficiently, AWS introduced two new offerings, SageMaker HyperPod and SageMaker Inference, within its Amazon SageMaker AI and machine learning service. In contrast to the manual model training process, which is prone to delays, unnecessary expenditure, and other complications, HyperPod removes the heavy lifting involved in building and optimizing machine learning infrastructure for training models, reducing training time by up to 40%, the company said.

SageMaker Inference, on the other hand, is targeted at helping enterprises reduce model deployment costs and decrease latency in model responses. To do so, Inference allows enterprises to deploy multiple models to the same cloud instance to better utilize the underlying accelerators.

AWS has also updated its low-code machine learning platform targeted at business analysts, SageMaker Canvas.
Analysts can use natural language to prepare data inside Canvas in order to generate machine learning models, said Swami Sivasubramanian, head of database, analytics and machine learning services for AWS. The no-code platform supports LLMs from Anthropic, Cohere, and AI21 Labs. SageMaker also features the Model Evaluation capability, now called SageMaker Clarify, which can be accessed from within SageMaker Studio.

Amazon Q: the generative AI assistant for everything

Last Tuesday, AWS CEO Adam Selipsky premiered the star of the cloud giant’s re:Invent 2023 conference: Amazon Q, the company’s answer to Microsoft’s GPT-driven Copilot generative AI assistant. Amazon Q can be used by enterprises across a variety of functions, including developing applications, transforming code, generating business intelligence, acting as a generative AI assistant for business applications, and helping customer service agents via the Amazon Connect offering.

Amazon Braket for reserving quantum computers

The cloud services provider has announced a new program, dubbed Amazon Braket Direct, to offer researchers direct, private access to quantum computers. The program is part of AWS’ managed quantum computing service, Amazon Braket, which was introduced in 2020. Amazon Braket Direct allows researchers across enterprises to get private access to the full capacity of various quantum processing units (QPUs) without any wait time, and also provides the option to receive expert guidance for their workloads from AWS’ team of quantum computing specialists, AWS said.

Currently, the Direct program supports the reservation of IonQ Aria, QuEra Aquila, and Rigetti Aspen-M-3 quantum computers. The IonQ Aria is priced at $7,000 per hour and the QuEra Aquila at $2,500 per hour. The Aspen-M-3 is priced slightly higher than the Aquila, at $3,000 per hour.
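To put the hourly reservation prices quoted above in perspective, here is a minimal Python sketch that estimates the cost of a reservation. It assumes billing is strictly linear in reserved hours, with no minimums, discounts, or taxes, which the article does not confirm; the names `HOURLY_RATE_USD` and `reservation_cost` are hypothetical and are not part of any AWS API.

```python
# Hourly reservation prices as quoted in the article (USD per hour).
HOURLY_RATE_USD = {
    "IonQ Aria": 7000,
    "QuEra Aquila": 2500,
    "Rigetti Aspen-M-3": 3000,
}


def reservation_cost(device: str, hours: float) -> float:
    """Estimate the cost of reserving `device` for `hours` hours.

    Assumption: cost scales linearly with reserved time. Actual billing
    terms may differ.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return HOURLY_RATE_USD[device] * hours


if __name__ == "__main__":
    # Compare a two-hour reservation across the three supported devices.
    for device in HOURLY_RATE_USD:
        print(f"{device}: ${reservation_cost(device, 2):,.0f} for 2 hours")
```

Under these assumptions, a two-hour reservation would come to $14,000 on the IonQ Aria, $5,000 on the QuEra Aquila, and $6,000 on the Rigetti Aspen-M-3.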
Cost Optimization Hub to help enterprises reduce spending

The updates announced at re:Invent include a new AWS Billing and Cost Management feature, dubbed AWS Cost Optimization Hub, which makes it easy for enterprises to identify, filter, aggregate, and quantify savings for AWS cost optimization recommendations. The new Hub, according to the cloud services provider, gathers all recommended cost-optimizing actions across AWS Cloud Financial Management (CFM) services, including AWS Cost Explorer and AWS Compute Optimizer, in one place.

It incorporates customer-specific pricing and discounts into these recommendations, and it deduplicates findings and savings to give a consolidated view of an enterprise’s cost optimization opportunities, AWS added. The feature is likely to help FinOps or infrastructure management teams understand cost optimization opportunities.

Zero-ETL, vector databases and other updates

Continuing to build on its efforts toward zero-ETL for data warehousing services, AWS announced new Amazon Redshift integrations with Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon RDS for MySQL. Enterprises typically use extract, transform, load (ETL) to integrate data from multiple sources into a single consistent data store that is then loaded into a data warehouse for analysis.

However, most data engineers say that transforming data from disparate sources can be a difficult and time-consuming task, as the process involves steps such as cleaning, filtering, reshaping, and summarizing the raw data. Another issue is the added cost of maintaining teams that prepare data pipelines for running analytics, AWS said.

In contrast, the new zero-ETL integrations, according to the company, eliminate the need to perform ETL between Aurora PostgreSQL, DynamoDB, RDS for MySQL, and Redshift, as transactional data in these databases can be replicated into Redshift almost immediately and is ready for analysis.
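For readers less familiar with the transform step that zero-ETL is meant to remove, the following is a minimal, purely illustrative Python sketch of the cleaning, filtering, reshaping, and summarizing described above. The sample rows, field names, and the `transform` function are hypothetical and are not drawn from any AWS service or API.

```python
from collections import defaultdict

# Hypothetical raw order rows, as they might arrive from a transactional store.
raw_orders = [
    {"order_id": 1, "region": " us-east ", "amount": "120.50"},
    {"order_id": 2, "region": "EU-West", "amount": "80.00"},
    {"order_id": 3, "region": "us-east", "amount": None},  # incomplete row
    {"order_id": 4, "region": "eu-west", "amount": "19.50"},
]


def transform(rows):
    """Clean, filter, reshape, and summarize raw rows into revenue per region."""
    totals = defaultdict(float)
    for row in rows:
        # Filter: skip rows that have no amount.
        if row["amount"] is None:
            continue
        # Clean: normalize whitespace and casing in the region name.
        region = row["region"].strip().lower()
        # Reshape and summarize: parse the amount and aggregate by region.
        totals[region] += float(row["amount"])
    return dict(totals)


if __name__ == "__main__":
    print(transform(raw_orders))
```

Even this toy pipeline has to encode decisions about malformed rows and inconsistent values, which gives a sense of why such transforms become costly to maintain at scale.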
Other generative AI-related updates at re:Invent include updated support for vector databases for Amazon Bedrock. These databases include Amazon Aurora and MongoDB. Other supported databases include Pinecone, Redis Enterprise Cloud, and Vector Engine for Amazon OpenSearch Serverless. The company also added a new enterprise pricing tier to its cloud-based development tool, dubbed Amazon CodeCatalyst.