Major upgrade to the ever-evolving Oracle Database brings JavaScript support, graph optimizations, in-memory enhancements, and dramatic improvements to JSON operations and in-database machine learning. Credit: Thinkstock Oracle Database 21c, the new release of the longtime industry leading RDBMS, is currently available in the Oracle Cloud, where it can be deployed as a Virtual Machine DB System (for clusters and single instance) or a Bare Metal DB System (single instance). It’s also available in the Always Free Autonomous Database Service, with limited storage and CPU capacity. It will be available more broadly later this year, including on-premises. While Oracle Database 20c was scheduled for release in 2020, that release was cancelled because of the COVID-19 pandemic and its effect on customer IT departments. All of the features planned for 20c were rolled into 21c. There are several notable points to be made about Oracle Database 21c. It can serve as a transactional, analysis, or enterprise database, and tune itself for its usage. It supports relational, time series, graph, spatial, text, OLAP, XML, and JSON data. It has in-database training of machine learning models that includes AutoML, and that training can take advantage of the extra CPUs in storage servers on Exadata hardware. Speaking of Exadata hardware, Oracle Database 21c takes full advantage of its Intel Optane persistent memory, and the Autonomous Database supports automatic index creation, which is also available for Oracle Databases running on Exadata, in the cloud or on-premises. For analysis, Oracle Database 21c can automatically configure which columns to place in memory, can perform SIMD vectorized hash joins, and can complete hybrid scans against in-memory columns and on-disk row stores. For in-database programming, Oracle Database 21c adds support for JavaScript, along with interfaces between JavaScript and Oracle Database types, SQL, and PL/SQL stored procedures. Oracle claims high performance for all of its converged data types, even compared against databases specialized for JSON documents or graphs. The hard benchmark numbers for Oracle Database 21c are not yet available, however. Oracle Database history Oracle Database 21c is the result of four decades of development. Oracle Database was the first commercial SQL-based relational database (1979), implemented based on a published IBM specification. (IBM held off releasing its own relational database, DB2, to avoid cannibalizing sales of its hierarchical database, IMS.) Briefly, Oracle Version 3 (1983) was a rewrite in portable C, to run on mainframes, minicomputers, and personal computers. Version 6 had enhancements to disk I/O, row locking, scalability, and backup and recovery. Oracle7 (1992) introduced PL/SQL stored procedures and triggers. Oracle8 had objects and table partitioning. Oracle8i (1999) provided native support for internet protocols and server-side support for Java. Oracle9i introduced Oracle Real Application Clusters (RAC) in 2001; RAC is a shared disk, multiple instance architecture. Oracle Database 10g introduced grid computing in 2003. Oracle Database 11g (2007) improved manageability, diagnosability, and availability. Oracle Database 12c (2013) was designed for the cloud, featuring a new multitenant architecture, in-memory column store, and support for JSON documents. And Oracle Database 21c improves the developer experience with features such as Oracle Blockchain Tables and native JSON data types. Enhancements to Automatic In-Memory make the in-memory column store largely self-managing. The Oracle Database is only one of Oracle’s many product lines. Oracle also offers nearly a dozen application development tools for Oracle Database, four data integration tools, nine other databases including MySQL and Berkeley DB, and half a dozen big data products — and that’s without going into Java, storage, operating systems, applications, or the Oracle Cloud. Oracle Database overview The Oracle Database 21c Technical Architecture document runs to 40 pages with block diagrams on almost every page (see the first two below). I don’t have room for that, nor would you want to read it here. Instead, let’s hit the high points. First of all, Oracle is a converged database engine. It offers many data types and data models, several workload types, and a bunch of ancillary capabilities for developers and analysts. Oracle Database supports relational, time series, graph, spatial, text, OLAP, XML, and JSON data — all at once. It supports transactional (OLTP), analytic (OLAP), machine learning (ML), IoT, streaming, and blockchain workloads. It has integrated microservices, events, REST services, in-database ML training, CI/CD, and low-code application development. Again, that’s all in one database, rather than requiring an assortment of special-purpose databases. Oracle contends that a converged database engine saves customers money, reduces data transport and transformation, and simplifies analytics and applications. Oracle offers database availability on-premises, in its public cloud, and in an extension of its cloud running on-prem. Oracle Database can be hosted, co-managed, or fully managed (autonomous). Autonomous Database can perform automated scaling, automated tuning, automated provisioning, automated fault management, and automated patching. Automated tuning means that the execution of your Oracle Databases is continually analyzed and tuned to optimize database performance for application development, production transaction processing, and data warehousing workloads that change over time. That includes creating indexes based on usage, which is especially important for data warehouses. Exadata hardware and software underlies Autonomous Database. More explicitly, Exadata powers Oracle Exadata Database Machine (dedicated on-prem), Oracle Exadata Cloud Service, and Oracle Exadata Cloud@Customer (cloud on-prem). Exadata Database Machine X8M (the latest model) has a high-availability, scale-out design, separate database and storage servers, uses Intel Optane persistent memory (PMEM) as part of a layered storage system, and uses in-memory columnar hybrid compressed format with vector processing for analytics and reporting. X8M takes advantage of remote direct memory access (RDMA) over converged 100Gbps Ethernet (RoCE) for internal interconnects, which completely bypasses the normal operating system stacks. An Oracle database is a multi-tenant container database (CDB) that holds one system seed pluggable database and any number of user-created pluggable databases (PDBs). (See the third diagram below.) Users interact only with the PDBs, and a user may never realize that the PDB they are using isn’t a standalone database. In the past, Oracle also supported non-CDB database instances, but that option is no longer supported in Oracle Database 21c. An Oracle Real Application Clusters (RAC) database architecture consists of multiple instances that run on separate server machines, all sharing a single database. The cluster presents a single end point to users and applications, through a listener process. RAC offers even higher availability than single Exadata machines. Traditional Oracle databases were strictly disk-based, which limited their performance. In recent years, Oracle added an in-memory option for “hot” data. Oracle Database In-Memory is a suite of features that greatly improves performance for real-time analytics and mixed workloads. The in-memory column store is the key feature of Database In-Memory. Transactional queries always go to the row store; the in-memory column store is only for analytic and reporting queries. Oracle Oracle Database 21c server block diagram. This is the highest-level diagram. Note that the Listener process often runs outside the server, especially in clustered environments. Oracle Oracle Database instance diagram. The instance handles memory and processes. Oracle An Oracle Database is a multi-tenant container database (CDB) with multiple user-created pluggable databases (PDB). What’s new in Oracle Database 21c According to Oracle, the 200-plus new features in Oracle Database 21c fall into six categories: application development; big data and data warehousing solutions; database upgrade and utilities; management solutions; performance and high availability options; and security solutions. Reading through all the improvements, it is clear to me that Oracle Database 21c is a mature database being improved by smart people in multiple ways to make it work better, faster, and more reliably, especially in the cloud. There are way too many new features to cover them all. I’ll pick out a few of the high-level points, mostly following William Hardie’s overview blog post. Blockchain tables While I’m something of a blockchain skeptic, Oracle’s addition of a blockchain ledger to its converged database will probably be useful for financial institutions. Basically, a blockchain is an immutable table that chains rows together cryptographically. The Oracle implementation uses an SHA2-512 computation over the current and previous rows as a hash. There are a few deviations from immutability in the implementation, however. For one, rows can be set to NO DELETE UNTIL n DAYS AFTER INSERT, so that, for example, the blockchain table can conform to GDPR’s “right to delete” requirement when used for an external user table, after the retention period has expired. Blockchains are normally supposed to be distributed; an Oracle database is typically centralized. Oracle does have a separate distributed Blockchain Platform. Oracle suggests using the integrated blockchain table in Oracle Database 21c when you don’t really need a distributed ledger, for example when the blockchain is used internally by an enterprise for financials or compliance data. Native JSON type Oracle Database has had JSON support since version 12c, but only within other data types (VARCHAR2, CLOB, or BLOB). Now it also has a native JSON type, which is stored in binary format, similar to the JSONB used by MongoDB. As a native type, JSON parsing happens only on inserts. This can result in read and update operations being four or five times faster than text-based JSON, and updates to very large JSON documents being 20 to 30 times faster. In-database JavaScript execution While the Oracle Database has supported running programming languages within the server for decades (PL/SQL since 1992, Java since 1999, and so on), Oracle Database 21c adds support for JavaScript via the Oracle Multilingual Engine (MLE), powered by GraalVM. The MLE automatically maps JavaScript data types to Oracle Database data types and vice versa. The JavaScript code itself can execute PL/SQL (stored procedures) and SQL through a built-in JavaScript module. SQL macros Oracle developers have long used PL/SQL stored procedures and table functions to simplify SQL queries. Unfortunately, the Oracle PL/SQL and SQL engines operate in different contexts, and calling PL/SQL from SQL causes an expensive context switch. To avoid this, Oracle Database 21c adds a SQL macro capability with the new SQL_MACRO keyword, which causes functions to run in the SQL engine and avoid the context switch. In-memory database enhancements Database In-Memory used to require considerable management from users or DBAs. Now you can set INMEMORY_AUTOMATIC_LEVEL to HIGH, and all columns will be considered for in-memory analysis. In addition, in-memory hash joins have been optimized using SIMD vectorization, resulting in up to 10x speedups. Also, columnar scans have been sped up for the case where not every column is in memory. When that happens, the optimizer can now perform a hybrid scan and fetch projected column values from the row store if needed, instead of needing to perform the scan entirely on the row store. Optimized graph models Graph databases can have millions or even billions of edges and vertices. In Oracle Database 21c, the graph representation has been optimized, allowing the analysis of larger graphs using less memory than previous versions. This version also allows developers to create or extend graph algorithms in Java. These user-defined graph algorithms can run as quickly as native algorithms. Persistent memory (PMEM) support Oracle Database 21c added support for a Persistent Memory (PMEM) store on single instance databases. This allows for storage of data and redo in local PMEM on commodity hardware. Oracle Database 21c runs SQL directly on data stored in the mapped PMEM file system, eliminating the need for a large buffer cache and reduces the code path. And it provides much faster transaction durability and near-instant recovery. This feature is especially useful for workloads that tend to be I/O bound. Oracle Machine Learning While Oracle performed in-database machine learning in previous database versions, Oracle Database 21c offers a new AutoML option and new machine learning algorithms. Oracle AutoML does automatic algorithm selection, automatic feature engineering, and automatic model tuning. New machine learning algorithms include XGBoost, MSET-SPRT, and the Adam optimizer. XGBoost is a highly efficient, scalable gradient tree boosting machine learning algorithm for regression and classification. The Multivariate State Estimation Technique–Sequential Probability Ratio Test (MSET-SPRT) algorithm is a nonlinear, nonparametric anomaly detection technique for monitoring critical processes. And Adam is a popular extension to stochastic gradient descent that uses mini-batch optimization and can make progress faster while seeing less data than the other supported neural network optimization solver, Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) with line search. Running Oracle Database 21c in Oracle Cloud Oracle Database 21c currently has limited availability. It is currently only available in Always Free Autonomous Database, and in Database Cloud Service. It will be available more broadly later this year. Always Free Autonomous Database is simple to create, and as the description says, always free, but it’s limited to one core and 20 GB of storage. (See the screenshot below.) IDG You can create autonomous databases in the Oracle Cloud with a self-service interface. Currently Oracle Database 21c is available only for small Always Free instances, and in the Database Cloud Service. Database Cloud Service takes more effort to create (see the screenshots below), but offers you a choice of “shapes” — meaning the configuration of the VM. To perform the testing described in the next section, two cores are the minimum and four cores are recommended. Before creating the database VM, I had to create a virtual cloud network (VCN) and request a higher limit on cores and RAM than came with my account. Requests for Oracle Cloud service limit increases take a business day, or sometimes more. IDG This is the first of the forms to create an Oracle Database system in Oracle Cloud. Note the choice of the logical volume manager rather than Oracle Grid Infrastructure; this is required for the labs to test new features. Also note the choice of four cores on one node, and 256 GB of storage. IDG This is the second part of the first form to create an Oracle Database system in Oracle Cloud. The SSH key partially shown is the public key of a pair which I pasted from my Mac. Later on I used the private key of the pair to authorize a connection to the cloud VM. IDG We’re looking at the form for selecting a database software image. I needed Oracle Database version 21c to test the new features. IDG This is the second page of the form to create an Oracle Database system. Aside from naming the container and pluggable databases and setting the administrator password, I selected a workload optimized for transaction processing. IDG After about 10 minutes of provisioning, my new Oracle Database 21c cloud database was ready for use. As you can see, Oracle Cloud databases can be scaled up or down at need after creation. The amount of storage is independent of the number of cores (the shape). It’s not obvious from this screen, but scaling the database does not require shutting it down. Once the database was ready, I was able to connect to the VM and the database service from a shell on my own machine. IDG Here I’m using SSH and a private key to connect to my cloud VM, and SQL*Plus to connect from there to the database. SQL*Plus is Oracle’s command-line database client. Then I went through the two setup labs and many of the example labs, taking most of a day. While I fumble-fingered a few things and found a few messages alarming, I was always able to recover, and there were no serious problems with the labs. A day of running the database cost $4.43. IDG The Database 21c New Features Workshop currently allows you to test features from a command line. I wish that I could have tested and benchmarked some of the bigger features, such as graph algorithms and JSON document performance, but Oracle isn’t ready to benchmark 21c, and they don’t usually publish benchmarks that were run in the cloud. The published performance numbers — 19 microseconds latency, 1 TB/second analytic scans, and 560 GB/second SQL throughput — were obtained in 2019 by benchmarking Oracle Database 19c running on Exadata X8M on-prem. RDBMS and kitchen sink Oracle Database 21c is the newest in a long line of relational databases going back to 1979. It is now a converged database, meaning that it supports relational, time series, graph, spatial, text, OLAP, XML, and JSON data all at once in the same database, and supports transactional, analytic, machine learning, IoT, streaming, and blockchain workloads. Oracle Database relies on Exadata appliances for its performance and many of its capabilities, such as automatic index creation and automatic database tuning, which are bundled into the Autonomous Database feature. Oracle Database tries to compete with every other database in the world. Historically its on-premises prices were considered high by many customers, but moving from one database to another is non-trivial, and it sometimes took large enterprises multiple years to complete their migrations and end their Oracle contracts. Oracle’s cloud pricing now feels roughly competitive with high-performance commercial cloud database services such as SQL Azure and Amazon Aurora, but direct comparisons of pricing and performance among databases are difficult to make. I’d suggest trying an always-free Oracle Autonomous Database cloud instance for basic functionality evaluations, and a larger Database Cloud Service instance to evaluate performance with significant amounts of data. Getting started with Oracle Database takes time, even for experienced DBAs and database programmers. I’d plan on taking at least a week to do an evaluation. — Cost: A few Oracle cloud services are free, and you can get a $300 credit when you open an account. After that, costs depend on resource usage. You can estimate your Oracle cloud costs online. Running an Exadata X8M rack on-premises starts at ~$100K plus Oracle Database user licenses; Oracle Cloud@Customer can change finances to usage-based billing. Platform: Currently, Oracle Database 21c is only available in the Oracle Cloud. Later in 2021, it will also be available on-premises on various platforms, including Exadata hardware. Related content opinion The dirty little secret of open source contributions It isn’t the person making the contributions—it’s how easy the contributions make it to use the software. By Matt Asay Nov 18, 2024 4 mins Technology Industry Open Source opinion Breaking down digital silos Want your digital initiatives to succeed? Put the business and tech teams in the same sandbox and let them work together. By Matt Asay Nov 11, 2024 4 mins Chief Digital Officer CIO Technology Industry opinion The cloud reaches its equilibrium point Workloads in the cloud and workloads on premises are both reasonable choices. Enterprises are about evenly split between the two. By Matt Asay Nov 04, 2024 5 mins Technology Industry Cloud Architecture Cloud Computing analysis Overlooked cloud sustainability issues Unsustainable demands for AI are on the horizon. People will soon have to stop fake-caring about cloud sustainability and give a damn for real. By David Linthicum Nov 01, 2024 5 mins Technology Industry Cloud Architecture Cloud Computing Resources Videos