Anirban Ghoshal
Senior Writer

SingleStoreDB joins the Apache Iceberg bandwagon

news
Jun 26, 20244 mins
Databases

The latest update includes Iceberg integration, faster vectors, high-relevance full-text search, and autoscaling.

networking abstract
Credit: iStock

Buoyed by customer demand, SingleStore, the company behind the relational database SingleStoreDB, has decided to natively integrate Apache Iceberg into its offering to help its enterprise customers make use of data stored in data lakehouses.

“With this new integration, SingleStore aims to transform the dormant data inside lakehouses into a valuable real-time asset for enterprise applications. Apache Iceberg, a popular open standard for data lakehouses, provides CIOs with cost-efficient storage and querying of large datasets,” said Dion Hinchcliffe, senior analyst at The Futurum Group.

Hinchcliffe pointed out that SingleStore’s integration includes updates that help its customers bypass the challenges that they may typically face when adopting traditional methods to make the data in Iceberg tables more immediate.

These challenges include complex, extensive ETL (extract, transform, load) workflows and compute-intensive Spark jobs.

Some of the key features of the integration are low-latency ingestion, bi-directional data flow, and real-time performance at lower costs, the company said.

Explaining how SingleStore achieves low latency across queries and updates, IDC research vice president Carl Olofson said that the company —formerly known as MemSQL — a memory-optimized and high-performance version of the relational database management system — uses memory features as a sort of cache.

“By doing so, the company can dramatically improve the speed with which Iceberg tables can be queried and updated,” Olofson explained, adding that the company might be proactively loading data from Iceberg into their internal memory-optimized format.

Before the Iceberg integration, SingleStore held data in a form or format that is optimized for rapid swapping into memory, where all data processing took place, the analyst said.

Several other database vendors, notably Databricks, have made attempts to adopt the Apache Iceberg table format due to its rising popularity with enterprises.

Earlier this month, Databricks agreed to acquire Tabular, the storage platform vendor led by the creators of Apache Iceberg, in order to promote data interoperability in lakehouses.

Another data lakehouse format — Delta Live Tables — developed by Databricks and later open sourced via The Linux Foundation, competes with Iceberg tables.

Currently, the company is working on another format that allows enterprises to use both Iceberg and Delta Live tables.

Both Olofson and Hinchcliffe pointed out that several vendors and offerings — such as Google’s BigQuery, Starburst, IBM’s Watsonx.data, SAP’s DataSphere, Teradata, Cloudera, Dremio, Presto, Hive, Impala, StarRocks, and Doris — have integrated Iceberg as an open source analytics table format for very large datasets.

The native integration of Iceberg into SingleStoreDB is currently in public preview.

Updates to search and deployment options

As part of the updates to SingleStoreDB, the company is adding new capabilities to its full-text search feature that improve relevance scoring, phonetic similarity, fuzzy matching, and keyword proximity-based ranking.

The combination of these capabilities allows enterprises to eliminate the need for additional specialty databases to build generative AI-based applications, the company explained.

Additionally, the company has introduced an autoscaling feature in public preview that allows enterprises to manage workloads or applications by scaling compute resources up or down.

It also lets users define thresholds for CPU and memory usage for autoscaling, to avoid any unnecessary consumption.

Further, the company said it is introducing a new deployment option for the database via Helios -BYOC, which is a managed version of the database via a virtual private cloud.

This offering is now available in private preview in AWS and enterprise customers can run SingleStore in their own tenants while complying with data residency and governance policies, the company said.