The updates in Delta Lake 3.0 include a new universal table format, dubbed UniForm, a Delta Kernel, and liquid clustering to improve data read and write performance.

Databricks on Wednesday introduced a new version of its data lakehouse offering, dubbed Delta Lake 3.0, in order to take on the rising popularity of the Apache Iceberg tables used by rival Snowflake.

As part of Delta Lake 3.0, the company has introduced a new universal table format, dubbed UniForm, that will allow enterprises to use the data lakehouse with other table formats such as Apache Iceberg and Apache Hudi, the company said.

A data lakehouse is a data architecture that offers both storage and analytics capabilities, in contrast to data lakes, which store data in its native format, and data warehouses, which store structured, processed data that is typically queried with SQL.

UniForm eliminates the need to manually convert tables between formats when moving data across data lakes and data warehouses for analytics or for building AI models, Databricks said.

The new table format, according to analysts, is Databricks' strategy to connect its data lakehouse with the rest of the world and take on rival Snowflake, especially against the backdrop of Apache Iceberg garnering broader multivendor support in the past few years.

"With UniForm, Databricks is essentially saying, if you can't beat them, join them," said Tony Baer, principal analyst at dbInsight, likening the battle between the table formats to the one between Apple's iOS and Google's Android operating systems.

However, Baer believes that the adoption of lakehouses will depend on the ecosystems they provide and not just their table formats. "Adoption of data lakehouses is still very preliminary as the ecosystems have only recently crystallized, and most enterprises are still learning what lakehouses are," Baer said, adding that lakehouses may see meaningful adoption a year from now.
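The core idea behind UniForm — one physical copy of the data, with metadata emitted for each table format so that different engines can read it — can be illustrated with a toy sketch in Python. All names and structures below are illustrative stand-ins, not the real Delta transaction log or Iceberg manifest formats:

```python
# Toy model of the UniForm idea: a single set of data files, plus
# per-format metadata generated alongside, so readers that expect
# different table formats all see the same underlying data.
# Everything here is a simplified illustration, NOT the real
# Delta Lake or Apache Iceberg implementation.

def write_table(data_files):
    """Write one copy of the data, then emit metadata for each format."""
    table = {"data": list(data_files)}
    # Delta-style transaction log entry (reduced to a plain dict)
    table["delta_log"] = {"version": 0, "add": list(data_files)}
    # Iceberg-style manifest (reduced): same files, different metadata shape
    table["iceberg_manifest"] = {"snapshot-id": 0, "files": list(data_files)}
    return table

def read_as(table, fmt):
    """A reader that only understands its own format's metadata."""
    if fmt == "delta":
        return table["delta_log"]["add"]
    if fmt == "iceberg":
        return table["iceberg_manifest"]["files"]
    raise ValueError(f"unknown format: {fmt}")

t = write_table(["part-0.parquet", "part-1.parquet"])
# Both "engines" resolve to the same single copy of the data files,
# with no manual conversion step in between.
assert read_as(t, "delta") == read_as(t, "iceberg")
```

The point of the sketch is that interoperability lives entirely in the metadata layer: the data files are written once, and each format's reader consults only the metadata shape it understands.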
Contrary to Baer's caution, Databricks said its Delta Lake has seen nearly one billion downloads in a year. Last year, the company open sourced its Delta Lake offering, which, according to the company, has brought the lakehouse contributions from engineers at AWS, Adobe, Twilio, eBay, and Uber.

Delta Kernel and liquid clustering

As part of Delta Lake 3.0, the company has also introduced two other features: Delta Kernel and liquid clustering. According to Databricks, Delta Kernel addresses connector fragmentation by ensuring that all connectors are built on a core Delta library that implements the Delta specifications. This alleviates the need for enterprise users to update Delta connectors with each new version or protocol change, the company said.

Delta Kernel, according to SanjMo principal analyst Sanjeev Mohan, is akin to a connector development kit that abstracts many of the underlying details and instead provides a set of stable APIs. "This reduces the complexity and time to build and deploy connectors. We expect that the system integrators will now be able to accelerate development and deployment of connectors, in turn further expanding Databricks' partner ecosystem," Mohan said.

Liquid clustering has been introduced to address performance issues around data read and write operations, Databricks said. In contrast to traditional approaches such as Hive-style partitioning, which increases data management complexity by relying on a fixed data layout to improve read and write performance, liquid clustering offers a flexible data layout that Databricks claims will remain cost-efficient for clustering as data grows in size.
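The stable-API idea behind Delta Kernel can be sketched with a toy Python example — a facade that hides protocol-version details from connectors built on top of it. The class and method names here are hypothetical illustrations, not the actual Delta Kernel API:

```python
# Toy sketch of the Delta Kernel idea: connectors program against a
# small, stable API, while protocol-version handling lives inside the
# kernel. All names are illustrative, not the real Delta Kernel API.

class Kernel:
    """Stable facade; internal protocol handling can evolve freely."""
    def __init__(self, protocol_version):
        self._version = protocol_version

    def scan(self, table):
        # Internally, the kernel may interpret the table's metadata
        # differently per protocol version (here, newer versions
        # understand file-removal markers); the connector never sees
        # that detail — it only ever calls scan().
        if self._version >= 2:
            return [f for f in table["files"] if not f.get("removed")]
        return list(table["files"])

class Connector:
    """A connector written once against the stable scan() API."""
    def __init__(self, kernel):
        self.kernel = kernel

    def read(self, table):
        return [f["path"] for f in self.kernel.scan(table)]

table = {"files": [{"path": "a.parquet"},
                   {"path": "b.parquet", "removed": True}]}
# The same connector code works unchanged across protocol versions.
old = Connector(Kernel(protocol_version=1)).read(table)
new = Connector(Kernel(protocol_version=2)).read(table)
```

This is the property Mohan describes: when the protocol changes, only the kernel's internals are updated, and every connector built on the stable API picks up the new behavior without being rewritten.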