The new features include the ability to copy data into Apache Iceberg tables and roll back changes made to those tables.

Dremio is adding new features to its data lakehouse, including the ability to copy data into Apache Iceberg tables and roll back changes made to those tables. Apache Iceberg is an open-source table format used by Dremio to store analytic data sets.

To copy data into Iceberg tables, enterprises and developers use the new “COPY INTO” SQL command, the company said. “With one command, customers can now copy data from CSV and JSON file formats stored in Amazon S3, Azure Data Lake Storage (ADLS), HDFS, and other supported data sources into Apache Iceberg tables using the columnar Parquet file format for performance,” Dremio said in an announcement Wednesday. The copy operation is distributed across the entire underlying lakehouse engine to load data more quickly, it added.

The company has also introduced a table rollback feature for enterprises, akin to Windows System Restore or a Mac Time Machine backup. Tables can be rolled back either to a specific point in time or to a snapshot ID, the company said, adding that developers will use the “ROLLBACK” command to access the feature. “The rollback feature makes it easy to revert a table back to a previous state with a single command. When rolling back a table, Dremio will create a new Apache Iceberg snapshot from the prior state and use it as the new current table state,” Dremio said.

Optimize command boosts Iceberg performance

To improve the performance of Iceberg tables, Dremio has introduced the “OPTIMIZE” command, which consolidates the small files created when data manipulation commands such as INSERT, UPDATE, or DELETE are used and optimizes their sizes. “Often, customers will have many small files as a result of DML operations, which can impact read and write performance on that table and utilize excess storage,” the company said, adding that the OPTIMIZE command can be run inside Dremio Sonar at regular intervals to maintain performance. Dremio Sonar is a SQL engine that provides data warehousing capabilities to the company’s lakehouse.

The new features are expected to improve the productivity of data engineers and system administrators while bringing utility to this class of users, said Doug Henschen, principal analyst at Constellation Research.

Dremio, which was an early proponent of Apache Iceberg tables in lakehouses, competes with the likes of Ahana and Starburst, both of which introduced support for Iceberg in 2021. Other vendors, such as Snowflake and Cloudera, added support for Iceberg in 2022.

Dremio features new database, BI connectors

In addition to the new features, Dremio said it was launching new connectors for Microsoft PowerBI, Snowflake, and IBM Db2. “Customers using Dremio and PowerBI can now use single sign-on (SSO) to access their Dremio Cloud and Dremio Software engines from PowerBI, simplifying access control and user management across their data architecture,” the company said.

The Snowflake and IBM Db2 connectors will allow enterprises to add Snowflake data warehouses and Db2 databases as data sources for Dremio, it added. This makes it easy to include data from these systems in the Dremio semantic layer, enabling customers to explore the data in their Dremio queries and views. The launch of these connectors, according to Henschen, gives analytics professionals more plug-and-play options from Dremio’s stable.
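To make the three table commands described above more concrete, here is a minimal sketch of how the COPY INTO, ROLLBACK TABLE, and OPTIMIZE TABLE statements might look in Dremio SQL. The source name (s3_lake), table name (sales), file path, snapshot ID, and timestamp are placeholder assumptions rather than anything from the announcement, and exact options can vary by Dremio version, so consult the product documentation before running them.

    -- Load CSV files from an S3-backed source (placeholder name "s3_lake") into an Iceberg table
    COPY INTO sales
    FROM '@s3_lake/raw/sales/'
    FILE_FORMAT 'csv';

    -- Revert the table to an earlier state, either by snapshot ID or by timestamp (placeholder values)
    ROLLBACK TABLE sales TO SNAPSHOT '1234567890123456789';
    -- or: ROLLBACK TABLE sales TO TIMESTAMP '2023-03-01 00:00:00';

    -- Consolidate the small files left behind by INSERT/UPDATE/DELETE into fewer, larger Parquet files
    OPTIMIZE TABLE sales;

In practice, the OPTIMIZE step would be the one run on a recurring schedule after heavy DML activity, in line with Dremio’s guidance quoted above about maintaining read and write performance.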