Data-driven decision-making suffers from a mismatch between the tools, skills, and understanding of IT and data consumers in most enterprises. Here's how to bridge the gap.

Data transformations are a fundamental step in turning raw data into information the business can consume. As businesses collect data from an ever-growing number of sources, in a multitude of different formats, all of that data must be transformed before it becomes valuable to the organization.

In most enterprises, the data transformation process remains largely siloed within the IT department. Data transformations can be time-consuming, especially when IT is bogged down with multiple projects and requests from data consumers. Because centralized IT teams serve the data needs of every part of the organization, from HR to finance, they can become an inadvertent bottleneck that damages the time to value of the data. IT teams also may not fully understand the needs of each distinct department for which they are collecting and transforming data. In an effort to get answers more quickly, departments sometimes build their own rogue data pipelines or data transformation processes, which may violate governance policies or degrade data quality.

This way of managing data transformation highlights a crucial gap in business operations: IT handles the data but doesn't fully understand its business applications, while the departments that need the data don't grasp the technical processes required to produce consistent, high-quality insights. Businesses aiming to scale and truly capitalize on their data are now grappling with the challenge of merging systems, processes, and cross-departmental knowledge to create a transparent, collaborative data environment across the entire organization.

Making data initiatives transparent throughout the organization

Organizations often struggle with data initiatives because of departmental silos. One way to bridge those silos is to make data projects visible across the organization through the transformation layer, while adhering to enterprise-wide data governance standards. In this context, a data project is a logical grouping of data pipelines, each with its own user permissions, Git repository, development workspaces, and deployable environments, so that each team can access the data relevant to its initiatives without compromising data quality or governance. (A minimal sketch of this idea appears at the end of this section.)

For all of this to happen through the transformation layer, all data must be stored and transformed on the same platform. Distinct teams are given specific views of the data warehouse relevant to their work, allowing the work to be easily monitored for security concerns while still provisioning users with the information they need to make informed decisions.

But that still doesn't overcome the fact that technical and non-technical users rely on different tools and platforms tailored to their skill levels. The tools and platforms used by IT tend to have a steep learning curve and require engineering skill; they are difficult or impossible to use for those who lack the technical knowledge. The tools used by non-technical business users across departments are often deemed too inflexible by the engineering teams in IT. It's easy to see how data processes can quickly become, and remain, siloed within their respective departments.
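To make the notion of a data project concrete, here is a minimal sketch in Python. Every name in it (DataProject, Pipeline, role_permissions, the table names) is hypothetical, intended only to show how pipelines, permissions, and environments might hang together, not to reflect any particular platform's API.

```python
from dataclasses import dataclass, field

# Illustrative model of a "data project": a logical grouping of
# pipelines with its own permissions, repository, and environments.
# All names here are hypothetical, not a specific platform's API.

@dataclass
class Pipeline:
    name: str
    sources: list[str]   # upstream tables or files the pipeline reads
    target: str          # table the pipeline produces

@dataclass
class DataProject:
    name: str
    git_repo: str        # versioned pipeline definitions live here
    environments: list[str] = field(default_factory=lambda: ["dev", "prod"])
    pipelines: list[Pipeline] = field(default_factory=list)
    role_permissions: dict[str, set[str]] = field(default_factory=dict)

    def can_read(self, role: str, table: str) -> bool:
        """Governance check: a role sees only the views granted to it."""
        return table in self.role_permissions.get(role, set())

# Example: a finance project whose analysts see only curated outputs.
finance = DataProject(
    name="finance-reporting",
    git_repo="git@example.com:data/finance-reporting.git",
    pipelines=[Pipeline("revenue", ["raw.orders", "raw.refunds"], "mart.revenue")],
    role_permissions={"finance_analyst": {"mart.revenue"}},
)

assert finance.can_read("finance_analyst", "mart.revenue")
assert not finance.can_read("finance_analyst", "raw.orders")
```

The point of the toy model is the governance check at the end: the finance analyst sees the curated revenue view but never the raw order data that feeds it, even though both live on the same platform.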
Enter visualization.

Visualizing data processes to improve accessibility

Visualization of data transformations and data lineage, meaning where the data came from and how it is used, can help all users, regardless of skill, make sense of complex information, identify changes, extract information, and quickly make decisions rooted in facts. Since every business function benefits from understanding its data, visualizations should be a prominent feature of every data project and platform.

There is no lack of tools and platforms that have attempted to democratize data by visualizing it for business intelligence, but none has succeeded in doing so at the data transformation level. For example, while the ability to pull data from a table and turn it into a chart can surface trends and insights, it still doesn't offer a complete picture of the data: where it came from, what changes it underwent along the way, and where else it is being used.

The inability to access data lineage information, in the easy-to-understand way that visualizing the transformation layer would enable, is one of the main reasons, in our opinion, that lower-quality foundational data has found its way downstream to business users and has led to poor business decisions and poor business outcomes. The proliferation of tools in the data ecosystem has made holistic access to data, and holistic understanding of data, even more difficult to achieve.

Visualizing data transformations would communicate both the "what" and the "why" of critical KPIs while offering users opportunities to explore and monitor changes and patterns. It would empower the less technically savvy with usable information, rapidly get everyone on the same page, boost productivity, and accelerate time to value.

Enabling collaboration on data projects across departments

Lastly, today's tooling must provide data architects and engineers with full extensibility while still making it easy for junior data practitioners to be productive and hone their skills. One way to streamline the management of data initiatives while ensuring data is properly governed is to thread metadata through data projects that were previously siloed within departments. This model would enable changes in departmental work to be reflected in real time across the entire organization. Combined with a visual interface that makes data lineage accessible to all users within an organization, it would give users essential context and clarity on how data is defined and used, and how it evolves over time. (A rough sketch of this idea follows below.)

Bridging the gap between usability and technical prowess, and granting access to users of all skill levels, is the foundation for empowering collaboration across data teams, projects, and departments.
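As a rough illustration of what threading metadata through the transformation layer could mean, the Python sketch below records, for every transformation, its inputs and output, then derives the upstream lineage of any table from those records. The table and transformation names are invented for the example; a real platform would capture the records automatically and render the resulting graph visually rather than as text.

```python
# Hypothetical sketch: thread lineage metadata through transformations
# and derive an organization-wide view of where each table comes from.

from collections import defaultdict

# Each record is (inputs, transformation, output), captured at run time.
lineage_records = [
    (["raw.orders", "raw.refunds"], "net_revenue", "staging.revenue"),
    (["staging.revenue", "raw.fx_rates"], "to_usd", "mart.revenue_usd"),
]

# Build upstream edges: output -> list of (input, transformation).
upstream = defaultdict(list)
for inputs, transform, output in lineage_records:
    for source in inputs:
        upstream[output].append((source, transform))

def trace(table: str, depth: int = 0) -> None:
    """Print the full upstream lineage of a table, one hop per line."""
    for source, transform in upstream.get(table, []):
        print("  " * depth + f"{table} <- {transform} <- {source}")
        trace(source, depth + 1)

trace("mart.revenue_usd")
# mart.revenue_usd <- to_usd <- staging.revenue
#   staging.revenue <- net_revenue <- raw.orders
#   staging.revenue <- net_revenue <- raw.refunds
# mart.revenue_usd <- to_usd <- raw.fx_rates
```

Even this toy version shows why lineage matters: a change to raw.orders is immediately visible as affecting every downstream table that depends on it, in any department.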
Armon Petrossian is CEO and co-founder of Coalesce.

New Tech Forum provides a venue for technology leaders, including vendors and other outside contributors, to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.