by Maksim Krivobok

How AI will transform data analytics

feature
Jul 30, 2024 | 7 mins
Artificial Intelligence, Generative AI, Software Development

Just as AI-powered programming assistants make developers more productive, AI will streamline workflows for data analysts. It will also bring vast benefits to business users.

Software developers are already benefiting from generative AI. AI-powered programming assistants streamline time-consuming tasks, help developers learn new languages and frameworks, and boost productivity. Now the data analytics arena is starting to see the same efficiencies. The implementation of large language models (LLMs) on data analytics platforms has the potential to significantly enhance the capabilities of data analysts. As has been the case for coders, routine tasks (including code generation, SQL generation, and chart generation) will be simplified and accelerated.

Within data analytics, AI goes further than removing repetitive work done by analysts: it also lowers the barrier to entry into data analysis and expands the range of users. When AI enables business users to produce reports and analytics themselves, it complements the work of data science teams, allowing them to focus on more complex and strategic tasks. This streamlines and speeds up work processes across the whole company.

The data analyst role reshaped

AI can help analysts move towards spending more time doing what businesses want from them: pulling insights from data. AI will free them from the drain of tedious data wrangling tasks and assist with data preparation by generating code, guiding thinking, and preparing reports so that analysts can narrow their focus to what really matters. With the help of AI, data analysts will be able to save time on technical work and immerse themselves in the business side of things.

Let’s take an example. Analysts can spend hours writing code and searching the documentation for specific libraries. AI will accelerate both of these tasks, allowing analysts to start with an AI-generated proposal and then dig deeper into the code if they need (or want) to do so. Just as AI-powered programming assistants are allowing developers to complete tasks more quickly, they will streamline workflows for the data analyst. Ultimately, the data analyst role itself will be reformulated. Analysts will spend more time focusing on business needs (formulating the proper goals for the research and writing prompts) and less time on the mechanics of project preparation.

We shouldn’t get too far ahead of ourselves, though. Coding skills are useful for data analysts and won’t be dispensed with. Yet for simple cases the potential of AI is already clear, with the technology able to generate the required code and summarize findings. And, with time, we can expect the need for coding to decrease further. Instead, the focus of data analysts will shift towards the business domain. The task will be formulated for AI to complete, but guided with careful attention to the needs of the business.

Changes to the data analyst’s toolkit

The reach of AI will also extend to the tools used by most data analysts for their work, such as Jupyter Notebooks and compatible solutions. The notebook format won’t disappear completely. Rather, in the future, notebooks will become more prompt-oriented, while still retaining code cells and various outputs. Moreover, you can imagine that your notebook will be akin to ChatGPT insofar as it will already have access to your data and other relevant tools. This change alone will open data analytics up to many more people.

I believe that data analytics platforms will also play an important part in how data analysts share their results, a key part of a data analyst’s job. It’s one thing to find something in the data; it’s a whole different thing to convince others that it means what you think it does. By drawing on information about the target audience and the goals of the research, data analytics platforms will be able to generate reports for this purpose. Moreover, AI-powered tools will be able to arrange cells on the canvas, based on the target audience of the report and the purpose of the analysis. Ultimately, though, the whole process of gaining those insights will be interactive. An AI assistant will be deeply involved during certain stages and will be able to solve some tasks autonomously. For more complex tasks, however, human input will still be required.

The crucial importance of the semantic layer

As the use of large language models becomes more common in data analysis, the semantic layer (or metrics layer) in organizations will become indispensable. The presence of the semantic layer accelerates the shift to self-service analytics and enables seamless integration of AI into analytics tools.

The semantic layer plays a crucial role in translating raw data into meaningful business insights. It connects business terms to the underlying data, ensuring consistent definitions across the organization. In companies where multiple data storage systems and various querying and visualization tools are used, the semantic layer provides a unified framework. This consistency is vital for teams in finance, marketing, IT, and other departments, which often store data on different platforms and use diverse tools.

By offering a unified view of an organization’s data, the semantic layer expresses that data in common business terms. It acts as a translator between raw data and business applications, giving business context to the data. By modeling the organization’s data with clearly defined values and dimensions, higher-level concepts like KPIs can be consistently and accurately defined and calculated. This ensures that metrics and dimensions, once established, are uniformly applied. For instance, any report or dashboard referencing “total revenue by month” will always use the same definition.
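To make the idea concrete, here is a minimal sketch of how a semantic layer might map business terms to one shared definition. It is an illustration under stated assumptions, not how any particular product works: the `Metric` and `Dimension` classes, the `compile_query` helper, and the `orders` table with its `amount` and `ordered_at` columns are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str        # business term, e.g. "total revenue"
    expression: str  # SQL aggregate over the raw data
    table: str       # hypothetical source table


@dataclass(frozen=True)
class Dimension:
    name: str        # business term, e.g. "month"
    expression: str  # SQL expression that derives the dimension


# Hypothetical definitions; table and column names are illustrative only.
METRICS = {
    "total revenue": Metric("total revenue", "SUM(amount)", "orders"),
}
DIMENSIONS = {
    "month": Dimension("month", "DATE_TRUNC('month', ordered_at)"),
}


def compile_query(metric_name: str, dimension_name: str) -> str:
    """Translate a business question into SQL using the shared definitions."""
    metric = METRICS[metric_name]
    dimension = DIMENSIONS[dimension_name]
    column_alias = metric.name.replace(" ", "_")
    return (
        f"SELECT {dimension.expression} AS {dimension.name}, "
        f"{metric.expression} AS {column_alias} "
        f"FROM {metric.table} "
        f"GROUP BY 1 ORDER BY 1"
    )


# Every report, dashboard, or AI assistant asking for the same metric
# goes through the same lookup and therefore gets the same SQL.
sql = compile_query("total revenue", "month")
print(sql)
```

Because the definition lives in one place, changing how "total revenue" is calculated means editing a single entry rather than every report that uses it. An LLM-based assistant could, in principle, be pointed at the same registry instead of guessing at raw column names.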

The semantic layer bridges the gap between raw data and business insights, ensuring the consistent interpretation and reporting of data across an organization. As organizations increasingly rely on data-driven insights and metrics, the importance of the semantic layer in data analytics and decision-making will continue to grow. It will become a cornerstone of future analytical tools and indeed of the data landscape more broadly.

The rise of AI-driven analytics

Just as AI answers questions about code for developers, AI will be able to answer questions about reports for both data analysts and business users. Although data analysts will still step in when the technology can’t handle a question, AI is poised to become even better at responding. With time, AI will ingest more and more of a company’s data silos, including data from CRM systems, support-ticket systems, and ERP systems. Data analytics platforms will also develop functionalities that allow company knowledge bases to be used, including information about the company’s clients and metrics, along with information drawn from external sources (like stock exchange data, news feeds, and market analysis). Bolstered by vast amounts of data, AI-powered data analytics platforms will further bridge the gap between data and business teams and allow them to collaborate much more efficiently.

Ultimately, though, the outcomes of AI processes are a human responsibility. Where generative AI outputs impact real-world decisions, it must always be possible to explain the results. This is where the notebook format comes in. The steps making up the analysis sequence won’t disappear, but will just become both quicker and more automated (and still hidden on a deeper level from most people). With the sequence of steps that led a data analyst to a particular conclusion remaining understandable and traceable, the vast benefits of AI to data analytics can be enjoyed.
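One way to picture this traceability is a notebook-like record in which each step keeps the prompt, the generated code, and a note on the result. The sketch below is only an assumption about how such a record could look; `Step`, `AnalysisTrace`, and the sample prompts and code strings are hypothetical and do not describe any existing notebook product.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    prompt: str  # what the analyst asked for
    code: str    # code the assistant proposed (placeholder text here)
    output: str  # short note describing the result


@dataclass
class AnalysisTrace:
    steps: list[Step] = field(default_factory=list)

    def record(self, prompt: str, code: str, output: str) -> None:
        """Append one step so the order of the analysis is preserved."""
        self.steps.append(Step(prompt, code, output))

    def explain(self) -> str:
        """Render the steps in order so a reviewer can follow the reasoning."""
        lines = []
        for number, step in enumerate(self.steps, start=1):
            lines.append(f"{number}. {step.prompt}")
            lines.append(f"   code:   {step.code}")
            lines.append(f"   output: {step.output}")
        return "\n".join(lines)


# Placeholder steps; the prompts and code strings are illustrative only.
trace = AnalysisTrace()
trace.record("Load the orders data", "df = load_orders()", "table loaded")
trace.record("Aggregate revenue by month", "monthly = summarize(df)", "monthly table built")
print(trace.explain())
```

Most readers of a report would never open this record, but it stays available for anyone who needs to check how a conclusion was reached.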

Maksim Krivobok is team lead for Datalore, a collaborative data science platform, at JetBrains.

Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.