How to manage generative AI – data, costs, and scaling up

From managing data to scaling systems to funding initiatives for the long haul, every part of your generative journey will be a challenge.

Generative AI is estimated to add between $2.6 trillion and $4.4 trillion in economic benefits to the global economy annually, according to McKinsey. This forecast is based on 63 new use cases that could deliver improvements, efficiencies, and new products for customers across multiple markets. This is a huge opportunity for developers and IT leaders alike.

At the core of the generative AI promise is data. Data enables generative AI to understand, analyze, and interact with the world around us, fueling its transformative capabilities. To succeed with generative AI, your company will need to manage and prepare its data well.

At the same time, you will need to lay the groundwork for building and operating AI services at scale, and you will need to fund your generative AI initiative in a smart and sustainable way. Starting slow and tapering off is no way to win the AI race.

If we don’t improve how we manage data, or approach scaling and costs in the right way, then the potential inherent in generative AI will be lost. Here are some thoughts on how we can improve our data management approaches, and how we can support our generative AI initiatives for the long run.

Where the data comes from

Data comes in various forms. Each form of data can improve the richness and quality of generative AI insights if it is used correctly.

The first form of data is structured data, which is put together in a regimented and consistent way. Structured data would include items like product information, customer demographics, or stock levels. This kind of data provides a foundation of organized facts that can be added to generative AI projects to enhance the quality of responses.

Alongside this, you may have external data sources that can complement your internal structured data sources. Common examples here would include weather reports, stock prices, or traffic levels—data that can bring more real-time and real-world context to a decision-making process. This data can be blended into your projects to provide additional quality data, but it may not make sense to generate it yourself.

Another common data set is derived data, which covers data created through analysis and modeling scenarios. These deeper insights can include customer intent reports, seasonal sales predictions, or cohort analysis.

The last common form of data is unstructured data. Rather than the regular reports or data formats that analysts are used to, this category includes formats like images, documents, and audio files. These data capture the nuances of human communication and expression. Generative AI programs often work around images or audio, which are common inputs and outputs of generative AI models.

Making generative AI work at scale

All of these diverse sets of data will exist in their own environments. At the same time, making them useful for generative AI projects involves making this diverse data landscape accessible in real time. With so much potential data involved, any approach must both scale dynamically on demand and replicate data globally so that any resources are close to users when requests come in. This is necessary to prevent downtime and reduce latency within transaction requests.

This data also has to be prepared so that the generative AI system can use it effectively. This involves creating embeddings, which are mathematical values, i.e., vectors, that represent semantic meaning. Embeddings enable the generative AI system to search beyond specific text matches and instead encompass the meaning and context embedded within data. Whatever the original form of the data, creating embeddings means that the data can be understood and used by the generative AI system and retain its meaning and context.
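As a rough illustration of comparing data by vector rather than by exact text match, the sketch below turns short texts into vectors and ranks them by cosine similarity. It rests on a loud assumption: the `embed` function here is a toy word-count stand-in, not a real embedding model. A trained model would produce vectors that capture meaning beyond shared words, which this toy does not. The vocabulary, document names, and texts are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical fixed vocabulary; a real embedding model learns its own representation.
VOCABULARY = ["stock", "levels", "weather", "rain", "warehouse", "forecast"]

# Hypothetical documents standing in for structured and external data.
DOCUMENTS = {
    "inventory": "warehouse stock levels are low",
    "weather": "weather forecast predicts rain",
}


def embed(text: str) -> list[float]:
    """Toy embedding (assumption): count occurrences of each vocabulary word."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCABULARY]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str) -> list[tuple[str, float]]:
    """Return (document name, similarity) pairs, most similar first."""
    query_vector = embed(query)
    scored = [
        (name, cosine_similarity(query_vector, embed(text)))
        for name, text in DOCUMENTS.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank("rain forecast"):
        print(f"{name}: {score:.3f}")
```

In a production system the `embed` step would typically be replaced by a call to an embedding model, while the similarity comparison stays conceptually the same.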

Using these embeddings, companies can support vector search or hybrid search across all their data, combining value and meaning at the same time. These results can then be gathered and passed back to the large language model (LLM) used to assemble the result. By making more data available from multiple sources, rather than relying on the LLM alone, your generative AI project can deliver better results back to the user and reduce hallucinations.
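One way to picture hybrid search feeding an LLM is sketched below. It blends a simple keyword-overlap score with a vector similarity score, keeps the top results, and assembles them into a prompt. Everything named here is an assumption made for illustration: the records, their three-dimensional vectors, the `alpha` weighting, and the prompt wording are hypothetical, and a real deployment would use a database's own vector and keyword search rather than this in-memory loop.

```python
import math

# Hypothetical records with precomputed embeddings (tiny 3-d vectors for readability).
RECORDS = [
    {"id": "doc-1", "text": "Stock levels for winter jackets", "vector": [0.9, 0.1, 0.0]},
    {"id": "doc-2", "text": "Seasonal sales predictions for outerwear", "vector": [0.7, 0.3, 0.1]},
    {"id": "doc-3", "text": "Traffic levels near the warehouse", "vector": [0.0, 0.2, 0.9]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def hybrid_search(query, query_vector, records, alpha=0.5, top_k=2):
    """Blend vector similarity (weight alpha) with keyword overlap (weight 1 - alpha)."""
    scored = []
    for record in records:
        score = (alpha * cosine(query_vector, record["vector"])
                 + (1 - alpha) * keyword_score(query, record["text"]))
        scored.append((score, record))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]


def build_prompt(question, context_records):
    """Assemble retrieved text into a prompt that would be passed to the LLM."""
    context = "\n".join(f"- {record['text']}" for record in context_records)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    question = "winter jackets stock"
    # Assumption: in practice this vector would come from an embedding model.
    hits = hybrid_search(question, [0.8, 0.2, 0.0], RECORDS)
    print(build_prompt(question, hits))
```

Grounding the prompt in retrieved records, rather than relying on the model's own recall, is the mechanism by which this pattern can reduce hallucinations.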

To make this work in practice, you have to choose the right underlying data fabric. As part of this, you will want to avoid a fragmented patchwork of data held in different solutions as much as possible, as each one of these represents another silo that has to be supported, interrogated, and managed over time. Users should be able to ask the LLM a question and receive a response quickly, rather than waiting for multiple components to respond and the model to weigh up their responses. A unified data fabric should deliver seamless data integration, enabling generative AI to tap into the full spectrum of data available.

The benefits of a modular approach

To scale up your generative AI implementation, you will have to balance how fast you can grow adoption against maintaining control over your critical assets. Adopting a modular approach to building your generative AI agents makes this easier as you can break down your implementation and avoid potential bottlenecks.

Similar to microservices designs for applications, a modular approach to AI services also encourages best practices around application and software design to remove points of failure, as well as opening up access to the technology to more potential users. It also makes it easier to monitor agent performance across the enterprise and spot more precisely where problems occur.

The first benefit of modularity is explainability. Because the components of the generative AI system are separated from each other, it is easier to analyze how agents function and make decisions. AI is often described as a “black box.” Compartmentalization makes tracking and explaining results much easier.

The second benefit here is security, as components can be protected by best-in-class authentication and authorization mechanisms, ensuring that only authorized users have access to sensitive data and functionality. Modularity also makes compliance and governance easier, as personally identifiable information (PII) or intellectual property (IP) can be safeguarded and kept separate from the underlying LLM.
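The modular idea can be sketched as a small pipeline in which authorization, PII redaction, and the model call are separate steps, each leaving a trace entry that helps explain what happened. This is a minimal sketch under stated assumptions, not a recommended security design: the role names, the email-only redaction pattern, and the `call_model` placeholder (which does not contact any real LLM) are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Assumption: only email addresses are treated as PII in this sketch.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical roles allowed to use the agent.
ALLOWED_ROLES = frozenset({"analyst", "admin"})


@dataclass
class Request:
    user_role: str
    prompt: str
    trace: list[str] = field(default_factory=list)  # per-step audit trail


def authorize(request: Request) -> Request:
    """Reject callers whose role is not allowed before any data is touched."""
    if request.user_role not in ALLOWED_ROLES:
        raise PermissionError(f"role '{request.user_role}' may not use this agent")
    request.trace.append("authorize: ok")
    return request


def redact_pii(request: Request) -> Request:
    """Replace email addresses so they never reach the model."""
    redacted, count = EMAIL_PATTERN.subn("[REDACTED_EMAIL]", request.prompt)
    request.prompt = redacted
    request.trace.append(f"redact_pii: {count} email(s) removed")
    return request


def call_model(request: Request) -> str:
    """Placeholder for a real LLM call (assumption); echoes the prompt it received."""
    request.trace.append("call_model: prompt sent")
    return f"(model response to: {request.prompt})"


def run_pipeline(request: Request) -> str:
    """Run each independent component in order, then call the model."""
    for step in (authorize, redact_pii):
        request = step(request)
    return call_model(request)


if __name__ == "__main__":
    req = Request("analyst", "Summarize recent orders for jane.doe@example.com")
    print(run_pipeline(req))
    print(req.trace)
```

Because each step is its own function, any one of them can be swapped, tested, or monitored without touching the others, and the trace shows which component handled the request at each point.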

Funding your generative AI initiative

Alongside the microservices approach, you should adopt a platform mindset for your overall generative AI program. This involves replacing the traditional project-based funding model for software projects with a consistent and flexible funding model. This approach empowers participants to make value-based decisions, respond to emerging opportunities, and develop best practices without being constrained by rigid funding cycles or business cases.

Treating your budget in this way also encourages developers and business teams to consider generative AI as part of the overall infrastructure that the organization has in place. This makes it easier to avoid some of the peaks and troughs that can otherwise affect workload planning, and makes it easier to take a “center of excellence” approach that remains consistent over time.

A similar approach is to treat generative AI as a product that the business operates in its own right, rather than as software. AI agents should be managed as products because this represents the value that they create more effectively, as well as making it easier to get support resources around integration, tools, and prompts. Simplifying this model encourages a more widespread understanding around generative AI and the adoption of best practices across the organization, fostering a culture of shared expertise and collaboration in generative AI development.

Generative AI has huge potential, and companies are rushing to implement new tools, agents, and prompts in their operations. However, getting these potential projects into production involves managing your data effectively, laying a foundation for scaling up systems, and getting the right budget model in place to support your team. Getting your processes and priorities right will help you and your team unlock the transformative potential of this technology.

Dom Couldwell is head of field engineering, EMEA, at DataStax.

—

Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.
