Your generative AI project is going to fail

Fueled by vibes and with stars in their eyes, enterprises are not taking the time to understand generative AI’s limitations and to create their own rules-based approach.

Your generative AI project is almost certainly going to fail. But take heart: You probably shouldn’t have been using AI to solve your business problem, anyway. This seems to be an accepted fact among the data science crowd, but that wisdom has been slow to reach enterprise executives. For example, data scientist Noah Lorang once suggested, “There is a very small subset of business problems that are best solved by machine learning; most of them just need good data and an understanding of what it means.”

And yet 87% of companies surveyed by Bain & Company said they’re developing generative AI applications. For some, that’s exactly the right approach. For many others, it’s not.

We have collectively gotten so far ahead of ourselves with generative AI that we’re setting ourselves up for failure. That failure comes from a variety of sources, including data governance and data quality issues, but the primary problem right now is expectations. People dabble with ChatGPT for an afternoon and expect it to resolve their supply chain issues or answer their customer support questions. It won’t. But AI isn’t the problem; we are.

‘Expectations set purely based on vibes’

Shreya Shankar, a machine learning engineer at Viaduct, argues that one of the blessings and curses of genAI is that it seemingly eliminates the need for data preparation, which has long been one of the hardest aspects of machine learning. “Because you’ve put in such little effort into data preparation, it’s very easy to get pleasantly surprised by initial results,” she says, which then “propels the next stage of experimentation, also known as prompt engineering.”

Rather than do the hard, dirty work of data preparation, with all the testing and retraining to get a model to yield even remotely useful results, people are jumping straight to dessert, as it were. This, in turn, leads to unrealistic expectations: “Generative AI and LLMs are a little more interesting in that most people don’t have any form of systematic evaluation before they ship (why would they be forced to, if they didn’t collect a training dataset?), so their expectations are set purely based on vibes,” Shankar says.

Vibes, as it turns out, are not a good data set for successful AI applications.

The real key to machine learning success is something that is mostly missing from generative AI: the constant tuning of the model. “In ML and AI engineering,” Shankar writes, “teams often expect too high of accuracy or alignment with their expectations from an AI application right after it’s launched, and often don’t build out the infrastructure to continually inspect data, incorporate new tests, and improve the end-to-end system.” It’s all the work that happens before and after the prompt, in other words, that delivers success. For generative AI applications, partly because of how fast it is to get started, much of this discipline is lost.
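The kind of systematic evaluation Shankar describes can start very small. The following is a minimal sketch, not a recommended framework: it assumes a hypothetical `generate` function wrapping whatever model you call (replaced here by a canned `fake_model`), and a hand-written list of test cases that each name a substring the answer must contain. All names and sample data are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # substring the answer must include (case-insensitive)


def evaluate(generate: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = 0
    for case in cases:
        output = generate(case.prompt)
        if case.must_contain.lower() in output.lower():
            passed += 1
    return passed / len(cases)


# Hypothetical stand-in for a real model call; swap in your own client.
def fake_model(prompt: str) -> str:
    canned = {
        "What is your return window?": "You can return items within 30 days.",
    }
    return canned.get(prompt, "I'm not sure.")


# Illustrative cases; in practice these grow as new failures are found.
CASES = [
    EvalCase("What is your return window?", "30 days"),
    EvalCase("Do you ship to Canada?", "yes"),
]

pass_rate = evaluate(fake_model, CASES)
print(f"pass rate: {pass_rate:.0%}")
```

Substring matching is a crude check, but even a crude number tracked across prompt changes replaces vibes with a measurement, and each failure found in production can be added to the list as a new case.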

Things also get more complicated with generative AI because the same prompt does not reliably produce the same response. I love the way Amol Ajgaonkar, CTO of product innovation at Insight, put it. We sometimes think interacting with an LLM is like having a mature conversation with an adult. It’s not, he says. Rather, “It’s like giving my teenage kids instructions. Sometimes you have to repeat yourself so it sticks.” Making it more complicated, “Sometimes the AI listens, and other times it won’t follow instructions. It’s almost like a different language.”

Learning how to converse with generative AI systems is both art and science, and it takes considerable experience to do well. Unfortunately, many people gain too much confidence from their casual experiments with ChatGPT and set expectations much higher than the tools can deliver, which leads to disappointment and failure.

Put down the shiny new toy

Many are sprinting into generative AI without first considering whether there are simpler, better ways of accomplishing their goals. Santiago Valdarrama, founder of Tideily, recommends starting with simple heuristics, or rules. He cites two advantages of this approach: “First, you’ll learn much more about the problem you need to solve. Second, you’ll have a baseline to compare against any future machine-learning solution.”
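To make the baseline idea concrete, here is a rough sketch of a rules-first approach to a made-up problem: routing support tickets to a queue. The keywords, queue names, and labeled tickets are all invented for illustration. The point is only that a few lines of rules produce an accuracy number that any later model, generative or otherwise, has to beat.

```python
from typing import Callable, List, Tuple

# Hypothetical keyword rules: first match wins.
RULES = [
    ("refund", "billing"),
    ("invoice", "billing"),
    ("password", "account"),
    ("login", "account"),
]


def rule_based_route(ticket: str) -> str:
    """Route a ticket to a queue using simple keyword rules."""
    text = ticket.lower()
    for keyword, queue in RULES:
        if keyword in text:
            return queue
    return "general"  # fallback when no rule matches


def accuracy(route: Callable[[str], str],
             labeled: List[Tuple[str, str]]) -> float:
    """Fraction of labeled tickets that the router sends to the right queue."""
    correct = sum(1 for text, expected in labeled if route(text) == expected)
    return correct / len(labeled)


# Tiny invented sample; a real baseline needs far more labeled tickets.
LABELED = [
    ("I need a refund for my last order", "billing"),
    ("Cannot login to my account", "account"),
    ("Where is my package?", "shipping"),
    ("Please resend my invoice", "billing"),
]

baseline = accuracy(rule_based_route, LABELED)
print(f"baseline accuracy: {baseline:.0%}")
```

The ticket the rules miss is as useful as the ones they catch: it shows where the problem is harder than it looked, which is the first of the two advantages Valdarrama names. Any candidate model can be passed to the same `accuracy` function and compared directly against the baseline.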

As with software development, where the hardest work isn’t coding but rather figuring out which code to write, the hardest thing in AI is figuring out whether and how to apply AI. When simple rules need to yield to more complicated rules, Valdarrama suggests switching to a simple model. Note the continued stress on “simple.” As he says, “simplicity always wins” and should dictate decisions until more complicated models are absolutely necessary.

So, back to generative AI. Yes, genAI might be what your business needs to deliver customer value in a given scenario. Maybe. It’s more likely that solid analysis and rules-based approaches will yield the desired results. For those who are determined to use the shiny new thing, it’s still best to start small and simple and learn how to use generative AI successfully.
